DISCRETE
AND
COMBINATORIAL
MATHEMATICS
An Applied Introduction
FIFTH EDITION
RALPH P. GRIMALDI
Rose-Hulman Institute of Technology
PEARSON
Boston San Francisco New York
London Toronto Sydney Tokyo Singapore Madrid
Mexico City Munich Paris Cape Town Hong Kong Montreal
Publisher: Greg Tobin
Senior Acquisitions Editor: William Hoffman
Assistant Editor: RoseAnne Johnson
Executive Marketing Manager: Yolanda Cossio
Senior Marketing Manager: Pamela Laskey
Marketing Assistant: Heather Peck
Managing Editor: Karen Guardino
Senior Production Supervisor: Peggy McMahon
Senior Manufacturing Buyer: Hugh Crawford
Composition and Technical Art Rendering: Techsetters, Inc.
Production Services: Barbara Pendergast
Design Supervisor: Barbara T. Atkinson
Cover Designer: Dennis Schaefer
Photo Research and Design Specifications: Beth Anderson
Cover Illustration: George V. Kelvin
Photographs of Blaise Pascal, Aristotle, Lord Bertrand Arthur William Russell, Euclid,
Augusta Ada Byron (Countess of Lovelace), Gottfried Wilhelm Leibniz, Carl Friedrich Gauss,
Leonhard Euler, Arthur Cayley, Pierre de Fermat, Niels Henrik Abel, and Evariste Galois
are reproduced courtesy of the Bettman Archive (Corbis). Photographs of George Boole,
Peter Gustav Lejeune Dirichlet, David Hilbert, Giuseppe Peano, James Joseph Sylvester,
Sophie Germain, and Emmy Noether are reproduced courtesy of Historical Pictures/Stock
Montage. The photograph of Claude Elwood Shannon is reproduced courtesy of the MIT
Museum. The photograph of Edsger W. Dijkstra is reproduced courtesy of the University of
Texas at Austin. The photographs of Andrew John Wiles and Rear Admiral Grace Murray
Hopper are reproduced courtesy of AP/Wide World. The photographs of Georg Cantor,
Alan Mathison Turing, William Rowan Hamilton, and Leonardo Fibonacci are reproduced
courtesy of The Granger Collection. The photograph of Paul Erdős is reproduced courtesy
of Christopher Barker. The photographs of Andrei Nikolayevich Kolmogorov, Thomas
Bayes, and Al-Khowarizmi are reproduced courtesy of the St. Andrews University
MacTutor Archive. The photograph of David A. Huffman is reproduced courtesy of Manuel
Enrique Bermudez of the Department of Computer and Information Science and
Engineering at the University of Florida. The photograph of Joseph B. Kruskal is reproduced
courtesy of Leiden University.
Library of Congress Cataloging-in-Publication Data
Grimaldi, Ralph P.
A review of discrete and combinatorial mathematics / by Ralph P. Grimaldi. -- 5th ed.
p. cm.
Includes index.
Rev. ed. of: Discrete and combinatorial mathematics, c1999.
ISBN 0-201-72634-3
1. Mathematics. 2. Computer science--Mathematics. 3. Combinatorial analysis. I.
Grimaldi, Ralph P. Discrete and combinatorial mathematics. II. Title.
QA39.2.G748 2003
510--dc21
2002038383
permission of the publisher. Printed in the United States of America.
1 2 3 4 5 6 7 8 9 10-CRW-0504030202
NOTATION

LOGIC
$p, q$                      statements (or propositions)
$\neg p$                    the negation of (statement) p: not p
$p \land q$                 the conjunction of p, q: p and q
$p \lor q$                  the disjunction of p, q: p or q
$p \to q$                   the implication of q by p: p implies q
$p \leftrightarrow q$       the biconditional of p and q: p if and only if q
iff                         if and only if
$p \Rightarrow q$           logical implication: p logically implies q
$p \Leftrightarrow q$       logical equivalence: p is logically equivalent to q
$T_0$                       tautology
$F_0$                       contradiction
$\forall x$                 For all x (the universal quantifier)
$\exists x$                 For some x (the existential quantifier)
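
SET THEORY
$x \in A$                          element x is a member of set A
$x \notin A$                       element x is not a member of set A
$\mathcal{U}$                      the universal set
$A \subseteq B$, $B \supseteq A$   A is a subset of B
$A \subset B$, $B \supset A$       A is a proper subset of B
$A \nsubseteq B$                   A is not a subset of B
$A \not\subset B$                  A is not a proper subset of B
$|A|$                              the cardinality, or size, of set A; that is, the number of elements in A
$\emptyset = \{\,\}$               the empty, or null, set
$\mathcal{P}(A)$                   the power set of A; that is, the collection of all subsets of A
$A \cap B$                         the intersection of sets A, B: $\{x \mid x \in A \text{ and } x \in B\}$
$A \cup B$                         the union of sets A, B: $\{x \mid x \in A \text{ or } x \in B\}$
$A \triangle B$                    the symmetric difference of sets A, B:
                                   $\{x \mid x \in A \text{ or } x \in B, \text{ but } x \notin A \cap B\}$
$\overline{A}$                     the complement of set A: $\{x \mid x \in \mathcal{U} \text{ and } x \notin A\}$
$A - B$                            the (relative) complement of set B in set A: $\{x \mid x \in A \text{ and } x \notin B\}$
$\bigcup_{i \in I} A_i$            $\{x \mid x \in A_i \text{ for at least one } i \in I\}$, where I is an index set
$\bigcap_{i \in I} A_i$            $\{x \mid x \in A_i \text{ for every } i \in I\}$, where I is an index set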

PROBABILITY
$\mathcal{S}$                      the sample space for an experiment $\mathcal{E}$
$A \subseteq \mathcal{S}$          A is an event
$Pr(A)$                            the probability of event A
$Pr(A \mid B)$                     the probability of A given B; conditional probability
$X$                                random variable
$E(X)$                             the expected value of X, a random variable
$\mathrm{Var}(X) = \sigma_X^2$     the variance of X, a random variable
$\sigma_X$                         the standard deviation of X, a random variable

NUMBERS
$a \mid b$                  a divides b, for $a, b \in \mathbf{Z}$, $a \neq 0$
$a \nmid b$                 a does not divide b, for $a, b \in \mathbf{Z}$, $a \neq 0$
$\gcd(a, b)$                the greatest common divisor of the integers a, b
$\mathrm{lcm}(a, b)$        the least common multiple of the integers a, b
$\phi(n)$                   Euler’s phi function for $n \in \mathbf{Z}^+$
$\lfloor x \rfloor$         the greatest integer less than or equal to the real number x:
                            the greatest integer in x: the floor of x
$\lceil x \rceil$           the smallest integer greater than or equal to the real number x:
                            the ceiling of x
$a \equiv b \pmod{n}$       a is congruent to b modulo n
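
RELATIONS
$A \times B$                                    the Cartesian, or cross, product of sets A, B:
                                                $\{(a, b) \mid a \in A, b \in B\}$
$\mathcal{R} \subseteq A \times B$              $\mathcal{R}$ is a relation from A to B
$a \mathcal{R} b$; $(a, b) \in \mathcal{R}$     a is related to b
$a \not\mathcal{R} b$; $(a, b) \notin \mathcal{R}$   a is not related to b
$\mathcal{R}^c$                                 the converse of relation $\mathcal{R}$: $(a, b) \in \mathcal{R}$ iff $(b, a) \in \mathcal{R}^c$
$\mathcal{R} \circ \mathcal{S}$                 the composite relation for $\mathcal{R} \subseteq A \times B$, $\mathcal{S} \subseteq B \times C$:
                                                $(a, c) \in \mathcal{R} \circ \mathcal{S}$ if $(a, b) \in \mathcal{R}$, $(b, c) \in \mathcal{S}$ for some $b \in B$
lub$\{a, b\}$                                   the least upper bound of a and b
glb$\{a, b\}$                                   the greatest lower bound of a and b
$[a]$                                           the equivalence class of element a (relative to an
                                                equivalence relation $\mathcal{R}$ on a set A): $\{x \in A \mid x \mathcal{R} a\}$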

FUNCTIONS
$f: A \to B$                            f is a function from A to B
$f(A_1)$                                for $f: A \to B$ and $A_1 \subseteq A$, $f(A_1)$ is the image of $A_1$
                                        under f; that is, $\{f(a) \mid a \in A_1\}$
$f(A)$                                  for $f: A \to B$, $f(A)$ is the range of f
$f: A \times A \to B$                   f is a binary operation on A
$f: A \times A \to B\ (\subseteq A)$    f is a closed binary operation on A
$1_A: A \to A$                          the identity function on A: $1_A(a) = a$ for each $a \in A$
$f|_{A_1}$                              the restriction of $f: A \to B$ to $A_1 \subseteq A$
$g \circ f$                             the composite function for $f: A \to B$, $g: B \to C$:
                                        $(g \circ f)(a) = g(f(a))$, for $a \in A$
$f^{-1}$                                the inverse of function f
$f^{-1}(B_1)$                           the preimage of $B_1 \subseteq B$ for $f: A \to B$
$f \in O(g)$                            f is “big Oh” of g; f is of order g
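
THE ALGEBRA OF STRINGS
$\Sigma$                        a finite set of symbols called an alphabet
$\lambda$                       the empty string
$\|x\|$                         the length of string x
$\Sigma^n$                      $\{x_1 x_2 \cdots x_n \mid x_i \in \Sigma\}$, $n \in \mathbf{Z}^+$
$\Sigma^0$                      $\{\lambda\}$
$\Sigma^+$                      $\bigcup_{n \in \mathbf{Z}^+} \Sigma^n$: the set of all strings of positive length
$\Sigma^*$                      $\bigcup_{n \geq 0} \Sigma^n$: the set of all finite strings
$A \subseteq \Sigma^*$          A is a language
$AB$                            the concatenation of languages $A, B \subseteq \Sigma^*$:
                                $\{ab \mid a \in A, b \in B\}$
$A^n$                           $\{a_1 a_2 \cdots a_n \mid a_i \in A \subseteq \Sigma^*\}$, $n \in \mathbf{Z}^+$
$A^0$                           $\{\lambda\}$
$A^+$                           $\bigcup_{n \in \mathbf{Z}^+} A^n$
$A^*$                           $\bigcup_{n \geq 0} A^n$: the Kleene closure of language A
$M = (S, \mathcal{I}, \mathcal{O}, \nu, \omega)$   a finite state machine M with internal states S, input
                                alphabet $\mathcal{I}$, output alphabet $\mathcal{O}$, next state function
                                $\nu: S \times \mathcal{I} \to S$ and output function $\omega: S \times \mathcal{I} \to \mathcal{O}$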
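
As a quick illustration of the NUMBERS entries in the notation list above, the following Python sketch evaluates divisibility, gcd, lcm, Euler's phi function, floor, ceiling, and congruence on small integers. It is not taken from the text: the helper names (divides, lcm, phi, congruent) are hypothetical, and phi is computed by brute force straight from the definition rather than by any formula the book may develop.

```python
import math


def divides(a, b):
    """a | b: True when the nonzero integer a divides b."""
    if a == 0:
        raise ValueError("a must be nonzero")
    return b % a == 0


def lcm(a, b):
    """lcm(a, b) for positive integers, using gcd(a, b) * lcm(a, b) = a * b."""
    return a * b // math.gcd(a, b)


def phi(n):
    """Euler's phi function: how many k in 1..n satisfy gcd(k, n) = 1."""
    return sum(1 for k in range(1, n + 1) if math.gcd(k, n) == 1)


def congruent(a, b, n):
    """a = b (mod n): True when n divides a - b."""
    return (a - b) % n == 0


if __name__ == "__main__":
    print(divides(3, 12), divides(5, 12))
    print(math.gcd(12, 18), lcm(4, 6), phi(12))
    print(math.floor(-2.5), math.ceil(-2.5))
    print(congruent(17, 5, 12))
```

Note that floor and ceiling differ from truncation for negative numbers, which is why the sketch evaluates them at -2.5.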
Preface
It has been more than twenty years since September 2, 1982, when I signed the contract
to develop what turned into the first edition of this present textbook. At that time the
idea of further editions never crossed my mind. Consequently, I continue to find myself
simultaneously very humbled and very pleased with the way this textbook has been received
by so many instructors and especially students. The first four editions of this textbook have
found their way into many colleges and universities here in the United States. They have
also been used in other nations such as Australia, Canada, England, Ireland, Japan, Mexico,
the Netherlands, Scotland, Singapore, South Africa, and Sweden. I can only hope that this
fifth edition will continue to enlighten and challenge all those who wish to learn about some
of the many facets of the fascinating area of mathematics called discrete mathematics.
The technological advances of the last four decades have resulted in many changes
in the undergraduate curriculum. These changes have fostered the development of many
single-semester and multiple-semester courses where some of the following are introduced:
1. Discrete methods that stress the finite nature inherent in many problems and structures;
2. Combinatorics: the algebra of enumeration, or counting, with its fascinating
interrelations with so many finite structures;
3. Graph theory with its applications and interrelations with areas such as data structures
and methods of optimization; and
4. Finite algebraic structures that arise in conjunction with disciplines such as coding
theory, methods of enumeration, gating networks, and combinatorial designs.
A primary reason for studying the material in any or all of these four major topics is the
abundance of applications one finds in the study of computer science — especially in the
areas of data structures, the theory of computer languages, and the analysis of algorithms.
In addition, there are also applications in engineering and the physical and life sciences, as
well as in statistics and the social sciences. Consequently, the subject matter of discrete and
combinatorial mathematics provides valuable material for students in many majors — not
just for those majoring in mathematics or computer science.
The major purpose of this new edition is to continue to provide an introductory survey
in both discrete and combinatorial mathematics. The coverage is intended for the beginning
student, so there are a great number of examples with detailed explanations. (The examples
are numbered separately and a thick line is used to denote the end of each example.) In
addition, wherever proofs are given, they too are presented with sufficient detail (with the
novice in mind).
The text strives to accomplish the following objectives:
1. To introduce the student at the sophomore-junior level, if not earlier, to the topics and
techniques of discrete methods and combinatorial reasoning. Problems in counting, or
enumeration, require a careful analysis of structure (for example, whether or not order
and repetition are relevant) and logical possibilities. There may even be a question of
existence for some situations. Following such a careful analysis, we often find that the
solution of a problem requires simple techniques for counting the possible outcomes that
evolve from the breakdown of the given problem into smaller subproblems.
2. To introduce a wide variety of applications. In this regard, whenever data structures
(from computer science) or structures from abstract algebra are required, only the basic
theory needed for the application is developed. Furthermore, the solutions of some
applications lend themselves to iterative procedures that lead to specific algorithms. The
algorithmic approach to the solution of problems is fundamental in discrete mathematics,
and this approach reinforces the close ties between this discipline and the area of
computer science.
3. To develop the mathematical maturity of the student through the study of an area that
is so different from the traditional coverage in calculus and differential equations. Here,
for example, there is the opportunity to establish results by counting a certain collection
of objects in more than one way. This provides what are called combinatorial identities;
it also introduces a novel proof technique. In this edition the nature of proof, along with
what constitutes a valid argument, is developed in Chapter 2, in conjunction with the
laws of logic and rules of inference. The coverage is extensive, keeping the student
(with minimal background) in mind. [For the reader with a logic course (or something
comparable) in his or her background, this material can be skipped over with little or
no difficulty.] Proofs by mathematical induction (along with recursive definitions) are
introduced in Chapter 4 and then used throughout the subsequent chapters.
With regard to theorems and their proofs, in many instances an attempt has been made
to motivate theorems from observations on specific examples. In addition, whenever a
finite situation provides a result that is not true for the infinite case, this situation is
singled out for attention. Proofs that are extremely long and/or rather special in nature
are omitted. However, for the very small number of proofs that are omitted, references are
supplied for the reader interested in seeing the validation of these results. (The amount
of emphasis placed on proofs will depend on the goals of the individual instructor and
on those of his or her student audience.)
4. To present an adequate survey of topics for the computer science student who will be
taking more advanced courses in areas such as data structures, the theory of computer
languages, and the analysis of algorithms. The coverage here on groups, rings, fields,
and Boolean algebras will also provide an applied introduction for mathematics majors
who wish to continue their study of abstract algebra.
The prerequisites for using this book are primarily a sound background in high school
mathematics and an interest in attacking and solving a variety of problems. No particular
programming ability is assumed. Program segments and procedures are given in pseudocode,
and these are designed and explained in order to reinforce particular examples. With
regard to calculus, we shall mention later in this preface its extent in Chapters 9 and 10.
My primary motivation for writing the first four editions of this book has been the
encouragement I had received over the years from my students and colleagues, as well as from
the students and instructors who used the first four editions of the textbook at many different
colleges and universities. Those four editions reflected both my interests and concerns and
those of my students, as well as the recommendations of the Committee on the Undergraduate
Program in Mathematics and of the Association for Computing Machinery. This fifth
edition continues along the same lines, reflecting the suggestions and recommendations
made by the instructors and especially the students who have used or are using the fourth
edition.
Features
Following are brief descriptions of some of the major features of this newest edition. These
are designed to assist the reader (student or otherwise) in learning the fundamentals of
discrete and combinatorial mathematics.
Emphasis on algorithms and applications. Algorithms and applications in many areas
are presented throughout the text. For example:
1. Chapter 1 includes several instances where the introductory topics on enumeration
are needed — one example, in particular, addresses the issue of over-counting.
2. Section 7 of Chapter 5 provides an introduction to computational complexity. This
material is then used in Section 8 of this chapter in order to analyze the running times of
some elementary pseudocode procedures.
3. The material in Chapter 6 covers languages and finite state machines. This introduces
the reader to an important area in computer science — the theory of computer languages.
4. Chapters 7 and 12 include discussions on the applications and algorithms dealing with
topological sorting and the searching techniques known as the depth-first search and the
breadth-first search.
5. In Chapter 10 we find the topic of recurrence relations. The coverage here includes
applications on (a) the bubble sort, (b) binary search, (c) the Fibonacci numbers,
(d) the Koch snowflake, (e) Hasse diagrams, (f) the data structure called the stack,
(g) binary trees, and (h) tilings.
6. Chapter 16 introduces the fundamental properties of the algebraic structure called
the group. The coverage here shows how this structure is used in the study of algebraic
coding theory and in counting problems that require Pólya’s method of enumeration.
Detailed explanations. Whether it is an example or the proof of a theorem, explanations
are designed to be careful and thorough. The presentation is primarily focused on
improving understanding on the part of the reader who is seeing this type of material for
the first time.
Exercises. The role of the exercises in any mathematics text is a crucial one. The amount
of time spent on the exercises greatly influences the pace of the course. Depending on
the interest and mathematical background of the student audience, an instructor should
find that the class time spent on discussing exercises will vary.
There are over 1900 exercises in the 17 chapters. Those that appear at the end of each
section generally follow the order in which the section material is developed. These
exercises are designed to (a) review the basic concepts in the section; (b) tie together
ideas presented in earlier sections of the chapter; and (c) introduce additional concepts
that are related to the material in the section. Some exercises call for the development
of an algorithm, or the writing of a computer program, often to solve a certain instance
of a general problem. These usually require only a minimal amount of programming
experience.
Each chapter concludes with a set of supplementary exercises. These provide further
review of the ideas presented in the chapter, and also use material developed in earlier
chapters.
Solutions are provided at the back of the text for almost all parts of all the odd-
numbered exercises.
Chapter summaries. The last numbered section in each chapter provides a summary
and historical review of the major ideas covered in that chapter. This is intended to give
the reader an overview of the contents of the chapter and provide information for further
study and applications. Such further study can be readily assisted by the list of references
that is supplied.
In particular, the summaries at the ends of Chapters 1, 5, and 9 include tables on the
enumeration formulas developed within each of these chapters. Sometimes these tables
include results from earlier chapters in order to make comparisons and to show how the
new results extend the prior ones.
Organization
The areas of discrete and combinatorial mathematics are somewhat new to the undergraduate
curriculum, so there are several options as to which topics should be covered in these courses.
Each instructor and each student may have different interests. Consequently, the coverage
here is fairly broad, as a survey course mandates. Yet there will always be further topics that
some readers may feel should be included. Furthermore, there will also be some differences
of opinion with regard to the order in which some topics are presented in this text.
The nature and importance of the algorithmic approach to problem solving is stressed
throughout the text. Ideas and approaches on problem solving are further strengthened by
the interrelations between enumeration and structure, two other major topics that provide
unifying threads for the material developed in the book.
The material is subdivided into four major areas. The first seven chapters form the
underlying core of the book and present the fundamentals of discrete mathematics. The
coverage here provides enough material for a one-quarter or one-semester course in discrete
mathematics. The material in Chapter 2 can be reviewed by those with a background in logic.
For those interested in developing and writing proofs, this material should be examined
very carefully. A second course, one that emphasizes combinatorics, should include
Chapters 8, 9, and 10 (and, time permitting, sections 1, 2, 3, 10, 11, and 12 of Chapter 16). In
Chapter 9 some results from calculus are used; namely, fundamentals on differentiation and
partial fraction decompositions. However, for those who wish to skip this chapter, sections
1, 2, 3, 6, and 7 of Chapter 10 can still be covered. A course that emphasizes the theory and
applications of finite graphs can be developed from Chapters 11, 12, and 13. These chapters
form the third major subdivision of the text. For a course in applied algebra, Chapters
14, 15, 16, and 17 (the fourth, and final, subdivision) deal with the algebraic structures
(group, ring, Boolean algebra, and field) and include applications on cryptology, switching
functions, algebraic coding theory, and combinatorial designs. Finally, a course on the role
of discrete structures in computer science can be developed from the material in Chapters
11, 12, 13, 15, and sections 1-9 of Chapter 16. Here we find applications on switching
functions, the RSA cryptosystem, and algebraic coding theory, as well as an introduction
to graph theory and trees, and their role in optimization.
Other possible courses can be developed by considering the following chapter dependencies.

Chapter   Dependence on Prior Chapters
1         No dependence
2         No dependence (Hence an instructor can start a course in discrete mathematics
          with either the study of logic or an introduction to enumeration.)
3         1, 2
4         1, 2, 3
5         1, 2, 3, 4
6         1, 2, 3, 5 (Minor dependence in Section 6.1 on Sections 4.1, 4.2)
7         1, 2, 3, 5, 6 (Minor dependence in Section 7.2 on Sections 4.1, 4.2)
8         1, 3 (Minor dependence in Example 8.6 on Section 5.3)
9         1, 3
10        1, 3, 4, 5, 9 (Minor dependence in Example 10.33 on Section 7.3)
11        1, 2, 3, 4, 5 (Although some graph-theoretic ideas are mentioned in Chapters
          5, 6, 7, 8, and 10, the material in this chapter is developed with no dependence
          on the graph-theoretic material given in these earlier results.)
12        1, 2, 3, 4, 5, 11
13        3, 5, 11, 12
14        2, 3, 4, 5, 7 (The Euler phi function ($\phi$) is used in Section 14.3. This function
          is derived in Example 8.8 of Section 8.1 but the result can be used here in
          Chapter 14 without covering Chapter 8.)
15        2, 3, 5, 7
16        1, 2, 3, 4, 5, 7
17        2, 3, 4, 5, 7, 14

In addition, the index has been very carefully developed in order to make the text even
more flexible. Terms are presented with primary listings and several secondary listings.
Also there is a great deal of cross referencing. This is designed to help the instructor who
may want to change the order of presentation and deviate from the straight and narrow.
Changes in the Fifth Edition
The changes here in the fifth edition of Discrete and Combinatorial Mathematics reflect
the observations and recommendations of students and instructors who have used earlier
editions of the text. As with the first four editions, the tone and purpose of the text remain
intact. The author’s goal is still the same: to provide within these pages a sound, readable,
and understandable introduction to the foundations of discrete and combinatorial mathematics
for the beginning student or reader. Among the changes one will find in this fifth
edition we mention the following:
• The examples in Section 4 of Chapter 1 now include material on runs, a concept that
arises in the study of statistics, in particular in the area of quality control.
• Exercise 13 for Section 3 of Chapter 2 develops the rule of inference known as resolution,
a rule that serves as the basis for many computer programs designed to automate
a reasoning system.
• The earlier editions of this text included a section that introduced the notion of probability.
This section has now been expanded and three additional optional sections have
been added for those who wish to further examine some of the introductory ideas
associated with discrete probability: in particular, the axioms of probability, conditional
probability, independence, Bayes’ Theorem, and discrete random variables.
• The coverage on partial orders and total orders in Section 3 of Chapter 7 now includes
an optional example where the Catalan numbers arise in this context.
• The introductory material in Section 1 of Chapter 8 has been rewritten to provide
a more readable transition between the coverage on counting and Venn diagrams in
Section 3 of Chapter 3 and the more general technique known as the Principle of Inclusion
and Exclusion.
• One of the fascinating features of discrete and combinatorial mathematics is the variety
of ways a given problem can be solved. In the fourth edition (in Chapters 1 and 3)
the reader learned, in two different contexts, that a positive integer $n$ has $2^{n-1}$
compositions; that is, there are $2^{n-1}$ ways to write $n$ as an ordered sum of positive-integer
summands. This result is now established in three other ways: (i) by the Principle of
Mathematical Induction in Chapter 4; (ii) using generating functions in Chapter 9; and
(iii) by solving a recurrence relation in Chapter 10.
• For those who want even more on discrete probability, Section 2 of Chapter 9 includes
an example that deals with the geometric random variable.
• Section 2 of Chapter 10 now includes a discussion of the work by Gabriel Lamé in
estimating the number of divisions used in the Euclidean algorithm to find the greatest
common divisor of two positive integers.
• The Master theorem (of importance in the analysis of algorithms) is introduced and
developed in an exercise for Section 6 of Chapter 10.
• The material on transport networks (in Section 3 of Chapter 13) has been updated and
now incorporates the Edmonds-Karp algorithm in the procedure originally developed by
Lester Ford and Delbert Fulkerson.
• The coverage on modular arithmetic in Section 3 of Chapter 14 now includes applications
dealing with the linear congruential pseudorandom number generator, private-key
cryptosystems, and modular exponentiation. Further, in Section 4 of Chapter 14, the
material dealing with the Chinese Remainder Theorem, which was only stated in previous
editions, now includes a proof of this result as well as an example dealing with how it is
applied.
• Section 4 of Chapter 16 is new and optional. The material here provides an introduction
to the RSA public-key cryptosystem and shows how one can apply some of the theoretical
results developed in prior sections of the text.
• As with the second, third, and fourth editions, a great deal of effort has been applied
in updating the summary and historical review at the end of each chapter. Consequently,
new references and/or new editions are provided where appropriate.
• For this fifth edition, the following pictures and photographs have been added to the
summary and historical review of certain chapters: a picture of Thomas Bayes and a
photograph of Andrei Nikolayevich Kolmogorov in Chapter 3; a picture of Al-Khowârizmi
in Chapter 4; a photograph of David A. Huffman in Chapter 12; and a photograph of
Joseph B. Kruskal in Chapter 13.
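
The composition count mentioned in the list above (a positive integer n has 2^(n-1) compositions) can be checked by brute force for small n. The following Python sketch is only an illustration and is not one of the proofs the text refers to; the function names compositions and count_compositions are hypothetical. It enumerates every ordered sum of positive integers equal to n and compares the count with 2^(n-1).

```python
def compositions(n):
    """Yield every composition of n: a tuple of positive integers, in order, summing to n."""
    if n == 0:
        # The empty tuple is the single way to write 0; it terminates the recursion.
        yield ()
        return
    for first in range(1, n + 1):
        # Choose the first summand, then compose what is left over.
        for rest in compositions(n - first):
            yield (first,) + rest


def count_compositions(n):
    """Count the compositions of n by exhaustive enumeration."""
    return sum(1 for _ in compositions(n))


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, count_compositions(n), 2 ** (n - 1))
```

For n = 3 the enumeration gives (1, 1, 1), (1, 2), (2, 1), and (3), which is 4 = 2^2 compositions. The recursion is exponential by design, so it is only practical for small n.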
Ancillaries
• There is an Instructor’s Solutions Manual that is available, from the publisher, for
those instructors who adopt the textbook for their classes. It contains the solutions and/or
answers for all of the exercises within the 17 chapters and the three appendices of this
textbook.
• There is also a Student’s Solutions Manual that is available separately. It contains the
solutions and/or answers for all of the odd-numbered exercises in the textbook. In some
cases more than one solution is presented.
• The following Web site provides additional resources for learning more about discrete
and combinatorial mathematics. In addition it also provides a way for readers to contact
the author with comments, suggestions, or possible errors they have found.
www.aw.com/grimaldi
Acknowledgments
If space permitted, I should like to mention each of the students who provided help and
encouragement when I was writing the five editions of this book. Their suggestions helped
to remove many mistakes and ambiguities, thus improving the exposition. Most helpful
in this category were Paul Griffith, Meredith Vannauker, Paul Barloon, Byron Bishop,
Lee Beckham, Brett Hunsaker, Tom Vanderlaan, Michael Bryan, John Breitenbach, Dan
Johnson, Brian Wilson, Allen Schneider, John Dowell, Charles Wilson, Richard Nichols,
Charles Brads, Jonathan Atkins, Kenneth Schmidt, Donald Stanton, Mark Stremler, Stephen
Smalley, Anthony Hinrichs, Kevin O’Bryant, and Nathan Terpstra.
I thank Larry Alldredge, Claude Anderson, David Rader, Matt Hopkins, John Rickert, and
Martin Rivers for their comments on the computer science material, and Barry Farbrother,
Paul Hogan, Dennis Lewis, Charles Kyker, Keith Hoover, Matthew Saltzman, and Jerome
Wagner for their enlightening remarks on some of the applications.
I gratefully acknowledge the persistent enthusiasm and encouragement of the staff at
Addison-Wesley (both past and present), especially Wayne Yuhasz, Thomas Taylor, Michael
Payne, Charles Glaser, Mary Crittendon, Herb Merritt, Maria Szmauz, Adeline Ruggles,
Stephanie Botvin, Jack Casteel, Jennifer Wall, Joanne Sousa Foster, Karen Guardino, Peggy
McMahon, Deborah Schneider, Laurie Rosatone, Carolyn Lee-Davis, and Jennifer Albanese.
William Hoffman, and especially RoseAnne Johnson and Barbara Pendergast,
deserve the most recognition for their outstanding contributions to this fifth edition. The efforts
put forth by Steven Finch in proofreading the text and those of Paul Lorczak, who checked
the accuracy of the answers to the exercises, are also greatly appreciated.
I am also indebted to my colleagues John Kinney, Robert Lopez, Allen Broughton,
Gary Sherman, George Berzsenyi, and especially Alfred Schmidt, for their interest and
encouragement throughout the writing of this and/or earlier editions.
Thanks and appreciation are due the following reviewers of the first, second, third, fourth,
and/or fifth editions.
Norma E. Abel             Digital Equipment Corporation
Larry Alldredge           Qualcomm, Inc.
Charles Anderson          University of Colorado, Denver
Claude W. Anderson III    Rose-Hulman Institute of Technology
David Arnold              Baylor University
V. K. Balakrishnan        University of Maine at Orono
Robert Barnhill           University of Utah
Dale Bedgood              East Texas State University
Jerry Beehler             Tri-State University
Katalin Bencsath          Manhattan College
Allan Bishop              Western Illinois University
Monte Boisen              Virginia Polytechnic Institute
Samuel Councilman         California State University at Long Beach
Robert Crawford           Western Kentucky University
Ellen Cunningham, SP      Saint Mary-of-the-Woods College
Carl DeVito               Naval Postgraduate School
Vladimir Drobot           San Jose State University
John Dye                  California State University at Northridge
Carl Eckberg              San Diego State University
Michael Falk              Northern Arizona University
Marvin Freedman           Boston University
Robert Geitz              Oberlin College
James A. Glasenapp        Rochester Institute of Technology
Gary Gordon               Lafayette College
Harvey Greenberg          University of Colorado, Denver
Laxmi Gupta               Rochester Institute of Technology
Eleanor O. Hare           Clemson University
James Harper              Central Washington University
David S. Hart             Rochester Institute of Technology
Maryann Hastings          Marymount College
W. Mack Hill              Worcester State College
Stephen Hirtle            University of Pittsburgh
Arthur Hobbs              Texas A&M University
Dean Hoffman              Auburn University
Richard Iltis             Willamette University
David P. Jacobs           Clemson University
Robert Jajcay             Indiana State University
Akihiro Kanamori          Boston University
John Konvalina            University of Nebraska at Omaha
Rochelle Leibowitz        Wheaton College
James T. Lewis            University of Rhode Island
Y-Hsin Liu                University of Nebraska at Omaha
Joseph Malkevitch         York College (CUNY)
Brian Martensen           The University of Texas at Austin
Hugh Montgomery           University of Michigan
Thomas Morley             Georgia Institute of Technology
Richard Orr               Rochester Institute of Technology
Edwin P. Oxford           Baylor University
John Rausen               New Jersey Institute of Technology
Martin Rivers             Lexmark International, Inc.
Gabriel Robins            University of Virginia
Chris Rodger              Auburn University
James H. Schmerl          University of Connecticut
Paul S. Schnare           Eastern Kentucky University
Leo Schneider             John Carroll University
Debra Diny Scott          University of Wisconsin at Green Bay
Gary E. Stevens           Hartwick College
Dalton Tarwater           Texas Tech University
Jeff Tecosky-Feldman      Harvard University
W. L. Terwilliger         Bowling Green State University
Donald Thompson           Pepperdine University
Thomas Upson              Rochester Institute of Technology
W. D. Wallis              Southern Illinois University
Larry West                Virginia Commonwealth University
Yixin Zhang               University of Nebraska at Omaha
Special thanks are due to Douglas Shier of Clemson University for the outstanding work
he did in reviewing the manuscripts of all five editions. Thanks are also due to Joan Shier
for letting Doug review the fourth and fifth editions.
The translation for the dedication is due to Dr. Yvonne Panaro of Northern Virginia
Community College. Thank you, Yvonne, and thank you, Patter (Patricia Wickes Thurston),
for your role in obtaining the translation.
A text of this length requires the use of many references. The members of the library staff
of Rose-Hulman Institute of Technology were always available when books and articles
were needed, so it is only fitting to express one’s appreciation for the efforts of John Robson,
Sondra Nelson, Dong Chao, Jan Jerrell, and especially Amy Harshbarger and Margaret Ying.
In addition, Keith Hoover and Raymond Bland are thanked for rescuing the author from
the perils of many hardware problems.
The last, and surely the most important, note of thanks belongs once again to the ever-
patient and encouraging now-retired secretary of the Rose-Hulman mathematics depart-
ment — Mrs. Mary Lou McCullough. Thank you for the fifth time, Mary Lou, for all of
your work!
Alas, the remaining errors, ambiguities, and misleading comments are once again the
sole responsibility of the author.
R.P.G.
Terre Haute, Indiana
Contents
PART 1
Fundamentals of Discrete Mathematics 1
1 Fundamental Principles of Counting 3
1.1 The Rules of Sum and Product 3
1.2 Permutations 6
1.3 Combinations: The Binomial Theorem 14
1.4 Combinations with Repetition 26
1.5 The Catalan Numbers (Optional) 36
1.6 Summary and Historical Review 41
2 Fundamentals of Logic 47
2.1 Basic Connectives and Truth Tables 47
2.2 Logical Equivalence: The Laws of Logic 55
2.3 Logical Implication: Rules of Inference 67
2.4 The Use of Quantifiers 86
2.5 Quantifiers, Definitions, and the Proofs of Theorems 103
2.6 Summary and Historical Review 117
3 Set Theory 123
3.1 Sets and Subsets 123
3.2 Set Operations and the Laws of Set Theory 136
3.3 Counting and Venn Diagrams 148
3.4 A First Word on Probability 150
3.5 The Axioms of Probability (Optional) 157
3.6 Conditional Probability: Independence (Optional) 166
3.7 Discrete Random Variables (Optional) 175
3.8 Summary and Historical Review 186
4 Properties of the Integers: Mathematical Induction 193
4.1 The Well-Ordering Principle: Mathematical Induction 193
4.2 Recursive Definitions 210
4.3 The Division Algorithm: Prime Numbers 221
4.4 The Greatest Common Divisor: The Euclidean Algorithm 231
4.5 The Fundamental Theorem of Arithmetic 237
4.6 Summary and Historical Review 242
5 Relations and Functions 247
5.1 Cartesian Products and Relations 248
5.2 Functions: Plain and One-to-One 252
5.3 Onto Functions: Stirling Numbers of the Second Kind 260
5.4 Special Functions 267
5.5 The Pigeonhole Principle 273
5.6 Function Composition and Inverse Functions 278
5.7 Computational Complexity 289
5.8 Analysis of Algorithms 294
5.9 Summary and Historical Review 302
6 Languages: Finite State Machines 309
6.1 Language: The Set Theory of Strings 309
6.2 Finite State Machines: A First Encounter 319
6.3 Finite State Machines: A Second Encounter 326
6.4 Summary and Historical Review 332
7 Relations: The Second Time Around 337
7.1 Relations Revisited: Properties of Relations 337
7.2 Computer Recognition: Zero-One Matrices and Directed Graphs 344
7.3 Partial Orders: Hasse Diagrams 356
7.4 Equivalence Relations and Partitions 366
7.5 Finite State Machines: The Minimization Process 371
7.6 Summary and Historical Review 376
PART 2
Further Topics in Enumeration 383
8 The Principle of Inclusion and Exclusion 385
8.1 The Principle of Inclusion and Exclusion 385
8.2 Generalizations of the Principle 397
8.3 Derangements: Nothing Is in Its Right Place 402
8.4 Rook Polynomials 404
8.5 Arrangements with Forbidden Positions 406
8.6 Summary and Historical Review 411
9 Generating Functions 415
9.1 Introductory Examples 415
9.2 Definition and Examples: Calculational Techniques 418
9.3 Partitions of Integers 432
9.4 The Exponential Generating Function 436
9.5 The Summation Operator 440
9.6 Summary and Historical Review 442
10 Recurrence Relations 447
10.1 The First-Order Linear Recurrence Relation 447
10.2 The Second-Order Linear Homogeneous Recurrence Relation with Constant
Coefficients 456
10.3 The Nonhomogeneous Recurrence Relation 470
10.4 The Method of Generating Functions 482
10.5 A Special Kind of Nonlinear Recurrence Relation (Optional) 487
10.6 Divide-and-Conquer Algorithms (Optional) 496
10.7 Summary and Historical Review 505
PART 3
Graph Theory and Applications 511
11 An Introduction to Graph Theory 513
11.1 Definitions and Examples 513
11.2 Subgraphs, Complements, and Graph Isomorphism 520
11.3 Vertex Degree: Euler Trails and Circuits 530
11.4 Planar Graphs 540
11.5 Hamilton Paths and Cycles 556
11.6 Graph Coloring and Chromatic Polynomials 564
11.7 Summary and Historical Review 573
12 Trees 581
12.1 Definitions, Properties, and Examples 581
12.2 Rooted Trees 587
12.3 Trees and Sorting 605
12.4 Weighted Trees and Prefix Codes 609
12.5 Biconnected Components and Articulation Points 615
12.6 Summary and Historical Review 622
13 Optimization and Matching 631
13.1 Dijkstra’s Shortest-Path Algorithm 631
13.2 Minimal Spanning Trees: The Algorithms of Kruskal and Prim 638
13.3 Transport Networks: The Max-Flow Min-Cut Theorem 644
13.4 Matching Theory 659
13.5 Summary and Historical Review 667
PART 4
Modern Applied Algebra 671
14 Rings and Modular Arithmetic 673
14.1 The Ring Structure: Definition and Examples 673
14.2 Ring Properties and Substructures 679
14.3 The Integers Modulo n 686
14.4 Ring Homomorphisms and Isomorphisms 697
14.5 Summary and Historical Review 705
15 Boolean Algebra and Switching Functions 711
15.1 Switching Functions: Disjunctive and Conjunctive Normal Forms 711
15.2 Gating Networks: Minimal Sums of Products: Karnaugh Maps 719
15.3 Further Applications: Don’t-Care Conditions 729
15.4 The Structure of a Boolean Algebra (Optional) 733
15.5 Summary and Historical Review 742
16 Groups, Coding Theory, and Polya’s Method of Enumeration 745
16.1 Definition, Examples, and Elementary Properties 745
16.2 Homomorphisms, Isomorphisms, and Cyclic Groups 752
16.3 Cosets and Lagrange’s Theorem 757
16.4 The RSA Cryptosystem (Optional) 759
16.5 Elements of Coding Theory 761
16.6 The Hamming Metric 766
16.7 The Parity-Check and Generator Matrices 769
16.8 Group Codes: Decoding with Coset Leaders 773
16.9 Hamming Matrices 777
16.10 Counting and Equivalence: Burnside’s Theorem 779
16.11 The Cycle Index 785
16.12 The Pattern Inventory: Polya’s Method of Enumeration 789
16.13 Summary and Historical Review 794
17 Finite Fields and Combinatorial Designs 799
17.1 Polynomial Rings 799
17.2 Irreducible Polynomials: Finite Fields 806
17.3 Latin Squares 815
17.4 Finite Geometries and Affine Planes 820
17.5 Block Designs and Projective Planes 825
17.6 Summary and Historical Review 830
Appendix 1 Exponential and Logarithmic Functions A-1
Appendix 2 Matrices, Matrix Operations, and Determinants A-11
Appendix 3 Countable and Uncountable Sets A-23
Solutions S-1
Index I-1
PART 1
FUNDAMENTALS
OF DISCRETE
MATHEMATICS
Fundamental
Principles of
Counting
Enumeration, or counting, may strike one as an obvious process that a student learns
when first studying arithmetic. But then, it seems, very little attention is paid to further
development in counting as the student turns to “more difficult” areas in mathematics, such
as algebra, geometry, trigonometry, and calculus. Consequently, this first chapter should
provide some warning about the seriousness and difficulty of “mere” counting.
Enumeration does not end with arithmetic. It also has applications in such areas as coding
theory, probability and statistics, and in the analysis of algorithms. Later chapters will offer
some specific examples of these applications.
As we enter this fascinating field of mathematics, we shall come upon many problems that
are very simple to state but somewhat “sticky” to solve. Thus, be sure to learn and understand
the basic formulas, but do not rely on them too heavily. For without an analysis of each
problem, a mere knowledge of formulas is next to useless. Instead, welcome the challenge
to solve unusual problems or those that are different from problems you have encountered
in the past. Seek solutions based on your own scrutiny, regardless of whether it reproduces
what the author provides. There are often several ways to solve a given problem.
1.1
The Rules of Sum and Product
Our study of discrete and combinatorial mathematics begins with two basic principles of
counting: the rules of sum and product. The statements and initial applications of these
rules appear quite simple. In analyzing more complicated problems, one is often able to
break down such problems into parts that can be solved using these basic principles. We
want to develop the ability to “decompose” such problems and piece together our partial
solutions in order to arrive at the final answer. A good way to do this is to analyze and solve
many diverse enumeration problems, taking note of the principles being used. This is the
approach we shall follow here.
Our first principle of counting can be stated as follows:
The Rule of Sum: If a first task can be performed in m ways, while a second task can
be performed in n ways, and the two tasks cannot be performed simultaneously, then
performing either task can be accomplished in any one of m + n ways.
Note that when we say that a particular occurrence, such as a first task, can come about in m
ways, these m ways are assumed to be distinct, unless a statement is made to the contrary.
This will be true throughout the entire text.
EXAMPLE 1.1
A college library has 40 textbooks on sociology and 50 textbooks dealing with anthropology.
By the rule of sum, a student at this college can select among 40 + 50 = 90 textbooks in
order to learn more about one or the other of these two subjects.
EXAMPLE 1.2
The rule can be extended beyond two tasks as long as no pair of tasks can occur simultaneously.
For instance, a computer science instructor who has, say, seven different introductory
books each on C++, Java, and Perl can recommend any one of these 21 books to a student
who is interested in learning a first programming language.
EXAMPLE 1.3
The computer science instructor of Example 1.2 has two colleagues. One of these colleagues
has three textbooks on the analysis of algorithms, and the other has five such
textbooks. If n denotes the maximum number of different books on this topic that this
instructor can borrow from them, then 5 ≤ n ≤ 8, for here both colleagues may own copies
of the same textbook(s).
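The bounds in Example 1.3 can be illustrated with a short Python sketch. The book titles are hypothetical placeholders; only the counts 3 and 5 come from the example. When the two collections share no title the rule of sum applies and the count is 3 + 5; when they overlap, the count drops.

```python
# Hypothetical titles; only the collection sizes (3 and 5) come from Example 1.3.
def count_either(first, second):
    """Number of distinct books available from either collection."""
    return len(set(first) | set(second))

colleague_1 = {"Algo A", "Algo B", "Algo C"}
disjoint_5 = {"Algo D", "Algo E", "Algo F", "Algo G", "Algo H"}
overlap_5 = {"Algo A", "Algo B", "Algo C", "Algo D", "Algo E"}

# No shared titles: the rule of sum applies and gives 3 + 5 = 8.
print(count_either(colleague_1, disjoint_5))
# All three titles are also owned by the second colleague: only 5 distinct books.
print(count_either(colleague_1, overlap_5))
```

The two calls realize the extreme values n = 8 and n = 5; any partial overlap gives a value in between.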
The following example introduces our second principle of counting.
EXAMPLE 1.4
In trying to reach a decision on plant expansion, an administrator assigns 12 of her employees
to two committees. Committee A consists of five members and is to investigate possible
favorable results from such an expansion. The other seven employees, committee B, will
scrutinize possible unfavorable repercussions. Should the administrator decide to speak to
just one committee member before making her decision, then by the rule of sum there are
12 employees she can call upon for input. However, to be a bit more unbiased, she decides
to speak with a member of committee A on Monday, and then with a member of committee
B on Tuesday, before reaching a decision. Using the following principle, we find that she
can select two such employees to speak with in 5 × 7 = 35 ways.
The Rule of Product: If a procedure can be broken down into first and second stages,
and if there are m possible outcomes for the first stage and if, for each of these outcomes,
there are n possible outcomes for the second stage, then the total procedure can be carried
out, in the designated order, in mn ways.
EXAMPLE 1.5
The drama club of Central University is holding tryouts for a spring play. With six men and
eight women auditioning for the leading male and female roles, by the rule of product the
director can cast his leading couple in 6 × 8 = 48 ways.
EXAMPLE 1.6
Here various extensions of the rule are illustrated by considering the manufacture of license
plates consisting of two letters followed by four digits.
a) If no letter or digit can be repeated, there are 26 × 25 × 10 × 9 × 8 × 7 =
3,276,000 different possible plates.
b) With repetitions of letters and digits allowed, 26 × 26 × 10 × 10 × 10 × 10 =
6,760,000 different license plates are possible.
c) If repetitions are allowed, as in part (b), how many of the plates have only vowels (A,
E, I, O, U) and even digits? (0 is an even integer.)
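Parts (a) and (b) of Example 1.6 can be reproduced with a minimal Python sketch of the rule of product. The helper names `stage_choices` and `plate_count` are illustrative assumptions, not notation from the text; each position on the plate is treated as one stage, and the stage counts are multiplied.

```python
from math import prod

def stage_choices(pool_size, positions, allow_repeats):
    """Number of choices at each successive position filled from one pool."""
    if allow_repeats:
        return [pool_size] * positions
    # Without repetition, each filled position removes one option.
    return [pool_size - used for used in range(positions)]

def plate_count(allow_repeats):
    """Two letters followed by four digits, as in Example 1.6."""
    stages = (stage_choices(26, 2, allow_repeats)
              + stage_choices(10, 4, allow_repeats))
    return prod(stages)  # rule of product: multiply the stage counts

print(plate_count(allow_repeats=False))  # part (a)
print(plate_count(allow_repeats=True))   # part (b)
```

Part (c) is left to the reader, as in the text; it would only change the two pool sizes.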
EXAMPLE 1.7
In order to store data, a computer’s main memory contains a large collection of circuits, each
of which is capable of storing a bit, that is, one of the binary digits 0 or 1. These storage
circuits are arranged in units called (memory) cells. To identify the cells in a computer’s
main memory, each is assigned a unique name called its address. For some computers,
such as embedded microcontrollers (as found in the ignition system for an automobile), an
address is represented by an ordered list of eight bits, collectively referred to as a byte. Using
the rule of product, there are 2 × 2 × 2 × 2 × 2 × 2 × 2 × 2 = 2⁸ = 256 such bytes. So
we have 256 addresses that may be used for cells where certain information may be stored.
A kitchen appliance, such as a microwave oven, incorporates an embedded microcontroller.
These “small computers” (such as the PICmicro microcontroller) contain thousands
of memory cells and use two-byte addresses to identify these cells in their main memory.
Such addresses are made up of two consecutive bytes, or 16 consecutive bits. Thus there
are 256 × 256 = 2⁸ × 2⁸ = 2¹⁶ = 65,536 available addresses that could be used to identify
cells in the main memory. Other computers use addressing systems of four bytes. This
32-bit architecture is presently used in the Pentium* processor, where there are as many
as 2⁸ × 2⁸ × 2⁸ × 2⁸ = 2³² = 4,294,967,296 addresses for use in identifying the cells in
main memory. When a programmer deals with the UltraSPARC† or Itanium‡ processors, he
or she considers memory cells with eight-byte addresses. Each of these addresses comprises
8 × 8 = 64 bits, and there are 2⁶⁴ = 18,446,744,073,709,551,616 possible addresses for
this architecture. (Of course, not all of these possibilities are actually used.)
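The address counts in Example 1.7 follow from applying the rule of product once per bit. A small Python sketch, with the function name `address_count` chosen here only for illustration, lists all one-byte addresses explicitly and then computes the larger counts by formula.

```python
from itertools import product

def address_count(num_bytes):
    """Number of distinct addresses that num_bytes bytes can represent."""
    bits = 8 * num_bytes
    return 2 ** bits  # two choices (0 or 1) for each bit

# Brute-force check for a single byte: list every string of eight bits.
all_bytes = ["".join(bits) for bits in product("01", repeat=8)]
assert len(all_bytes) == address_count(1)

for num_bytes in (1, 2, 4, 8):
    print(f"{num_bytes}-byte addresses: {address_count(num_bytes):,}")
```

Listing the addresses explicitly is practical only for the one-byte case; for the wider addresses the formula is the sensible route.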
EXAMPLE 1.8
At times it is necessary to combine several different counting principles in the solution of
one problem. Here we find that the rules of both sum and product are needed to attain the
answer.
At the AWL corporation Mrs. Foster operates the Quick Snack Coffee Shop. The menu
at her shop is limited: six kinds of muffins, eight kinds of sandwiches, and five beverages
(hot coffee, hot tea, iced tea, cola, and orange juice). Ms. Dodd, an editor at AWL, sends
her assistant Carl to the shop to get her lunch: either a muffin and a hot beverage or a
sandwich and a cold beverage.
By the rule of product, there are 6 × 2 = 12 ways in which Carl can purchase a muffin and
hot beverage. A second application of this rule shows that there are 8 × 3 = 24 possibilities
for a sandwich and cold beverage. So by the rule of sum, there are 12 + 24 = 36 ways in
which Carl can purchase Ms. Dodd’s lunch.
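The count in Example 1.8 can be confirmed by listing every possible lunch. In the sketch below the muffin and sandwich names are placeholders (the text gives only how many kinds there are); the beverage names are those of the example.

```python
from itertools import product

muffins = [f"muffin {i}" for i in range(1, 7)]        # six kinds, placeholder names
sandwiches = [f"sandwich {i}" for i in range(1, 9)]   # eight kinds, placeholder names
hot = ["hot coffee", "hot tea"]
cold = ["iced tea", "cola", "orange juice"]

# Rule of product within each kind of lunch.
muffin_lunches = list(product(muffins, hot))
sandwich_lunches = list(product(sandwiches, cold))

# Rule of sum across the two kinds, which cannot occur simultaneously.
lunches = muffin_lunches + sandwich_lunches
print(len(muffin_lunches), len(sandwich_lunches), len(lunches))  # 12 24 36
```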
* Pentium (R) is a registered trademark of the Intel Corporation.
† The UltraSPARC processor is manufactured by Sun (R) Microsystems, Inc.
‡ Itanium (TM) is a trademark of the Intel Corporation.
1.2
Permutations
Continuing to examine applications of the rule of product, we turn now to counting linear
arrangements of objects. These arrangements are often called permutations when the objects
are distinct. We shall develop some systematic methods for dealing with linear arrangements,
starting with a typical example.
EXAMPLE 1.9
In a class of 10 students, five are to be chosen and seated in a row for a picture. How many
such linear arrangements are possible?
The key word here is arrangement, which designates the importance of order. If A, B,
C, ..., I, J denote the 10 students, then BCEFI, CEFIB, and ABCFG are three such different
arrangements, even though the first two involve the same five students.
To answer this question, we consider the positions and possible numbers of students we
can choose from in order to fill each position. The filling of a position is a stage of our
procedure.
     10     ×      9      ×      8      ×      7      ×      6
    1st           2nd           3rd           4th           5th
  position      position      position      position      position
Each of the 10 students can occupy the 1st position in the row. Because repetitions are
not possible here, we can select only one of the nine remaining students to fill the 2nd
position. Continuing in this way, we find only six students to select from in order to fill the
5th and final position. This yields a total of 30,240 possible arrangements of five students
selected from the class of 10.
Exactly the same answer is obtained if the positions are filled from right to left,
namely, 6 × 7 × 8 × 9 × 10. If the 3rd position is filled first, the 1st position second, the
4th position third, the 5th position fourth, and the 2nd position fifth, then the answer is
9 × 6 × 10 × 8 × 7, still the same value, 30,240.
As in Example 1.9, the product of certain consecutive positive integers often comes
into play in enumeration problems. Consequently, the following notation proves to be quite
useful when we are dealing with such counting problems. It will frequently allow us to
express our answers in a more convenient form.
Definition 1.1 For an integer n ≥ 0, n factorial (denoted n!) is defined by
0! = 1,
n! = (n)(n − 1)(n − 2) ··· (3)(2)(1), for n ≥ 1.
One finds that 1! = 1, 2! = 2, 3! = 6, 4! = 24, and 5! = 120. In addition, for each n ≥ 0,
(n + 1)! = (n + 1)(n!).
Before we proceed any further, let us try to get a somewhat better appreciation for how
fast n! grows. We can calculate that 10! = 3,628,800, and it just so happens that this is
exactly the number of seconds in six weeks. Consequently, 11! exceeds the number of
seconds in one year, 12! exceeds the number in 12 years, and 13! surpasses the number of
seconds in a century.
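The growth claims above are easy to check numerically. The following Python sketch computes n! directly from Definition 1.1 and compares it with numbers of seconds; it assumes a 365-day year and ignores leap days, which does not affect any of the comparisons.

```python
from math import factorial

def fact(n):
    """n! from Definition 1.1: 0! = 1 and n! = (n)(n - 1)...(2)(1) for n >= 1."""
    result = 1
    for k in range(1, n + 1):
        result *= k
    return result

SECONDS_PER_DAY = 24 * 60 * 60
SIX_WEEKS = 6 * 7 * SECONDS_PER_DAY
YEAR = 365 * SECONDS_PER_DAY  # assumption: 365-day year, leap days ignored

print(fact(10) == SIX_WEEKS)    # 10! is exactly the seconds in six weeks
print(fact(11) > YEAR)          # 11! exceeds one year
print(fact(12) > 12 * YEAR)     # 12! exceeds 12 years
print(fact(13) > 100 * YEAR)    # 13! exceeds a century
```

The loop agrees with the library function `math.factorial`, which is the better choice in practice.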
If we make use of the factorial notation, the answer in Example 1.9 can be expressed in
the following more compact form:
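\[
10 \times 9 \times 8 \times 7 \times 6
= 10 \times 9 \times 8 \times 7 \times 6 \times \frac{5 \times 4 \times 3 \times 2 \times 1}{5 \times 4 \times 3 \times 2 \times 1}
= \frac{10!}{5!}.
\]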
Definition 1.2 Given a collection of n distinct objects, any (linear) arrangement of these objects is called
a permutation of the collection.
Starting with the letters a, b, c, there are six ways to arrange, or permute, all of the letters:
abc, acb, bac, bca, cab, cba. If we are interested in arranging only two of the letters at a
time, there are six such size-2 permutations: ab, ba, ac, ca, bc, cb.
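If there are n distinct objects and r is an integer, with 1 ≤ r ≤ n, then by the rule of
product, the number of permutations of size r for the n objects is
\[
\begin{aligned}
P(n, r) &= \underbrace{n}_{\text{1st position}} \times \underbrace{(n-1)}_{\text{2nd position}} \times \underbrace{(n-2)}_{\text{3rd position}} \times \cdots \times \underbrace{(n-r+1)}_{r\text{th position}} \\
&= (n)(n-1)(n-2)\cdots(n-r+1) \times \frac{(n-r)(n-r-1)\cdots(3)(2)(1)}{(n-r)(n-r-1)\cdots(3)(2)(1)} \\
&= \frac{n!}{(n-r)!}.
\end{aligned}
\]
For r = 0, P(n, 0) = 1 = n!/(n − 0)!, so P(n, r) = n!/(n − r)! holds for all 0 ≤ r ≤ n.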
A special case of this result is Example 1.9, where n = 10, r = 5, and P(10, 5) = 30,240.
When permuting all of the n objects in the collection, we have r = n and find that P(n, n) =
n!/0! = n!.
Note, for example, that if n ≥ 2, then P(n, 2) = n!/(n − 2)! = n(n − 1). When n ≥ 3
one finds that P(n, n − 3) = n!/[n − (n − 3)]! = n!/3! = (n)(n − 1)(n − 2) ··· (5)(4).
The number of permutations of size r, where 0 ≤ r ≤ n, from a collection of n objects,
is P(n, r) = n!/(n − r)!. (Remember that P(n, r) counts (linear) arrangements in which
the objects cannot be repeated.) However, if repetitions are allowed, then by the rule of
product there are nʳ possible arrangements, with r ≥ 0.
EXAMPLE 1.10
The number of permutations of the letters in the word COMPUTER is 8!. If only five of the
letters are used, the number of permutations (of size 5) is P(8, 5) = 8!/(8 − 5)! = 8!/3! =
6720. If repetitions of letters are allowed, the number of possible 12-letter sequences is
8¹² ≐ 6.872 × 10¹⁰.
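The formula P(n, r) = n!/(n − r)! and the count nʳ for arrangements with repetition can be compared against explicit listings. This is a minimal Python sketch using the standard `itertools` module; the function name `P` simply mirrors the notation of the text.

```python
from itertools import permutations, product
from math import factorial

def P(n, r):
    """Number of size-r permutations of n distinct objects, 0 <= r <= n."""
    return factorial(n) // factorial(n - r)

letters = "abc"
size_2 = ["".join(p) for p in permutations(letters, 2)]       # no repetition
with_repeats = ["".join(p) for p in product(letters, repeat=2)]  # repetition allowed

print(size_2, len(size_2) == P(3, 2))
print(len(with_repeats) == 3 ** 2)

# Values quoted in Examples 1.9 and 1.10.
print(P(10, 5), P(8, 8), P(8, 5), 8 ** 12)
```

The last line prints 30,240 for Example 1.9 and 8!, 6720, and the exact value of 8¹² for Example 1.10.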
EXAMPLE 1.11
Unlike Example 1.10, the number of (linear) arrangements of the four letters in BALL is
12, not 4! (= 24). The reason is that we do not have four distinct letters to arrange. To get
the 12 arrangements, we can list them as in Table 1.1(a).
The symbol “≐” is read “is approximately equal to.”
Table 1.1
(a)            (b)
A B L L        A B L₁ L₂    A B L₂ L₁
A L B L        A L₁ B L₂    A L₂ B L₁
A L L B        A L₁ L₂ B    A L₂ L₁ B
B A L L        B A L₁ L₂    B A L₂ L₁
B L A L        B L₁ A L₂    B L₂ A L₁
B L L A        B L₁ L₂ A    B L₂ L₁ A
L A B L        L₁ A B L₂    L₂ A B L₁
L A L B        L₁ A L₂ B    L₂ A L₁ B
L B A L        L₁ B A L₂    L₂ B A L₁
L B L A        L₁ B L₂ A    L₂ B L₁ A
L L A B        L₁ L₂ A B    L₂ L₁ A B
L L B A        L₁ L₂ B A    L₂ L₁ B A
If the two L’s are distinguished as L₁, L₂, then we can use our previous ideas on permutations
of distinct objects; with the four distinct symbols B, A, L₁, L₂, we have 4! = 24
permutations. These are listed in Table 1.1(b). Table 1.1 reveals that for each arrangement
in which the L’s are indistinguishable there corresponds a pair of permutations with distinct
L’s. Consequently,
2 × (Number of arrangements of the letters B, A, L, L)
= (Number of permutations of the symbols B, A, L₁, L₂),
and the answer to the original problem of finding all the arrangements of the four letters in
BALL is 4!/2 = 12.
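The two-to-one correspondence in Table 1.1 can be reproduced by machine. In this Python sketch the labels "L1" and "L2" stand in for L₁ and L₂; each of the 24 labelled permutations is grouped under the arrangement obtained by dropping the subscripts.

```python
from collections import defaultdict
from itertools import permutations

labelled = ["B", "A", "L1", "L2"]   # "L1", "L2" play the roles of L₁, L₂

groups = defaultdict(list)
for perm in permutations(labelled):
    unlabelled = "".join(symbol[0] for symbol in perm)  # drop the subscripts
    groups[unlabelled].append(perm)

print(len(groups))                                   # distinct arrangements of BALL
print(all(len(pair) == 2 for pair in groups.values()))  # each arises from a pair
print(sorted(groups))
```

The sorted keys should match column (a) of Table 1.1, and each group holds the corresponding pair from column (b).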
EXAMPLE 1.12
Using the idea developed in Example 1.11, we now consider the arrangements of all nine
letters in DATABASES.
There are 3! = 6 arrangements with the A’s distinguished for each arrangement in
which the A’s are not distinguished. For example, DA₁TA₂BA₃SES, DA₁TA₃BA₂SES,
DA₂TA₁BA₃SES, DA₂TA₃BA₁SES, DA₃TA₁BA₂SES, and DA₃TA₂BA₁SES all correspond
to DATABASES, when we remove the subscripts on the A’s. In addition, to the arrangement
DA₁TA₂BA₃SES there corresponds the pair of permutations DA₁TA₂BA₃S₁ES₂ and
DA₁TA₂BA₃S₂ES₁, when the S’s are distinguished. Consequently,
(2!)(3!)(Number of arrangements of the letters in DATABASES)
= (Number of permutations of the symbols D, A₁, T, A₂, B, A₃, S₁, E, S₂),
so the number of arrangements of the nine letters in DATABASES is 9!/(2! 3!) = 30,240.
Before stating a general principle for arrangements with repeated symbols, note that in our
prior two examples we solved a new type of problem by relating it to previous enumeration
principles. This practice is common in mathematics in general, and often occurs in the
derivations of discrete and combinatorial formulas.
If there are n objects with n₁ indistinguishable objects of a first type, n₂ indistinguishable
objects of a second type, ..., and nᵣ indistinguishable objects of an rth type, where
n₁ + n₂ + ··· + nᵣ = n, then there are
\[
\frac{n!}{n_1!\, n_2! \cdots n_r!}
\]
(linear) arrangements of the given n objects.
EXAMPLE 1.13
The MASSASAUGA is a brown and white venomous snake indigenous to North America.
Arranging all of the letters in MASSASAUGA, we find that there are
\[
\frac{10!}{4!\, 3!\, 1!\, 1!\, 1!} = 25{,}200
\]
possible arrangements. Among these are
\[
\frac{7!}{3!\, 1!\, 1!\, 1!\, 1!} = 840
\]
in which all four A’s are together. To get this last result, we considered all arrangements of
the seven symbols AAAA (one symbol), S, S, S, M, U, G.
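The general principle for arrangements with repeated symbols translates directly into a few lines of Python. The function name `arrangements` is an illustrative choice; it counts the repetitions with `collections.Counter` and divides n! by the factorial of each count. Treating AAAA as a single symbol is done by passing a list rather than a string.

```python
from collections import Counter
from math import factorial

def arrangements(symbols):
    """n! / (n1! n2! ... nr!) for a sequence with repeated symbols."""
    total = factorial(len(symbols))
    for count in Counter(symbols).values():
        total //= factorial(count)  # exact at every step
    return total

print(arrangements("BALL"))        # Example 1.11
print(arrangements("DATABASES"))   # Example 1.12
print(arrangements("MASSASAUGA"))  # Example 1.13
# All four A's together: AAAA is one symbol among seven.
print(arrangements(["AAAA", "S", "S", "S", "M", "U", "G"]))
```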
EXAMPLE 1.14
Determine the number of (staircase) paths in the xy-plane from (2, 1) to (7, 4), where each
such path is made up of individual steps going one unit to the right (R) or one unit upward
(U). The blue lines in Fig. 1.1 show two of these paths.
[Figure 1.1: two staircase paths from (2, 1) to (7, 4) in the xy-plane. (a) R, U, R, R, U, R, R, U (b) U, R, R, R, U, U, R, R]
Beneath each path in Fig. 1.1 we have listed the individual steps. For example, in part
(a) the list R, U, R, R, U, R, R, U indicates that starting at the point (2, 1), we first move
one unit to the right [to (3, 1)], then one unit upward [to (3, 2)], followed by two units to
the right [to (5, 2)], and so on, until we reach the point (7, 4). The path consists of five R’s
for moves to the right and three U’s for moves upward.
The path in part (b) of the figure is also made up of five R’s and three U’s. In general,
the overall trip from (2, 1) to (7, 4) requires 7 — 2 = 5 horizontal moves to the right and
4 − 1 = 3 vertical moves upward. Consequently, each path corresponds to a list of five
R’s and three U’s, and the solution for the number of paths emerges as the number of
arrangements of the five R’s and three U’s, which is 8!/(5! 3!) = 56.
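The count of staircase paths can be cross-checked without the arrangement formula. The Python sketch below counts paths recursively (a path either starts with R or with U) and also follows the two step lists shown in Fig. 1.1; the function names `count_paths` and `follow` are assumptions made for this illustration.

```python
from functools import lru_cache
from math import factorial

def count_paths(start, end):
    """Count paths from start to end using unit steps right (R) or up (U)."""
    (x0, y0), (x1, y1) = start, end

    @lru_cache(maxsize=None)
    def walk(x, y):
        if x > x1 or y > y1:
            return 0            # overshot the target
        if (x, y) == (x1, y1):
            return 1            # arrived
        return walk(x + 1, y) + walk(x, y + 1)

    return walk(x0, y0)

def follow(start, steps):
    """Endpoint reached from start by a string of R and U steps."""
    x, y = start
    for step in steps:
        x, y = (x + 1, y) if step == "R" else (x, y + 1)
    return (x, y)

print(count_paths((2, 1), (7, 4)))
print(factorial(8) // (factorial(5) * factorial(3)))
print(follow((2, 1), "RURRURRU"), follow((2, 1), "URRRUURR"))
```

Both counts should agree with 8!/(5! 3!), and both sample paths should end at (7, 4).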
EXAMPLE 1.15
We now do something a bit more abstract and prove that if n and k are positive integers with
n = 2k, then n!/2ᵏ is an integer. Because our argument relies on counting, it is an example
of a combinatorial proof.
Consider the n symbols x₁, x₁, x₂, x₂, ..., xₖ, xₖ. The number of ways in which we can
arrange all of these n = 2k symbols is an integer that equals
\[
\frac{n!}{\underbrace{2!\, 2! \cdots 2!}_{k \text{ factors of } 2!}} = \frac{n!}{2^k}.
\]
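A numerical check is no substitute for the combinatorial proof, but it shows the argument at work for small k. In this Python sketch the integers 1, ..., k stand in for x₁, ..., xₖ, each appearing twice, and the distinct arrangements are counted by brute force.

```python
from itertools import permutations
from math import factorial

def paired_arrangements(k):
    """Count distinct arrangements of x1, x1, x2, x2, ..., xk, xk by listing them."""
    symbols = [i for i in range(1, k + 1) for _ in range(2)]
    return len(set(permutations(symbols)))

for k in range(1, 5):
    n = 2 * k
    assert factorial(n) % 2 ** k == 0                          # n!/2^k is an integer
    assert paired_arrangements(k) == factorial(n) // 2 ** k    # and it counts arrangements
    print(k, factorial(n) // 2 ** k)
```

The brute-force listing grows like n!, so it is kept to k ≤ 4 here.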
Finally, we will apply what has been developed so far to a situation in which the arrange-
ments are no longer linear.
EXAMPLE 1.16
If six people, designated as A, B, ..., F, are seated about a round table, how many different
circular arrangements are possible, if arrangements are considered the same when one can
be obtained from the other by rotation? [In Fig. 1.2, arrangements (a) and (b) are considered
identical, whereas (b), (c), and (d) are three distinct arrangements.]
[Figure 1.2: panels (a), (b), (c), and (d), each showing A, B, ..., F seated around a circular table]
We shall try to relate this problem to previous ones we have already encountered. Con-
sider Figs. 1.2(a) and (b). Starting at the top of the circle and moving clockwise, we list
the distinct linear arrangements ABEFCD and CDABEF, which correspond to the same
circular arrangement. In addition to these two, four other linear arrangements (BEFCDA,
DABEFC, EFCDAB, and FCDABE) are found to correspond to the same circular arrangement
as in (a) or (b). So inasmuch as each circular arrangement corresponds to six
linear arrangements, we have 6 × (Number of circular arrangements of A, B, ..., F) =
(Number of linear arrangements of A, B, ..., F) = 6!.
Consequently, there are 6!/6 = 5! = 120 arrangements of A, B, ..., F around the circular
table.
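The six-to-one correspondence between linear and circular arrangements can be verified by brute force. In the Python sketch below, each linear arrangement is rotated so that A comes first; the function name `canonical` is an assumption for this illustration. Two linear arrangements describe the same circular arrangement exactly when their rotated forms agree.

```python
from itertools import permutations

def canonical(seating):
    """Rotate a seating (read clockwise) so that A comes first."""
    i = seating.index("A")
    return seating[i:] + seating[:i]

people = tuple("ABCDEF")
linear = list(permutations(people))
circular = {canonical(seating) for seating in linear}

print(len(linear), len(circular))
# Parts (a) and (b) of Fig. 1.2 are the same circular arrangement.
print(canonical(tuple("ABEFCD")) == canonical(tuple("CDABEF")))
```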
EXAMPLE 1.17
Suppose now that the six people of Example 1.16 are three married couples and that A, B,
and C are the females. We want to arrange the six people around the table so that the sexes
alternate. (Once again, arrangements are considered identical if one can be obtained from
the other by rotation.)
Before we solve this problem, let us solve Example 1.16 by an alternative method,
which will assist us in solving our present problem. If we place A at the table as shown in
Fig. 1.3(a), five locations (clockwise from A) remain to be filled. Using B, C, ..., F to fill
these five positions is the problem of permuting B, C, ..., F in a linear manner, and this
can be done in 5! = 120 ways.
[Figure 1.3: (a) A seated at the top of the table, with the remaining seats numbered 1 through 5 clockwise; (b) A seated at the top, with the remaining seats labeled M1, F2, M2, F3, M3 clockwise]
To solve the new problem of alternating the sexes, consider the method shown in
Fig. 1.3(b). A (a female) is placed as before. The next position, clockwise from A, is marked
M1 (Male 1) and can be filled in three ways. Continuing clockwise from A, position F2
(Female 2) can be filled in two ways. Proceeding in this manner, by the rule of product,
there are 3 × 2 × 2 × 1 × 1 = 12 ways in which these six people can be arranged with no
two men or women seated next to each other.
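The method of Fig. 1.3, seating A first and then filling the remaining places clockwise, lends itself to a brute-force check. The Python sketch below assumes, as the example implies, that D, E, and F are the three males; fixing A in the first seat removes the rotations, and the remaining five people are permuted.

```python
from itertools import permutations

females = set("ABC")
males = set("DEF")   # implied by the example: the other three people

def alternates(seating):
    """True if no two neighbours around the table are of the same sex."""
    n = len(seating)
    return all(
        (seating[i] in females) != (seating[(i + 1) % n] in females)
        for i in range(n)
    )

# Seat A first, then try every clockwise ordering of the other five.
valid = [("A",) + rest for rest in permutations("BCDEF") if alternates(("A",) + rest)]
print(len(valid))
```

The count should agree with the product 3 × 2 × 2 × 1 × 1 obtained in the text.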
EXERCISES 1.1 AND 1.2
1. During a local campaign, eight Republican and five Democratic candidates are nominated
for president of the school board.
a) If the president is to be one of these candidates, how many possibilities are there
for the eventual winner?
b) How many possibilities exist for a pair of candidates (one from each party) to oppose
each other for the eventual election?
c) Which counting principle is used in part (a)? in part (b)?
2. Answer part (c) of Example 1.6.
3. Buick automobiles come in four models, 12 colors, three engine sizes, and two transmission
types. (a) How many distinct Buicks can be manufactured? (b) If one of the available colors
is blue, how many different blue Buicks can be manufactured?
4. The board of directors of a pharmaceutical corporation has 10 members. An upcoming
stockholders’ meeting is scheduled to approve a new slate of company officers (chosen from
the 10 board members).
a) How many different slates consisting of a president, vice president, secretary, and
treasurer can the board present to the stockholders for their approval?
b) Three members of the board of directors are physicians. How many slates from part (a)
have (i) a physician nominated for the presidency? (ii) exactly one physician appearing
on the slate? (iii) at least one physician appearing on the slate?
5. While on a Saturday shopping spree Jennifer and Tiffany witnessed two men driving away
from the front of a jewelry shop, just before a burglar alarm started to sound. Although
everything happened rather quickly, when the two young ladies were questioned they were able
to give the police the following information about the license plate (which consisted of two
letters followed by four digits) on the get-away car. Tiffany was sure that the second letter
on the plate was either an O or a Q and the last digit was either a 3 or an 8. Jennifer told
the investigator that the first letter on the plate was either a C or a G and that the first
digit was definitely a 7. How many different license plates will the police have to check out?
6. To raise money for a new municipal pool, the chamber of commerce in a certain city sponsors
a race. Each participant pays a $5 entrance fee and has a chance to win one of the
different-sized trophies that are to be awarded to the first eight runners who finish.
a) If 30 people enter the race, in how many ways will it be possible to award the trophies?
b) If Roberta and Candice are two participants in the race, in how many ways can the
trophies be awarded with these two runners among the top three?
7. A certain “Burger Joint” advertises that a customer can have his or her hamburger with or
without any or all of the following: catsup, mustard, mayonnaise, lettuce, tomato, onion,
pickle, cheese, or mushrooms. How many different kinds of hamburger orders are possible?
8. Matthew works as a computer operator at a small univer- b) How many different round trips can Linda travel from
sity. One evening he finds that 12 computer programs have been town A to town C and back to town A?
submitted earlier that day for batch processing. In how many ¢) How many of the round trips in part (b) are such that
ways can Matthew order the processing of these programs if the return trip (from town C to town A) is at least partially
(a) there are no restrictions? (b) he considers four of the pro- different from the route Linda takes from town A to town
grams higher in priority than the other eight and wants to process C? (For example, if Linda travels from town A to town C
those four first? (c) he first separates the programs into four of along roads R, and Rg, then on her return she might take
top priority, five of lesser priority, and three of least priority, roads Rg and R3, or roads R7 and Ro, or road Ro, among
and he wishes to process the 12 programs in such a way that the other possibilities, but she does not travel on roads Rg
top-priority programs are processed first and the three programs and R;.)
of least priority are processed last?
12. List all the permutations for the letters a, c, t.
9. Patter’s Pastry Parlor offers eight different kinds of pastry 13. a) How many permutations are there for the eight letters
and six different kinds of muffins. In addition to bakery items
a, c, f, g, 1, t, w, x?
one can purchase small, medium, or large containers of the fol-
lowing beverages: coffee (black, with cream, with sugar, or with b) Consider the permutations in part (a). How many start
cream and sugar), tea (plain, with cream, with sugar, with cream with the letter t? How many start with the letter t and end
and sugar, with lemon, or with lemon and sugar), hot cocoa, and with the letter c?
orange juice. When Carol comes to Patter’s, in how many ways 14. Evaluate each of the following.
can she order a) P(7, 2) b) P(8, 4) c) P(10, 7) d) P(12, 3)
a) one bakery item and one medium-sized beverage for 15. In how many ways can the symbols a, b, c, d, e, e, e, e, e
herself? be arranged so that no e is adjacent to another e?
b) one bakery item and one container of coffee for herself 16. An alphabet of 40 symbols is used for transmitting messages
and one muffin and one container of tea for her boss, Ms. in a communication system. How many distinct messages (lists
Didio? of symbols) of 25 symbols can the transmitter generate if sym-
c) one piece of pastry and one container of tea for herself, bols can be repeated in the message? How many if 10 of the
one muffin and a container of orange juice for Ms. Didio, 40 symbols can appear only as the first and/or last symbols of
and one bakery item and one container of coffee for each the message, the other 30 symbols can appear anywhere, and
of her two assistants, Mr. Talbot and Mrs. Gillis? repetitions of all symbols are allowed?
10. Pamela has 15 different books. In how many ways can she 17. In the Internet each network interface of a computer is as-
place her books on two shelves so that there is at least one book signed one, or more, Internet addresses. The nature of these
on each shelf? (Consider the books in each arrangement to be Internet addresses is dependent on network size. For the In-
stacked one next to the other, with the first book on each shelf ternet Standard regarding reserved network numbers (STD 2),
at the left of the shelf.) each address is a 32-bit string which falls into one of the fol-
lowing three classes: (1) A class A address, used for the largest
11. Three small towns, designated by A, B, and C, are inter-
networks, begins with a 0 which is then followed by a seven-bit
connected by a system of two-way roads, as shown in Fig. 1.4.
network number, and then a 24-bit local address. However, one
is restricted from using the network numbers of all 0’s or all
1’s and the local addresses of all 0’s or all 1’s. (2) The class
B address is meant for an intermediate-sized network. This ad-
dress starts with the two-bit string 10, which is followed by a
14-bit network number and then a 16-bit local address. But the
local addresses of all 0’s or all 1’s are not permitted. (3) Class C
addresses are used for the smallest networks. These addresses
consist of the three-bit string 110, followed by a 21-bit network
number, and then an eight-bit local address. Once again the local
addresses of all 0’s or all 1’s are excluded. How many different
addresses of each class are available on the Internet, for this
Internet Standard?
Figure 1.4 18. Morgan is considering the purchase of a low-end computer
system. After some careful investigating, she finds that there are
a) In how many ways can Linda travel from town A to seven basic systems (each consisting of a monitor, CPU, key-
town C? board, and mouse) that meet her requirements. Furthermore, she
1.2 Permutations 13
also plans to buy one of four modems, one of three CD ROM
drives, and one of six printers. (Here each peripheral device of
a given type, such as the modem, is compatible with all seven
basic systems.) In how many ways can Morgan configure her
low-end computer system?
19. A computer science professor has seven different programming
books on a bookshelf. Three of the books deal with C++,
the other four with Java. In how many ways can the professor
arrange these books on the shelf (a) if there are no restrictions?
(b) if the languages should alternate? (c) if all the C++ books
must be next to each other? (d) if all the C++ books must be
next to each other and all the Java books must be next to each
other?
20. Over the Internet, data are transmitted in structured blocks
of bits called datagrams.
a) In how many ways can the letters in DATAGRAM be
arranged?
b) For the arrangements of part (a), how many have all
three A’s together?
21. a) How many arrangements are there of all the letters in
SOCIOLOGICAL?
b) In how many of the arrangements in part (a) are A and
G adjacent?
c) In how many of the arrangements in part (a) are all the
vowels adjacent?
22. How many positive integers n can we form using the digits
3, 4, 4, 5, 5, 6, 7 if we want n to exceed 5,000,000?
23. Twelve clay targets (identical in shape) are arranged in four
hanging columns, as shown in Fig. 1.5. There are four red targets
in the first column, three white ones in the second column,
two green targets in the third column, and three blue ones in
the fourth column. To join her college drill team, Deborah must
break all 12 of these targets (using her pistol and only 12 bullets)
and in so doing must always break the existing target at
the bottom of a column. Under these conditions, in how many
different orders can Deborah shoot down (and break) the 12
targets?
24. Show that for all integers n, r ≥ 0, if n + 1 > r, then
$$P(n+1, r) = \left(\frac{n+1}{n+1-r}\right) P(n, r).$$
25. Find the value(s) of n in each of the following:
(a) P(n, 2) = 90, (b) P(n, 3) = 3P(n, 2), and
(c) 2P(n, 2) + 50 = P(2n, 2).
26. How many different paths in the xy-plane are there from
(0, 0) to (7, 7) if a path proceeds one step at a time by going
either one space to the right (R) or one space upward (U)?
How many such paths are there from (2, 7) to (9, 14)? Can any
general statement be made that incorporates these two results?
27. a) How many distinct paths are there from (−1, 2, 0) to
(1, 3, 7) in Euclidean three-space if each move is one of
the following types?
(H): (x, y, z) → (x + 1, y, z);
(V): (x, y, z) → (x, y + 1, z);
(A): (x, y, z) → (x, y, z + 1)
b) How many such paths are there from (1, 0, 5) to
(8, 1, 7)?
c) Generalize the results in parts (a) and (b).
28. a) Determine the value of the integer variable counter after
execution of the following program segment. (Here i,
j, and k are integer variables.)
    counter := 0
    for i := 1 to 12 do
        counter := counter + 1
    for j := 5 to 10 do
        counter := counter + 2
    for k := 15 downto 8 do
        counter := counter + 3
b) Which counting principle is at play in part (a)?
29. Consider the following program segment where i, j, and k
are integer variables.
    for i := 1 to 12 do
        for j := 5 to 10 do
            for k := 15 downto 8 do
                print (i - j) * k
a) How many times is the print statement executed?
b) Which counting principle is used in part (a)?
30. A sequence of letters of the form abcba, where the expression
is unchanged upon reversing order, is an example of a
palindrome (of five letters). (a) If a letter may appear more than
twice, how many palindromes of five letters are there? of six
letters? (b) Repeat part (a) under the condition that no letter
appears more than twice.
[Figure 1.6: panels (a), (b), and (c), seatings of eight people about a square table]
31. Determine the number of six-digit integers (no leading zeros)
in which (a) no digit may be repeated; (b) digits may be
repeated. Answer parts (a) and (b) with the extra condition that
the six-digit integer is (i) even; (ii) divisible by 5; (iii) divisible
by 4.
32. a) Provide a combinatorial argument to show that if n and
k are positive integers with n = 3k, then $n!/(3!)^k$ is an integer.
b) Generalize the result of part (a).
33. a) In how many possible ways could a student answer a
10-question true-false test?
b) In how many ways can the student answer the test in
part (a) if it is possible to leave a question unanswered in
order to avoid an extra penalty for a wrong answer?
34. How many distinct four-digit integers can one make from
the digits 1, 3, 3, 7, 7, and 8?
35. a) In how many ways can seven people be arranged about
a circular table?
b) If two of the people insist on sitting next to each other,
how many arrangements are possible?
36. a) In how many ways can eight people, denoted A,
B, ..., H be seated about the square table shown in Fig.
1.6, where Figs. 1.6(a) and 1.6(b) are considered the same
but are distinct from Fig. 1.6(c)?
b) If two of the eight people, say A and B, do not get along
well, how many different seatings are possible with A and
B not sitting next to each other?
37. Sixteen people are to be seated at two circular tables, one
of which seats 10 while the other seats six. How many different
seating arrangements are possible?
38. A committee of 15 (nine women and six men) is to be
seated at a circular table (with 15 seats). In how many ways can
the seats be assigned so that no two men are seated next to each
other?
39. Write a computer program (or develop an algorithm)
to determine whether there is a three-digit integer
abc (= 100a + 10b + c) where abc = a! + b! + c!.
1.3 Combinations: The Binomial Theorem
The standard deck of playing cards consists of 52 cards comprising four suits: clubs, di-
amonds, hearts, and spades. Each suit has 13 cards: ace, 2, 3, ... , 9, 10, jack, queen,
king. If we are asked to draw three cards from a standard deck, in succession and without
replacement, then by the rule of product there are
$$52 \times 51 \times 50 = \frac{52!}{49!} = P(52, 3)$$
possibilities, one of which is AH (ace of hearts), 9C (nine of clubs), KD (king of dia-
monds). If instead we simply select three cards at one time from the deck so that the order
of selection of the cards is no longer important, then the six permutations AH-9C-KD,
AH-KD-9C, 9C-AH-KD, 9C-KD-AH, KD-9C-AH, and KD-AH-9C all correspond to
just one (unordered) selection. Consequently, each selection, or combination, of three cards,
with no reference to order, corresponds to 3! permutations of three cards. In equation form
this translates into
$$\begin{aligned}
(3!) \times (\text{Number of selections of size 3 from a deck of 52})
&= \text{Number of permutations of size 3 for the 52 cards} \\
&= P(52, 3) = \frac{52!}{49!}.
\end{aligned}$$
Consequently, three cards can be drawn, without replacement, from a standard deck in
52!/(3! 49!) = 22,100 ways.
If we start with n distinct objects, each selection, or combination, of r of these objects,
with no reference to order, corresponds to r! permutations of size r from the n objects.
Thus the number of combinations of size r from a collection of size n is
$$C(n, r) = \frac{P(n, r)}{r!} = \frac{n!}{r!\,(n - r)!}, \qquad 0 \le r \le n.$$
In addition to C(n, r) the symbol $\binom{n}{r}$ is also frequently used. Both C(n, r) and $\binom{n}{r}$ are
sometimes read “n choose r.” Note that for all n ≥ 0, C(n, 0) = C(n, n) = 1. Further, for
all n ≥ 1, C(n, 1) = C(n, n − 1) = n. When 0 ≤ n < r, then C(n, r) = $\binom{n}{r}$ = 0.
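As a quick numerical check of the relation C(n, r) = P(n, r)/r!, the following Python sketch counts selections both by formula and by direct enumeration. The helper names `P` and `C` are chosen here only to mirror the notation of the text; they are illustrative and not part of any library.

```python
from itertools import combinations, permutations
from math import factorial

def P(n, r):
    """Number of permutations of size r from n distinct objects (0 if r > n)."""
    return factorial(n) // factorial(n - r) if 0 <= r <= n else 0

def C(n, r):
    """Number of combinations of size r: each one accounts for r! permutations."""
    return P(n, r) // factorial(r) if 0 <= r <= n else 0

# Three cards from a 52-card deck, as in the discussion above.
three_card_selections = C(52, 3)

# Direct enumeration on a small case (n = 6, r = 3) to confirm the r! relation.
small = range(6)
n_perms = sum(1 for _ in permutations(small, 3))
n_combs = sum(1 for _ in combinations(small, 3))
```

Here `three_card_selections` reproduces the 22,100 found above, and in the small case the number of permutations is exactly 3! times the number of combinations.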
A word to the wise! When dealing with any counting problem, we should ask ourselves
about the importance of order in the problem. When order is relevant, we think in terms
of permutations and arrangements and the rule of product. When order is not relevant,
combinations could play a key role in solving the problem.
EXAMPLE 1.18
A hostess is having a dinner party for some members of her charity committee. Because
of the size of her home, she can invite only 11 of the 20 committee members. Order is not
important, so she can invite “the lucky 11” in C(20, 11) = $\binom{20}{11}$ = 20!/(11! 9!) = 167,960
ways. However, once the 11 arrive, how she arranges them around her rectangular dining
table is an arrangement problem. Unfortunately, no part of the theory of combinations and
permutations can help our hostess deal with “the offended nine” who were not invited.
EXAMPLE 1.19
Lynn and Patti decide to buy a PowerBall ticket. To win the grand prize for PowerBall
one must match five numbers selected from 1 to 49 inclusive and then must also match
the powerball, an integer from 1 to 42 inclusive. Lynn selects the five numbers (between
1 and 49 inclusive). This she can do in $\binom{49}{5}$ ways (since matching does not involve order).
Meanwhile Patti selects the powerball; here there are $\binom{42}{1}$ possibilities. Consequently, by
the rule of product, Lynn and Patti can select the six numbers for their PowerBall ticket in
$\binom{49}{5}\binom{42}{1}$ = 80,089,128 ways.
EXAMPLE 1.20
a) A student taking a history examination is directed to answer any seven of 10 essay
questions. There is no concern about order here, so the student can answer the examination in
$$\binom{10}{7} = \frac{10!}{7!\,3!} = \frac{10 \times 9 \times 8}{3 \times 2 \times 1} = 120 \text{ ways.}$$
b) If the student must answer three questions from the first five and four questions from
the last five, three questions can be selected from the first five in $\binom{5}{3}$ = 10 ways, and
the other four questions can be selected in $\binom{5}{4}$ = 5 ways. Hence, by the rule of product,
the student can complete the examination in $\binom{5}{3}\binom{5}{4}$ = 10 × 5 = 50 ways.
c) Finally, should the directions on this examination indicate that the student must answer
seven of the 10 questions where at least three are selected from the first five, then there
are three cases to consider:
i) The student answers three of the first five questions and four of the last five: By
the rule of product this can happen in $\binom{5}{3}\binom{5}{4}$ = 10 × 5 = 50 ways, as in part (b).
ii) Four of the first five questions and three of the last five questions are selected by
the student: This can come about in $\binom{5}{4}\binom{5}{3}$ = 5 × 10 = 50 ways, again by the
rule of product.
iii) The student decides to answer all five of the first five questions and two of the
last five: The rule of product tells us that this last case can occur in $\binom{5}{5}\binom{5}{2}$ =
1 × 10 = 10 ways.
Combining the results for cases (i), (ii), and (iii), by the rule of sum we find that the
student can make $\binom{5}{3}\binom{5}{4} + \binom{5}{4}\binom{5}{3} + \binom{5}{5}\binom{5}{2}$ = 50 + 50 + 10 = 110 selections of seven (out
of 10) questions where each selection includes at least three of the first five questions.
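The three parts of Example 1.20 can be checked by listing every selection of seven questions and filtering. The sketch below assumes the questions are numbered 1 through 10, with 1 through 5 being "the first five"; the helper `first_five_count` is an illustrative name introduced here.

```python
from itertools import combinations
from math import comb

questions = range(1, 11)          # questions 1..10; the "first five" are 1..5

def first_five_count(selection):
    """How many of the selected questions come from questions 1..5."""
    return sum(1 for q in selection if q <= 5)

all_selections = list(combinations(questions, 7))
part_a = len(all_selections)
part_b = sum(1 for s in all_selections if first_five_count(s) == 3)
part_c = sum(1 for s in all_selections if first_five_count(s) >= 3)

# The rule-of-sum computation from part (c), one term per case (i), (ii), (iii).
part_c_formula = sum(comb(5, i) * comb(5, 7 - i) for i in range(3, 6))
```

Enumeration and the case-by-case formula agree: 120 selections in all, 50 for part (b), and 110 for part (c).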
EXAMPLE 1.21
a) At Rydell High School, the gym teacher must select nine girls from the junior and
senior classes for a volleyball team. If there are 28 juniors and 25 seniors, she can
make the selection in $\binom{53}{9}$ = 4,431,613,550 ways.
b) If two juniors and one senior are the best spikers and must be on the team, then the
rest of the team can be chosen in $\binom{50}{6}$ = 15,890,700 ways.
c) For a certain tournament the team must comprise four juniors and five seniors. The
teacher can select the four juniors in $\binom{28}{4}$ ways. For each of these selections she has
$\binom{25}{5}$ ways to choose the five seniors. Consequently, by the rule of product, she can
select her team in $\binom{28}{4}\binom{25}{5}$ = 1,087,836,750 ways for this particular tournament.
Some problems can be treated from the viewpoint of either arrangements or combina-
tions, depending on how one analyzes the situation. The following example demonstrates
this.
EXAMPLE 1.22
The gym teacher of Example 1.21 must make up four volleyball teams of nine girls each
from the 36 freshman girls in her P.E. class. In how many ways can she select these four
teams? Call the teams A, B, C, and D.
a) To form team A, she can select any nine girls from the 36 enrolled in $\binom{36}{9}$ ways. For
team B the selection process yields $\binom{27}{9}$ possibilities. This leaves $\binom{18}{9}$ and $\binom{9}{9}$ possible
ways to select teams C and D, respectively. So by the rule of product, the four teams
can be chosen in
$$\binom{36}{9}\binom{27}{9}\binom{18}{9}\binom{9}{9} = \left(\frac{36!}{9!\,27!}\right)\left(\frac{27!}{9!\,18!}\right)\left(\frac{18!}{9!\,9!}\right)\left(\frac{9!}{9!\,0!}\right) = \frac{36!}{9!\,9!\,9!\,9!} \approx 2.145 \times 10^{19} \text{ ways.}$$
b) For an alternative solution, consider the 36 students lined up as follows:
1st student, 2nd student, 3rd student, ..., 35th student, 36th student
To select the four teams, we must distribute nine A’s, nine B’s, nine C’s, and nine D’s in
the 36 spaces. The number of ways in which this can be done is the number of arrangements
of 36 letters comprising nine each of A, B, C, and D. This is now the familiar problem of
arrangements of nondistinct objects, and the answer is
$$\frac{36!}{9!\,9!\,9!\,9!}, \text{ as in part (a).}$$
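That the two viewpoints of Example 1.22 give the same count can be confirmed with exact integer arithmetic. This is only a numerical check, with variable names chosen here for illustration.

```python
from math import comb, factorial

# Part (a): choose team A, then B, then C, then D from those remaining.
by_selection = comb(36, 9) * comb(27, 9) * comb(18, 9) * comb(9, 9)

# Part (b): arrangements of 36 letters, nine each of A, B, C, and D.
by_arrangement = factorial(36) // factorial(9) ** 4
```

The two integers are identical, and their common value is approximately 2.145 × 10^19.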
Our next example points out how some problems require the concepts of both arrange-
ments and combinations for their solutions.
EXAMPLE 1.23
The number of arrangements of the letters in TALLAHASSEE is
$$\frac{11!}{3!\,2!\,2!\,2!\,1!\,1!} = 831{,}600.$$
How many of these arrangements have no adjacent A’s?
When we disregard the A’s, there are
$$\frac{8!}{2!\,2!\,2!\,1!\,1!} = 5040$$
ways to arrange the remaining letters. One of these 5040 ways is shown in the following
figure, where the arrows indicate nine possible locations for the three A’s.
↑ E ↑ E ↑ S ↑ T ↑ L ↑ L ↑ S ↑ H ↑
Three of these locations can be selected in $\binom{9}{3}$ = 84 ways, and because this is also possible
for all the other 5039 arrangements of E, E, S, T, L, L, S, H, by the rule of product there
are 5040 × 84 = 423,360 arrangements of the letters in TALLAHASSEE with no consecutive A’s.
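The "arrange the other letters, then choose gaps" argument of Example 1.23 can be sketched in Python. The function names below are hypothetical helpers written for this illustration; `brute_force` is practical only for short words and is used here to sanity-check the gap method on BANANA, a word not taken from the text.

```python
from collections import Counter
from itertools import permutations
from math import comb, factorial

def arrangements(word):
    """Distinct arrangements of the letters of word (nondistinct objects)."""
    total = factorial(len(word))
    for multiplicity in Counter(word).values():
        total //= factorial(multiplicity)
    return total

def no_adjacent(word, letter):
    """Arrangements with no two copies of `letter` adjacent, via the gap method."""
    rest = word.replace(letter, "")
    k = word.count(letter)
    # len(rest) + 1 gaps surround the other letters; choose k of them.
    return arrangements(rest) * comb(len(rest) + 1, k)

def brute_force(word, letter):
    """Direct count over distinct permutations; only practical for short words."""
    distinct = set(permutations(word))
    return sum(1 for p in distinct if letter * 2 not in "".join(p))

total = arrangements("TALLAHASSEE")
without_adjacent_As = no_adjacent("TALLAHASSEE", "A")
```

For TALLAHASSEE this gives 831,600 arrangements in all and 423,360 with no consecutive A’s, matching the example; on BANANA the gap method and the direct count agree.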
Before proceeding we need to introduce a concise way of writing the sum of a list of
n + 1 terms like $a_m, a_{m+1}, a_{m+2}, \ldots, a_{m+n}$, where m and n are integers and n ≥ 0. This
notation is called the Sigma notation because it involves the capital Greek letter Σ; we use
it to represent a summation by writing
$$a_m + a_{m+1} + a_{m+2} + \cdots + a_{m+n} = \sum_{i=m}^{m+n} a_i.$$
Here, the letter i is called the index of the summation, and this index accounts for all
integers starting with the lower limit m and continuing on up to (and including) the upper
limit m + n.
We may use this notation as follows.
1) $\sum_{i=3}^{7} a_i = a_3 + a_4 + a_5 + a_6 + a_7 = \sum_{j=3}^{7} a_j$, for there is nothing special about the
letter i.
2) $\sum_{i=1}^{4} i^2 = 1^2 + 2^2 + 3^2 + 4^2 = 30 = \sum_{k=0}^{4} k^2$, because $0^2 = 0$.
3) $\sum_{i=11}^{100} i^3 = 11^3 + 12^3 + 13^3 + \cdots + 100^3 = \sum_{j=12}^{101} (j - 1)^3 = \sum_{k=10}^{99} (k + 1)^3$.
4) $\sum_{i=7}^{10} 2i = 2(7) + 2(8) + 2(9) + 2(10) = 68 = 2(34) = 2(7 + 8 + 9 + 10) = 2\sum_{i=7}^{10} i$.
5) $\sum_{i=3}^{3} a_i = a_3 = \sum_{i=4}^{4} a_{i-1} = \sum_{i=2}^{2} a_{i+1}$.
6) $\sum_{i=1}^{5} a = a + a + a + a + a = 5a$.
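Furthermore, using this summation notation, we see that one can express the answer to
part (c) of Example 1.20 as
$$\binom{5}{3}\binom{5}{4} + \binom{5}{4}\binom{5}{3} + \binom{5}{5}\binom{5}{2} = \sum_{i=3}^{5} \binom{5}{i}\binom{5}{7-i}.$$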
We shall find use for this new notation in the following example and in many other places
throughout the remainder of this book.
EXAMPLE 1.24
In the studies of algebraic coding theory and the theory of computer languages, we consider
certain arrangements, called strings, made up from a prescribed alphabet of symbols. If the
prescribed alphabet consists of the symbols 0, 1, and 2, for example, then 01, 11, 21, 12,
and 20 are five of the nine strings of length 2. Among the 27 strings of length 3 are 000,
012, 202, and 110.
In general, if n is any positive integer, then by the rule of product there are $3^n$ strings of
length n for the alphabet 0, 1, and 2. If $x = x_1 x_2 x_3 \cdots x_n$ is one of these strings, we define the
weight of x, denoted wt(x), by wt(x) = $x_1 + x_2 + x_3 + \cdots + x_n$. For example, wt(12) = 3
and wt(22) = 4 for the case where n = 2; wt(101) = 2, wt(210) = 3, and wt(222) = 6 for
n = 3.
Among the $3^{10}$ strings of length 10, we wish to determine how many have even weight.
Such a string has even weight precisely when the number of 1’s in the string is even.
There are six different cases to consider. If the string x contains no 1’s, then each of the
10 locations in x can be filled with either 0 or 2, and by the rule of product there are $2^{10}$ such
strings. When the string contains two 1’s, the locations for these two 1’s can be selected in
$\binom{10}{2}$ ways. Once these two locations have been specified, there are $2^8$ ways to place either 0
or 2 in the other eight positions. Hence there are $\binom{10}{2} 2^8$ strings of even weight that contain
two 1’s. The numbers of strings for the other four cases are given in Table 1.2.
Table 1.2
Number of 1’s | Number of Strings | Number of 1’s | Number of Strings
4 | $\binom{10}{4} 2^6$ | 8 | $\binom{10}{8} 2^2$
6 | $\binom{10}{6} 2^4$ | 10 | $\binom{10}{10}$
Consequently, by the rule of sum, the number of strings of length 10 that have even
weight is
$$2^{10} + \binom{10}{2} 2^8 + \binom{10}{4} 2^6 + \binom{10}{6} 2^4 + \binom{10}{8} 2^2 + \binom{10}{10} = \sum_{n=0}^{5} \binom{10}{2n} 2^{10-2n}.$$
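Because there are only $3^{10}$ = 59,049 strings of length 10, the count in Example 1.24 can be verified by brute force. The sketch below compares a direct enumeration with the rule-of-sum formula; the variable names are illustrative.

```python
from itertools import product
from math import comb

n = 10
# Brute force: examine every string of length 10 over the alphabet {0, 1, 2}.
even_weight_brute = sum(1 for s in product((0, 1, 2), repeat=n) if sum(s) % 2 == 0)

# Rule of sum over the number of 1's (0, 2, 4, ..., 10), as in Table 1.2.
even_weight_formula = sum(comb(n, 2 * m) * 2 ** (n - 2 * m) for m in range(0, 6))
```

Both computations give 29,525 strings of even weight.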
Often we must be careful of overcounting —a situation that seems to arise in what
may appear to be rather easy enumeration problems. The next example demonstrates how
overcounting may come about.
EXAMPLE 1.25 a) Suppose that Ellen draws five cards from a standard deck of 52 cards. In how many
ways can her selection result in a hand with no clubs? Here we are interested in counting
all five-card selections such as
i) ace of hearts, three of spades, four of spades, six of diamonds, and the jack of
diamonds.
ii) five of spades, seven of spades, ten of spades, seven of diamonds, and the king of
diamonds.
iii) two of diamonds, three of diamonds, six of diamonds, ten of diamonds, and the
jack of diamonds.
If we examine this more closely we see that Ellen is restricted to selecting her five
cards from the 39 cards in the deck that are not clubs. Consequently, she can make her
selection in $\binom{39}{5}$ ways.
b) Now suppose we want to count the number of Ellen’s five-card selections that contain
at least one club. These are precisely the selections that were not counted in part (a).
And since there are $\binom{52}{5}$ possible five-card hands in total, we find that
$$\binom{52}{5} - \binom{39}{5} = 2{,}598{,}960 - 575{,}757 = 2{,}023{,}203$$
of all five-card hands contain at least one club.
c) Can we obtain the result in part (b) in another way? For example, since Ellen wants to
have at least one club in the five-card hand, let her first select a club. This she can do in
$\binom{13}{1}$ ways. And now she doesn’t care what comes up for the other four cards. So after
she eliminates the one club chosen from her standard deck, she can then select the
other four cards in $\binom{51}{4}$ ways. Therefore, by the rule of product, we count the number
of selections here as
$$\binom{13}{1}\binom{51}{4} = 13 \times 249{,}900 = 3{,}248{,}700.$$
Something here is definitely wrong! This answer is larger than that in part (b) by more
than one million hands. Did we make a mistake in part (b)? Or is something wrong
with our present reasoning?
For example, suppose that Ellen first selects
the three of clubs
and then selects
the five of clubs,
king of clubs,
seven of hearts, and
jack of spades.
If, however, she first selects
the five of clubs
and then selects
the three of clubs,
king of clubs,
seven of hearts, and
jack of spades,
is her selection here really different from the prior selection we mentioned? Unfortu-
nately, no! And the case where she first selects
the king of clubs
and then follows this by selecting
the three of clubs,
five of clubs,
seven of hearts, and
jack of spades
is not different from the other two selections mentioned earlier.
Consequently, this approach is wrong because we are overcounting, by considering
like selections as if they were distinct.
d) But is there any other way to arrive at the answer in part (b)? Yes! Since the five-card
hands must each contain at least one club, there are five cases to consider. These are
given in Table 1.3. From the results in Table 1.3 we see, for example, that there are
$\binom{13}{2}\binom{39}{3}$ five-card hands that contain exactly two clubs. If we are interested in having
exactly three clubs in the hand, then the results in the table indicate that there are
$\binom{13}{3}\binom{39}{2}$ such hands.
Table 1.3
Number of Clubs | Number of Ways to Select This Number of Clubs | Number of Cards That Are Not Clubs | Number of Ways to Select This Number of Nonclubs
1 | $\binom{13}{1}$ | 4 | $\binom{39}{4}$
2 | $\binom{13}{2}$ | 3 | $\binom{39}{3}$
3 | $\binom{13}{3}$ | 2 | $\binom{39}{2}$
4 | $\binom{13}{4}$ | 1 | $\binom{39}{1}$
5 | $\binom{13}{5}$ | 0 | $\binom{39}{0}$
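Since no two of the cases in Table 1.3 have any five-card hand in common, the number
of hands that Ellen can select with at least one club is
$$\binom{13}{1}\binom{39}{4} + \binom{13}{2}\binom{39}{3} + \binom{13}{3}\binom{39}{2} + \binom{13}{4}\binom{39}{1} + \binom{13}{5}\binom{39}{0} = \sum_{i=1}^{5} \binom{13}{i}\binom{39}{5-i}$$
$$= (13)(82{,}251) + (78)(9139) + (286)(741) + (715)(39) + (1287)(1) = 2{,}023{,}203.$$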
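The three counts in Example 1.25 can be compared numerically. The sketch below also quantifies the overcount in part (c): a hand with exactly i clubs is produced i times by the "pick a club first" procedure, once for each club that could have been the first one chosen. The variable names are illustrative.

```python
from math import comb

total_hands = comb(52, 5)
no_clubs = comb(39, 5)

by_complement = total_hands - no_clubs                               # part (b)
by_cases = sum(comb(13, i) * comb(39, 5 - i) for i in range(1, 6))   # part (d)
overcount = comb(13, 1) * comb(51, 4)                                # part (c)

# Weight each case of Table 1.3 by i, the number of times part (c) counts it.
weighted = sum(i * comb(13, i) * comb(39, 5 - i) for i in range(1, 6))
```

The complement and the case analysis agree at 2,023,203, while the weighted sum reproduces the erroneous 3,248,700 of part (c) exactly.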
We shall close this section with three results related to the concept of combinations.
First we note that for integers n, r, with n ≥ r ≥ 0, $\binom{n}{r} = \binom{n}{n-r}$. This can be established
algebraically from the formula for $\binom{n}{r}$, but we prefer to observe that when dealing with
a selection of size r from a collection of n distinct objects, the selection process leaves
behind n − r objects. Consequently, $\binom{n}{r} = \binom{n}{n-r}$ affirms the existence of a correspondence
between the selections of size r (objects chosen) and the selections of size n — r (objects
left behind). An example of this correspondence is shown in Table 1.4, where n = 5,r = 2,
and the distinct objects are 1, 2, 3, 4, and 5. This type of correspondence will be more
formally defined in Chapter 5 and used in other counting situations.
Table 1.4
Selections of Size r = 2 (Objects Chosen) | Selections of Size n − r = 3 (Objects Left Behind)
1. 1, 2 | 1. 3, 4, 5
2. 1, 3 | 2. 2, 4, 5
3. 1, 4 | 3. 2, 3, 5
4. 1, 5 | 4. 2, 3, 4
5. 2, 3 | 5. 1, 4, 5
6. 2, 4 | 6. 1, 3, 5
7. 2, 5 | 7. 1, 3, 4
8. 3, 4 | 8. 1, 2, 5
9. 3, 5 | 9. 1, 2, 4
10. 4, 5 | 10. 1, 2, 3
Our second result is a theorem from our past experience in algebra.
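THEOREM 1.1 The Binomial Theorem. If x and y are variables and n is a positive integer, then
$$(x + y)^n = \binom{n}{0} x^0 y^n + \binom{n}{1} x^1 y^{n-1} + \binom{n}{2} x^2 y^{n-2} + \cdots + \binom{n}{n-1} x^{n-1} y^1 + \binom{n}{n} x^n y^0 = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}.$$
Before considering the general proof, we examine a special case. If n = 4, the coefficient
of $x^2 y^2$ in the expansion of the product
$$\underbrace{(x + y)}_{\text{1st factor}}\ \underbrace{(x + y)}_{\text{2nd factor}}\ \underbrace{(x + y)}_{\text{3rd factor}}\ \underbrace{(x + y)}_{\text{4th factor}}$$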
is the number of ways in which we can select two x’s from the four x’s, one of which is
available in each factor. (Although the x’s are the same in appearance, we distinguish them
as the x in the first factor, the x in the second factor, ... , and the x in the fourth factor.
Also, we note that when we select two x’s, we use two factors, leaving us with two other
factors from which we can select the two y’s that are needed.) For example, among the
possibilities, we can select (1) x from the first two factors and y from the last two or (2) x
from the first and third factors and y from the second and fourth. Table 1.5 summarizes the
six possible selections.
Table 1.5
Factors Selected for x Factors Selected for y
(1) 1,2 (1) 3,4
(2) 1,3 (2) 2,4
(3) 1,4 (3) 2,3
(4) 2,3 (4) 1,4
(5) 2,4 (5) 1,3
(6) 3,4 (6) 1,2
Consequently, the coefficient of $x^2 y^2$ in the expansion of $(x + y)^4$ is $\binom{4}{2}$ = 6, the number
of ways to select two distinct objects from a collection of four distinct objects.
Now we turn to the proof of the general case.
Proof: In the expansion of the product
$$\underbrace{(x + y)}_{\text{1st factor}}\ \underbrace{(x + y)}_{\text{2nd factor}}\ \underbrace{(x + y)}_{\text{3rd factor}} \cdots \underbrace{(x + y)}_{n\text{th factor}}$$
the coefficient of $x^k y^{n-k}$, where 0 ≤ k ≤ n, is the number of different ways in which we
can select k x’s [and consequently (n − k) y’s] from the n available factors. (One way, for
example, is to choose x from the first k factors and y from the last n − k factors.) The total
number of such selections of size k from a collection of size n is C(n, k) = $\binom{n}{k}$, and from
this the binomial theorem follows.
In view of this theorem, $\binom{n}{k}$ is often referred to as a binomial coefficient. Notice that it
is also possible to express the result of Theorem 1.1 as
$$(x + y)^n = \sum_{k=0}^{n} \binom{n}{n-k} x^k y^{n-k}.$$
EXAMPLE 1.26
a) From the binomial theorem it follows that the coefficient of $x^5 y^2$ in the expansion of
$(x + y)^7$ is $\binom{7}{5} = \binom{7}{2} = 21$.
b) To obtain the coefficient of $a^5 b^2$ in the expansion of $(2a - 3b)^7$, replace 2a by x and
−3b by y. From the binomial theorem the coefficient of $x^5 y^2$ in $(x + y)^7$ is $\binom{7}{5}$, and
$\binom{7}{5} x^5 y^2 = \binom{7}{5} (2a)^5 (-3b)^2 = \binom{7}{5} (2)^5 (-3)^2 a^5 b^2 = 6048\, a^5 b^2$.
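The factor-by-factor selection used in the proof of the binomial theorem can be imitated in code. The sketch below expands $(ax + by)^n$ by multiplying in one factor at a time, where each factor contributes either its x-term or its y-term. The function name `expand_binomial` and its list-based representation are assumptions made for this illustration.

```python
from math import comb

def expand_binomial(a, b, n):
    """Coefficients of (a*x + b*y)**n by repeated multiplication.

    Returns a list c where c[k] is the coefficient of x**k * y**(n - k).
    """
    coeffs = [1]                       # the constant polynomial 1
    for _ in range(n):
        nxt = [0] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            nxt[k] += c * b            # take the y-term from this factor
            nxt[k + 1] += c * a        # take the x-term from this factor
        coeffs = nxt
    return coeffs

part_a = expand_binomial(1, 1, 7)[5]       # coefficient of x^5 y^2 in (x + y)^7
part_b = expand_binomial(2, -3, 7)[5]      # coefficient of a^5 b^2 in (2a - 3b)^7
```

With a = b = 1 the list returned is exactly the row of binomial coefficients $\binom{7}{0}, \ldots, \binom{7}{7}$; the two indexed values are 21 and 6048, as in the example.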
COROLLARY 1.1 For each integer n > 0,
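a) $\binom{n}{0} + \binom{n}{1} + \binom{n}{2} + \cdots + \binom{n}{n} = 2^n$, and
b) $\binom{n}{0} - \binom{n}{1} + \binom{n}{2} - \cdots + (-1)^n \binom{n}{n} = 0$.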
Proof: Part (a) follows from the binomial theorem when we set x = y = 1. When x = —1
and y = 1, part (b) results.
Our third and final result generalizes the binomial theorem and is called the multinomial
theorem.
THEOREM 1.2
For positive integers n, t, the coefficient of $x_1^{n_1} x_2^{n_2} x_3^{n_3} \cdots x_t^{n_t}$ in the expansion of
$(x_1 + x_2 + x_3 + \cdots + x_t)^n$ is
$$\frac{n!}{n_1!\, n_2!\, n_3! \cdots n_t!},$$
where each $n_i$ is an integer with $0 \le n_i \le n$, for all $1 \le i \le t$, and $n_1 + n_2 + n_3 + \cdots + n_t = n$.
Proof: As in the proof of the binomial theorem, the coefficient of $x_1^{n_1} x_2^{n_2} x_3^{n_3} \cdots x_t^{n_t}$ is the
number of ways we can select $x_1$ from $n_1$ of the n factors, $x_2$ from $n_2$ of the $n - n_1$ remaining
factors, $x_3$ from $n_3$ of the $n - n_1 - n_2$ now remaining factors, ..., and $x_t$ from $n_t$ of the
last $n - n_1 - n_2 - n_3 - \cdots - n_{t-1} = n_t$ remaining factors. This can be carried out, as in
part (a) of Example 1.22, in
$$\binom{n}{n_1}\binom{n - n_1}{n_2}\binom{n - n_1 - n_2}{n_3} \cdots \binom{n - n_1 - n_2 - \cdots - n_{t-1}}{n_t}$$
ways. We leave to the reader the details of showing that this product is equal to
$$\frac{n!}{n_1!\, n_2!\, n_3! \cdots n_t!},$$
which is also written as
$$\binom{n}{n_1, n_2, n_3, \ldots, n_t}$$
and is called a multinomial coefficient. (When t = 2 this reduces to a binomial coefficient.)
EXAMPLE 1.27
a) In the expansion of $(x + y + z)^7$ it follows from the multinomial theorem that the
coefficient of $x^2 y^2 z^3$ is $\binom{7}{2,2,3} = \frac{7!}{2!\,2!\,3!} = 210$, while the coefficient of $x y z^5$ is $\binom{7}{1,1,5} = 42$
and that of $x^3 z^4$ is $\binom{7}{3,0,4} = \frac{7!}{3!\,0!\,4!} = 35$.
b) Suppose we need to know the coefficient of $a^2 b^3 c^2 d^5$ in the expansion of
$(a + 2b - 3c + 2d + 5)^{16}$. If we replace a by v, 2b by w, −3c by x, 2d by y, and
5 by z, then we can apply the multinomial theorem to $(v + w + x + y + z)^{16}$
and determine the coefficient of $v^2 w^3 x^2 y^5 z^4$ as $\binom{16}{2,3,2,5,4}$ = 302,702,400. But
$$\binom{16}{2,3,2,5,4} (a)^2 (2b)^3 (-3c)^2 (2d)^5 (5)^4 = \binom{16}{2,3,2,5,4} (1)^2 (2)^3 (-3)^2 (2)^5 (5)^4 \,(a^2 b^3 c^2 d^5) = 435{,}891{,}456{,}000{,}000\, a^2 b^3 c^2 d^5.$$
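The multinomial coefficients of Example 1.27 are easy to compute exactly. The function `multinomial` below is a small helper written for this illustration (it is not a standard-library function); it divides n! by each factorial in turn, which keeps every intermediate value an integer.

```python
from math import factorial

def multinomial(n, parts):
    """n! / (n_1! n_2! ... n_t!) for nonnegative integers parts summing to n."""
    if sum(parts) != n or any(p < 0 for p in parts):
        raise ValueError("parts must be nonnegative and sum to n")
    result = factorial(n)
    for p in parts:
        result //= factorial(p)
    return result

# Part (a): (x + y + z)^7
c_x2y2z3 = multinomial(7, (2, 2, 3))
c_xyz5 = multinomial(7, (1, 1, 5))
c_x3z4 = multinomial(7, (3, 0, 4))

# Part (b): (a + 2b - 3c + 2d + 5)^16, term a^2 b^3 c^2 d^5
base = multinomial(16, (2, 3, 2, 5, 4))
coefficient = base * 1**2 * 2**3 * (-3)**2 * 2**5 * 5**4
```

The values agree with the example: 210, 42, and 35 in part (a), and 302,702,400 scaled up to 435,891,456,000,000 in part (b).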
EXERCISES 1.3
6. If n is a positive integer and n > 1, prove that $\binom{n}{2} + \binom{n-1}{2}$ is a perfect square.
7. A committee of 12 is to be selected from 10 men and 10
1. Calculate (5) and check your answer by listing all the se-
women. In how many ways can the selection be carried out if
lections of size 2 that can be made from the letters a, b, c, d, e,
(a) there are no restrictions? (b) there must be six men and six
and f.
women? (c) there must be an even number of women? (d) there
2. Facing a four-hour bus trip back to college, Diane decides to must be more women than men? (e) there must be at least eight
take along five magazines from the 12 that her sister Ann Marie men?
has recently acquired. In how many ways can Diane make her 8. In how many ways can a gambler draw five cards from a
selection? standard deck and get (a) a flush (five cards of the same suit)?
(b) four aces? (c) four of a kind? (d) three aces and two jacks?
3. Evaluate each of the following. (e) three aces and a pair? (f) a full house (three of a kind and a
a) C(10,4) —b) (7) ~—e) C14, 12) _— dd) (18) pair)? (g) three of a kind? (h) two pairs?
4. In the Braille system a symbol, such as a lowercase letter, 9. How many bytes contain (a) exactly two 1’s; (b) exactly
punctuation mark, suffix, and so on, is given by raising at least four 1’s; (c) exactly six 1’s; (d) at least six 1’s?
one of the dots in the six-dot arrangement shown in part (a) of 10. How many ways are there to pick a five-person basketball
Fig. 1.7. (The six Braille positions are labeled in this part of team from 12 possible players? How many selections include
the figure.) For example, in part (b) of the figure the dots in the weakest and the strongest players?
positions | and 4 are raised and this six-dot arrangement repre-
11. Astudent is to answer seven out of 10 questions on an exam-
sents the letter c. In parts (c) and (d) of the figure we have the
ination. In how many ways can he make his selection if (a) there
representations for the letters m and t, respectively. The definite
are no restrictions? (b) he must answer the first two questions?
article “the” is shown in part (e) of the figure, while part (f)
(c) he must answer at least four of the first six questions?
contains the form for the suffix “ow.” Finally, the semicolon,
;, is given by the six-dot arrangement in part (g), where the dots 12. In how many ways can 12 different books be distributed
at positions 2 and 3 are raised. among four children so that (a) each child gets three books?
(b) the two oldest children get four books each and the two
youngest get two books each?
1° °4 e @ e @ - ©@ 13. How many arrangements of the letters in MISSISSIPPI
have no consecutive S’s?
2 e e 5 ° ° . ° @ e
14, A gym coach must select 11 seniors to play on a football
3 ° . 6 e ° @ ° @ °
team. If he can make his selection in 12,376 ways, how many
seniors are eligible to play?
15. a) Fifteen points, no three of which are collinear, are given
(a) (b) "c () "m (d) "t on a plane. How many lines do they determine?
- @ + @ . ° b) Twenty-five points, no four of which are coplanar, are
given in space. How many triangles do they determine?
@ - e@ - e - How many planes? How many tetrahedra (pyramidlike
solids with four triangular faces)?
e.6.°8 - @ @ -:
16. Determine the value of each of the following summations.
6 2 10
(e) “the” j(f) “ow” |(g) a WH) db) VP -D O DU +CDI
:=] j=n-2 1=0
Figure 1.7 2n
d) Yor 1)‘, where n is an odd positive integer
a) How many different symbols can we represent in the k=n
Braille system?
b) How many symbols have exactly three raised dots? e) So i(-1
1=1]
c) How many symbols have an even number of raised dots? 17. Express each of the following using the summation (or
5. a) How many permutations of size 3 can one produce with Sigma) notation. In parts (a), (d), and (e), 2 denotes a positive
the letters m, r, a, f, and t? integer,
b) List all the combinations of size 3 that result for the l | l 1
Matatgatots: n>2
letters m, r, a, f, and t.
b) 1+44+94
164+ 25+ 364+ 49 21. How many triangles are determined by the vertices of a
ce) F-24337 -445-647 regular polygon of n sides? How many if no side of the polygon
1 2 3 n+1 is to be a side of any triangle?
d) —-+ —— + —— +...
sata tae 2n 22. a) In the complete expansion of (a+b+e+4+d)-
n+l n+2 n+3 (e+ f+tgtA\(utvu+w+x«+y+z) one obtains the
on (“')+( rt )-("e)+ sum of terms such as agw, cfx, and dgv. How many such
terms appear in this complete expansion?
+o (Sa)
2n
b) Which of the following terms do not appear in the com-
plete expansion from part (a)?
18. For the strings of length 10 in Example 1.24, how many
i) afx li) bux iii) chz
have (a) four 0’s, three 1’s, and three 2’s; (b) at least eight 1’s;
iv) egw v) egu vi) dfz
(c) weight 4?
23. Determine the coefficient of x°y* in the expansions of
19, Consider the collection of all strings of length 10 made up (a) (x + y)"*, (b) ( + 2y)”, and (c) (2x — 3y)!*.
from the alphabet 0, 1, 2, and 3. How many of these strings 24. Complete the details in the proof of the multinomial
have weight 3? How many have weight 4? How many have theorem.
even weight?
25. Determine the coefficient of
20. In the three parts of Fig. 1.8, eight points are equally spaced a) xyz7in(x+y+z)4
and marked on the circumference of a given circle. b) xyz? in(w+x+y+2z)4
c) xyz” in (2x ~— y — z)*
d) xyz? in (x — 2y + 3z7')*
e) wx? yz? in (2w — x + 3y — 2z)8
26. Find the coefficient of w7x?y*z? in the expansion of
(a) (wtetytz4t1)", (b) Qw—x+3y+z2—2)", and
(c)(v+w—2x+y+5z24+3).
27. Determine the sum of all the coefficients in the expan-
sions of
a) (x + y)? b) (x + y)"° c) (x t+y+4+z)'°
d) (wtx+y+4+z)
e) (25 — 34+ 5u + 6v — llw + 3x + 2y)!°
28. For any positive integer n determine
na 1 fn (~ 1)'
a —___— b —_——.
) » il(n — i)! ) » it(n — i)!
2") =me0(2)
29. Show that for all positive integers m and n,
m+n m+n
Figure 1.8
()eaC)eaC)eove2Q)onr2()
30. With 1 a positive integer, evaluate the sum
a) For parts (a) and (b) of Fig. 1.8 we have two different
(though congruent) triangles. These two triangles (distin-
guished by their vertices) result from two selections of size
31. For x areal number and » a positive integer, show that
3 from the vertices A, B, C, D, E, F, G, H. How many dif-
ferent (whether congruent or not) triangles can we inscribe a) ~
l=(1+x)" n A
— (1) 1
(1 + x) a~t
in the circle in this way?
4 (5)2u + xy? eee (—1)"("x
b) How many different quadrilaterals can we inscribe in the
2 n
circle, using the marked vertices? [One such quadrilateral
appears in part (c) of Fig. 1.8.] b) 1 =(24x)"— (‘es +)Q+x)""!
c) How many different polygons of three or more sides can
we inscribe in the given circle by using three or more of the + (5) (e+ IP Q+xy? e+ cir("Yos +1)"
marked vertices?
c) = (24+.x)" _ (7)x10 4x)! b) Given a list— ao, a|, 42,..., a, — of n+1. real
1 numbers, where n is a positive integer, determine
+ (5)xe i cay (")x" wG@ -4,.1).
2 Rr
c) Determine the value of )°!% (4, - -4).
32. . Determine x if }7>°,OG?(°)8' = x! , 34. a) Write a computer program (or eT develop an algorithm)
33. oe a, &, 4, a 1s a list of four real numbers, what is that lists all selections of size 2 from the objects 1, 2, 3, 4,
p= ~ G1)? 5, 6.
b) Repeat part (a) for selections of size 3.
1.4
Combinations with Repetition
When repetitions are allowed, we have seen that for n distinct objects an arrangement of
size r of these objects can be obtained in n^r ways, for an integer r ≥ 0. We now turn to
the comparable problem for combinations and once again obtain a related problem whose
solution follows from our previous enumeration principles.
EXAMPLE 1.28
On their way home from track practice, seven high school freshmen stop at a restaurant,
where each of them has one of the following: a cheeseburger, a hot dog, a taco, or a fish
sandwich. How many different purchases are possible (from the viewpoint of the restaurant)?
Let c, h, t, and f represent cheeseburger, hot dog, taco, and fish sandwich, respectively.
Here we are concerned with how many of each item are purchased, not with the order
in which they are purchased, so the problem is one of selections, or combinations, with
repetition.
In Table 1.6 we list some possible purchases in column (a) and another means of repre-
senting each purchase in column (b).
Table 1.6
      (a)                          (b)
1.    c, c, h, h, t, t, f          x x | x x | x x | x
2.    c, c, c, c, h, t, f          x x x x | x | x | x
3.    c, c, c, c, c, c, f          x x x x x x | | | x
4.    h, t, t, f, f, f, f          | x | x x | x x x x
5.    t, t, t, t, t, f, f          | | x x x x x | x x
6.    t, t, t, t, t, t, t          | | x x x x x x x |
7.    f, f, f, f, f, f, f          | | | x x x x x x x
For a purchase in column (b) of Table 1.6 we realize that each x to the left of the first bar
(|) represents a c, each x between the first and second bars represents an h, the x’s between
the second and third bars stand for t’s, and each x to the right of the third bar stands for
an f. The third purchase, for example, has three consecutive bars because no one bought
a hot dog or taco; the bar at the start of the fourth purchase indicates that there were no
cheeseburgers in that purchase.
Once again a correspondence has been established between two collections of objects,
where we know how to count the number in one collection. For the representations in
column (b) of Table 1.6, we are enumerating all arrangements of 10 symbols consisting
of seven x’s and three |’s, so by our correspondence the number of different purchases for
column (a) is
$$\frac{10!}{7!\,3!} = \binom{10}{7}.$$
In this example we note that the seven x’s (one for each freshman) correspond to the size
of the selection and that the three bars are needed to separate the 3 + 1 = 4 possible food
items that can be chosen.
When we wish to select, with repetition, r of n distinct objects, we find (as in Table 1.6)
that we are considering all arrangements of r x’s and n - 1 |’s and that their number is
$$\frac{(n + r - 1)!}{r!\,(n - 1)!} = \binom{n + r - 1}{r}.$$
Consequently, the number of combinations of n objects taken r at a time, with repetition,
is C(n + r - 1, r).
(In Example 1.28, n = 4, r = 7, so it is possible for r to exceed n when repetitions are
allowed.)
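The correspondence between selections with repetition and arrangements of x’s and bars can be checked by direct enumeration. The following Python sketch is an illustration only; the names `count_selections` and `to_bars` are hypothetical helpers introduced here, not notation from the text.

```python
from itertools import combinations_with_replacement
from math import comb


def count_selections(n, r):
    """Count selections of size r, with repetition, from n distinct objects by listing them."""
    return sum(1 for _ in combinations_with_replacement(range(n), r))


def to_bars(selection, kinds):
    """Encode a selection as x's separated by len(kinds) - 1 bars, as in column (b) of Table 1.6."""
    return "|".join("x" * selection.count(kind) for kind in kinds)


if __name__ == "__main__":
    # Example 1.28: n = 4 food items, r = 7 freshmen.
    print(count_selections(4, 7), comb(4 + 7 - 1, 7))
    # First purchase of Table 1.6: c, c, h, h, t, t, f.
    print(to_bars("cchhttf", "chtf"))
```

Listing the selections is practical only for small n and r; the closed form C(n + r - 1, r) is what one would use for a case such as n = 20, r = 12.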
EXAMPLE 1.29
A donut shop offers 20 kinds of donuts. Assuming that there are at least a dozen of each kind
when we enter the shop, we can select a dozen donuts in C(20 + 12 - 1, 12) = C(31, 12) =
141,120,525 ways. (Here n = 20, r = 12.)
EXAMPLE 1.30
President Helen has four vice presidents: (1) Betty, (2) Goldie, (3) Mary Lou, and (4) Mona.
She wishes to distribute among them $1000 in Christmas bonus checks, where each check
will be written for a multiple of $100.
a) Allowing the situation in which one or more of the vice presidents get nothing,
President Helen is making a selection of size 10 (one for each unit of $100) from
a collection of size 4 (four vice presidents), with repetition. This can be done in
C(4+ 10 — 1, 10) = C(13, 10) = 286 ways.
b) If there are to be no hard feelings, each vice president should receive at least $100. With
this restriction, President Helen is now faced with making a selection of size 6 (the
remaining six units of $100) from the same collection of size 4, and the choices now
number C(4 + 6 — 1, 6) = C(9, 6) = 84. [For example, here the selection 2, 3, 3, 4,
4, 4 is interpreted as follows: Betty does not get anything extra, for there is no 1 in
the selection. The one 2 in the selection indicates that Goldie gets an additional $100.
Mary Lou receives an additional $200 ($100 for each of the two 3’s in the selection).
Due to the three 4’s, Mona’s bonus check will total $100 + 3($100) = $400.]
c) If each vice president must get at least $100 and Mona, as executive vice president,
gets at least $500, then the number of ways President Helen can distribute the bonus
checks is
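$$\underbrace{C(3+2-1,\,2)}_{\text{Mona gets exactly \$500}} + \underbrace{C(3+1-1,\,1)}_{\text{Mona gets exactly \$600}} + \underbrace{C(3+0-1,\,0)}_{\text{Mona gets exactly \$700}} = 10 = \underbrace{C(4+2-1,\,2)}_{\text{Using the technique in part (b)}}.$$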
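The three counts in Example 1.30 can be confirmed by brute force. The sketch below is only an illustration: it measures each check in units of $100 and tries every possible 4-tuple of amounts. The function name `bonus_distributions` and its parameters are assumptions made for this example.

```python
from itertools import product


def bonus_distributions(total_units, minimums):
    """Count ways to split total_units among len(minimums) people,
    where person k must receive at least minimums[k] units."""
    people = len(minimums)
    count = 0
    for amounts in product(range(total_units + 1), repeat=people):
        if sum(amounts) == total_units and all(
            amount >= low for amount, low in zip(amounts, minimums)
        ):
            count += 1
    return count


if __name__ == "__main__":
    print(bonus_distributions(10, (0, 0, 0, 0)))  # part (a)
    print(bonus_distributions(10, (1, 1, 1, 1)))  # part (b)
    print(bonus_distributions(10, (1, 1, 1, 5)))  # part (c): Mona is listed last
```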
Having worked examples utilizing combinations with repetition, we now consider two
examples involving other counting principles as well.
EXAMPLE 1.31
In how many ways can we distribute seven bananas and six oranges among four children
so that each child receives at least one banana?
After giving each child one banana, consider the number of ways the remaining three
bananas can be distributed among these four children. Table 1.7 shows four of the
distributions we are considering here. For example, the second distribution in part (a) of
Table 1.7 (namely, 1, 3, 3) indicates that we have given the first child (designated by 1)
one additional banana and the third child (designated by 3) two additional bananas. The
corresponding arrangement in part (b) of Table 1.7 represents this distribution in terms of
three b’s and three bars. These six symbols, three of one type (the b’s) and three others of a
second type (the bars), can be arranged in 6!/(3! 3!) = C(6, 3) = C(4 + 3 - 1, 3) = 20
ways. [Here n = 4, r = 3.] Consequently, there are 20 ways in which we can distribute
the three additional bananas among these four children. Table 1.8 provides the comparable
situation for distributing the six oranges. In this case we are arranging nine symbols,
six of one type (the o’s) and three of a second type (the bars). So now we learn
that the number of ways we can distribute the six oranges among these four children is
9!/(6! 3!) = C(9, 6) = C(4 + 6 - 1, 6) = 84. [Here n = 4, r = 6.] Therefore, by the
rule of product, there are 20 × 84 = 1680 ways to distribute the fruit under the stated
conditions.
Table 1.7
      (a)            (b)
1)    1, 2, 3        b|b|b|
2)    1, 3, 3        b||bb|
3)    3, 4, 4        ||b|bb
4)    4, 4, 4        |||bbb

Table 1.8
      (a)                     (b)
1)    1, 2, 2, 3, 3, 4        o|oo|oo|o
2)    1, 2, 2, 4, 4, 4        o|oo||ooo
3)    2, 2, 2, 3, 3, 3        |ooo|ooo|
4)    4, 4, 4, 4, 4, 4        |||oooooo
EXAMPLE 1.32
A message is made up of 12 different symbols and is to be transmitted through a
communication channel. In addition to the 12 symbols, the transmitter will also send a total
of 45 (blank) spaces between the symbols, with at least three spaces between each pair of
consecutive symbols. In how many ways can the transmitter send such a message?
There are 12! ways to arrange the 12 different symbols, and for each of these arrangements
there are 11 positions between the 12 symbols. Because there must be at least three spaces
between successive symbols, we use up 33 of the 45 spaces and must now locate the
remaining 12 spaces. This is now a selection, with repetition, of size 12 (the spaces) from a
collection of size 11 (the locations), and this can be accomplished in C(11 + 12 - 1, 12) =
C(22, 12) = 646,646 ways.
Consequently, by the rule of product the transmitter can send such messages with the
required spacing in (12!) C(22, 12) ≈ 3.097 × 10^14 ways.
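The arithmetic in Example 1.32 can be reproduced with a few lines of Python. This is a minimal sketch that assumes at least two symbols; the function name `message_count` and its parameter names are hypothetical.

```python
from math import comb, factorial


def message_count(symbols, spaces, min_gap):
    """Count messages of `symbols` distinct symbols (at least two) with `spaces`
    blanks placed between them, at least `min_gap` blanks in each gap."""
    gaps = symbols - 1                 # positions between consecutive symbols
    extra = spaces - min_gap * gaps    # blanks left after the required ones
    if extra < 0:
        return 0
    # Orderings of the symbols times selections, with repetition, of size
    # `extra` from the `gaps` locations.
    return factorial(symbols) * comb(gaps + extra - 1, extra)


if __name__ == "__main__":
    total = message_count(12, 45, 3)
    print(total, f"{total:.3e}")
```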
In the next example an idea is introduced that appears to have more to do with number
theory than with combinations or arrangements. Nonetheless, the solution of this example
will turn out to be equivalent to counting combinations with repetitions.
EXAMPLE 1.33
Determine all integer solutions to the equation
$$x_1 + x_2 + x_3 + x_4 = 7, \quad \text{where } x_i \ge 0 \text{ for all } 1 \le i \le 4.$$
One solution of the equation is $x_1 = 3$, $x_2 = 3$, $x_3 = 0$, $x_4 = 1$. (This is different from a
solution such as $x_1 = 1$, $x_2 = 0$, $x_3 = 3$, $x_4 = 3$, even though the same four integers are being
used.) A possible interpretation for the solution $x_1 = 3$, $x_2 = 3$, $x_3 = 0$, $x_4 = 1$ is that we are
distributing seven pennies (identical objects) among four children (distinct containers), and
here we have given three pennies to each of the first two children, nothing to the third child,
and the last penny to the fourth child. Continuing with this interpretation, we see that each
nonnegative integer solution of the equation corresponds to a selection, with repetition, of
size 7 (the identical pennies) from a collection of size 4 (the distinct children), so there are
C(4 + 7 - 1, 7) = 120 solutions.
At this point it is crucial that we recognize the equivalence of the following:
a) The number of integer solutions of the equation
$$x_1 + x_2 + \cdots + x_n = r, \quad x_i \ge 0, \quad 1 \le i \le n.$$
b) The number of selections, with repetition, of size r from a collection of size n.
c) The number of ways r identical objects can be distributed among n distinct
containers.
In terms of distributions, part (c) is valid only when the r objects being distributed are
identical and the n containers are distinct. When both the r objects and the n containers
are distinct, we can select any of the n containers for each one of the objects and get n^r
distributions by the rule of product.
When the objects are distinct but the containers are identical, we shall solve the problem
using the Stirling numbers of the second kind (Chapter 5). For the final case, in which both
objects and containers are identical, the theory of partitions of integers (Chapter 9) will
provide some necessary results.
EXAMPLE 1.34
In how many ways can one distribute 10 (identical) white marbles among six distinct
containers?
Solving this problem is equivalent to finding the number of nonnegative integer solutions
to the equation $x_1 + x_2 + \cdots + x_6 = 10$. That number is the number of selections of size 10,
with repetition, from a collection of size 6. Hence the answer is C(6 + 10 - 1, 10) = 3003.
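The equivalence between nonnegative integer solutions and selections with repetition can be illustrated by generating the solutions themselves. The recursive generator below is a sketch under that reading; `solutions` is a hypothetical name, and the approach is plain enumeration rather than anything prescribed by the text.

```python
from math import comb


def solutions(n, r):
    """Yield every tuple (x_1, ..., x_n) of nonnegative integers with x_1 + ... + x_n = r."""
    if n == 1:
        yield (r,)
        return
    for first in range(r + 1):
        # Fix x_1 = first, then solve the smaller equation for the remaining variables.
        for rest in solutions(n - 1, r - first):
            yield (first,) + rest


if __name__ == "__main__":
    pennies = list(solutions(4, 7))       # Example 1.33
    print(len(pennies), comb(4 + 7 - 1, 7))
    marbles = sum(1 for _ in solutions(6, 10))  # Example 1.34
    print(marbles, comb(6 + 10 - 1, 10))
```

Each tuple produced is one distribution of r identical objects among n distinct containers, so the tuples (3, 3, 0, 1) and (1, 0, 3, 3) are counted separately.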
We now examine two other examples related to the theme of this section.
EXAMPLE 1.35
From Example 1.34 we know that there are 3003 nonnegative integer solutions to the
equation $x_1 + x_2 + \cdots + x_6 = 10$. How many such solutions are there to the inequality
$x_1 + x_2 + \cdots + x_6 < 10$?
One approach that may seem feasible in dealing with this inequality is to determine
the number of such solutions to $x_1 + x_2 + \cdots + x_6 = k$, where k is an integer and
0 ≤ k ≤ 9. Although feasible now, the technique becomes unrealistic if 10 is replaced by a
somewhat larger number, say 100. In Example 3.12 of Chapter 3, however, we shall establish
a combinatorial identity that will help us obtain an alternative solution to the problem
by using this approach.
For the present we transform the problem by noting the correspondence between the
nonnegative integer solutions of
$$x_1 + x_2 + \cdots + x_6 < 10 \qquad (1)$$
and the integer solutions of
$$x_1 + x_2 + \cdots + x_6 + x_7 = 10, \quad 0 \le x_i, \ 1 \le i \le 6, \quad 0 < x_7. \qquad (2)$$
The number of solutions of Eq. (2) is the same as the number of nonnegative integer
solutions of $y_1 + y_2 + \cdots + y_6 + y_7 = 9$, where $y_i = x_i$ for 1 ≤ i ≤ 6, and $y_7 = x_7 - 1$.
This is C(7 + 9 - 1, 9) = 5005.
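The two approaches described in Example 1.35 (summing over k = 0, ..., 9 and introducing the extra variable x_7) can be compared numerically. The sketch below is illustrative; the three function names are hypothetical, and the brute-force check is meant only for small inputs.

```python
from itertools import product
from math import comb


def count_by_cases(variables, bound):
    """Add the counts for x_1 + ... + x_variables = k over k = 0, ..., bound - 1."""
    return sum(comb(variables + k - 1, k) for k in range(bound))


def count_with_slack(variables, bound):
    """Count via one extra positive variable: y_1 + ... + y_(variables + 1) = bound - 1."""
    return comb((variables + 1) + (bound - 1) - 1, bound - 1)


def count_brute_force(variables, bound):
    """Directly test every candidate tuple (small inputs only)."""
    return sum(1 for xs in product(range(bound), repeat=variables) if sum(xs) < bound)


if __name__ == "__main__":
    print(count_by_cases(6, 10), count_with_slack(6, 10))
    print(count_brute_force(3, 5), count_with_slack(3, 5))
```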
Our next result takes us back to the binomial and multinomial expansions.
EXAMPLE 1.36
In the binomial expansion for $(x + y)^n$, each term is of the form $\binom{n}{k} x^k y^{n-k}$, so the total
number of terms in the expansion is the number of nonnegative integer solutions of
$n_1 + n_2 = n$ ($n_1$ is the exponent for x, $n_2$ the exponent for y). This number is
C(2 + n - 1, n) = n + 1.
Perhaps it seems that we have used a rather long-winded argument to get this result.
Many of us would probably be willing to believe the result on the basis of our experiences
in expanding $(x + y)^n$ for various small values of n.
Although experience is worthwhile in pattern recognition, it is not always enough to find
a general principle. Here it would prove of little value if we wanted to know how many
terms there are in the expansion of $(w + x + y + z)^{10}$.
Each distinct term here is of the form $\binom{10}{n_1, n_2, n_3, n_4} w^{n_1} x^{n_2} y^{n_3} z^{n_4}$, where $0 \le n_i$ for
1 ≤ i ≤ 4, and $n_1 + n_2 + n_3 + n_4 = 10$. This last equation can be solved in
C(4 + 10 - 1, 10) = 286 ways, so there are 286 terms in the expansion of $(w + x + y + z)^{10}$.
And now once again the binomial expansion will come into play, as we find ourselves
using part (a) of Corollary 1.1.
EXAMPLE 1.37
a) Let us determine all the different ways in which we can write the number 4 as a sum
of positive integers, where the order of the summands is considered relevant. These
representations are called the compositions of 4 and may be listed as follows:
1) 4              5) 2 + 1 + 1
2) 3 + 1          6) 1 + 2 + 1
3) 1 + 3          7) 1 + 1 + 2
4) 2 + 2          8) 1 + 1 + 1 + 1
Here we include the sum consisting of only one summand — namely, 4. We find that
for the number 4 there are eight compositions in total. (If we do not care about the order
of the summands, then the representations in (2) and (3) are no longer considered to be
different — nor are the representations in (5), (6), and (7). Under these circumstances
we find that there are five partitions for the number 4— namely, 4; 3 + 1; 2 +2;
2+1+1;and1+1+1 +1. We shall learn more about partitions of positive integers
in Section 9.3.)
b) Now suppose that we wish to count the number of compositions for the number 7.
Here we do not want to list all of the possibilities, which include 7; 6 + 1; 1 + 6;
5 + 2; 1 + 2 + 4; 2 + 4 + 1; and 3 + 1 + 2 + 1. To count all of these compositions,
let us consider the number of possible summands.
i) For one summand there is only one composition, namely 7.
ii) If there are two (positive) summands, we want to count the number of integer
solutions for
$$w_1 + w_2 = 7, \quad \text{where } w_1, w_2 > 0.$$
This is equal to the number of integer solutions for
$$x_1 + x_2 = 5, \quad \text{where } x_1, x_2 \ge 0.$$
The number of such solutions is $\binom{2+5-1}{5} = \binom{6}{5}$.
iii) Continuing with our next case, we examine the compositions with three (positive)
summands. So now we want to count the number of positive integer solutions for
$$y_1 + y_2 + y_3 = 7.$$
This is equal to the number of nonnegative integer solutions for
$$z_1 + z_2 + z_3 = 4,$$
and that number is $\binom{3+4-1}{4} = \binom{6}{4}$.
We summarize cases (i), (ii), and (iii), and the other four cases in Table 1.9, where we
recall for case (i) that $1 = \binom{6}{6}$.
Table 1.9
n = The Number of Summands        The Number of Compositions
in a Composition of 7             of 7 with n Summands
(i)    n = 1                      (i)    C(6, 6)
(ii)   n = 2                      (ii)   C(6, 5)
(iii)  n = 3                      (iii)  C(6, 4)
(iv)   n = 4                      (iv)   C(6, 3)
(v)    n = 5                      (v)    C(6, 2)
(vi)   n = 6                      (vi)   C(6, 1)
(vii)  n = 7                      (vii)  C(6, 0)
Consequently, the results from the right-hand side of our table tell us that the (total)
number of compositions of 7 is
$$\binom{6}{6} + \binom{6}{5} + \binom{6}{4} + \binom{6}{3} + \binom{6}{2} + \binom{6}{1} + \binom{6}{0} = \sum_{k=0}^{6} \binom{6}{k}.$$
From part (a) of Corollary 1.1 this reduces to $2^6$.
In general, one finds that for each positive integer m, there are
$\sum_{k=0}^{m-1} \binom{m-1}{k} = 2^{m-1}$ compositions.
EXAMPLE 1.38
From Example 1.37 we know that there are $2^{12-1} = 2^{11} = 2048$ compositions of 12. If
our interest is in those compositions where each summand is even, then we consider, for
instance, compositions such as
2 + 4 + 6 = 2(1 + 2 + 3)        2 + 8 + 2 = 2(1 + 4 + 1)
8 + 2 + 2 = 2(4 + 1 + 1)        6 + 6 = 2(3 + 3).
In each of these four examples, the parenthesized expression is a composition of 6. This
observation indicates that the number of compositions of 12, where each summand is even,
equals the number of (all) compositions of 6, which is $2^{6-1} = 2^5 = 32$.
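The counts in Examples 1.37 and 1.38 are small enough to confirm by listing every composition. The generator below is a sketch for illustration; `compositions` is a hypothetical helper name, and the enumeration is exponential in m, so it is suitable only for small values.

```python
from collections import Counter


def compositions(m):
    """Yield every composition of m: ordered tuples of positive integers summing to m."""
    if m == 0:
        yield ()
        return
    for first in range(1, m + 1):
        # Choose the first summand, then compose what remains.
        for rest in compositions(m - first):
            yield (first,) + rest


if __name__ == "__main__":
    print(sorted(compositions(4)))
    # Number of compositions of 7 grouped by number of summands (compare Table 1.9).
    print(sorted(Counter(len(c) for c in compositions(7)).items()))
    # Compositions of 12 in which every summand is even (Example 1.38).
    print(sum(1 for c in compositions(12) if all(part % 2 == 0 for part in c)))
```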
Our next two examples provide applications from the area of computer science. Further-
more, the second example will lead to an important summation formula that we shall use
in many later chapters.
EXAMPLE 1.39
Consider the following program segment, where i, j, and k are integer variables.
for i := 1 to 20 do
  for j := 1 to i do
    for k := 1 to j do
      print (i * j + k)
How many times is the print statement executed in this program segment?
Among the possible choices for i, j, and k (in the order i first, j second, k third) that
will lead to execution of the print statement, we list (1) 1, 1, 1; (2) 2, 1, 1; (3) 15, 10, 1;
and (4) 15, 10, 7. We note that i = 10, j = 12, k = 5 is not one of the selections to be
considered, because j = 12 > 10 = i; this violates the condition set forth in the second
for loop. Each of the above four selections where the print statement is executed satisfies
the condition 1 ≤ k ≤ j ≤ i ≤ 20. In fact, any selection a, b, c (a ≤ b ≤ c) of size 3, with
repetitions allowed, from the list 1, 2, 3, ..., 20 results in one of the correct selections:
here, k = a, j = b, i = c. Consequently the print statement is executed
$$\binom{20 + 3 - 1}{3} = \binom{22}{3} = 1540 \text{ times.}$$
If there had been r (≥ 1) for loops instead of three, the print statement would have been
executed $\binom{20 + r - 1}{r}$ times.
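The count in Example 1.39 can be checked by running the loops and tallying the iterations instead of printing. The Python sketch below is a translation made for illustration; `count_prints` and `count_nested` are hypothetical names, and the second function models r nested loops as selections with repetition.

```python
from itertools import combinations_with_replacement
from math import comb


def count_prints(limit=20):
    """Run the three nested loops of the program segment and count the print executions."""
    count = 0
    for i in range(1, limit + 1):
        for j in range(1, i + 1):
            for k in range(1, j + 1):
                count += 1  # stands in for: print(i * j + k)
    return count


def count_nested(limit, r):
    """Count iterations of r nested loops of the same shape by listing the
    selections a_1 <= a_2 <= ... <= a_r from 1, 2, ..., limit."""
    return sum(1 for _ in combinations_with_replacement(range(1, limit + 1), r))


if __name__ == "__main__":
    print(count_prints(), comb(22, 3))
    print(count_nested(20, 4), comb(20 + 4 - 1, 4))
```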
EXAMPLE 1.40
Here we use a program segment to derive a summation formula. In this program segment,
the variables i, j, n, and counter are integer variables. Furthermore, we assume that the
value of n has been set prior to this segment.
counter := 0
for i := 1 to n do
  for j := 1 to i do
    counter := counter + 1
From the results in Example 1.39, after this segment is executed the value of (the variable)
counter will be $\binom{n+2-1}{2} = \binom{n+1}{2}$. (This is also the number of times that the statement
(*) counter := counter + 1
is executed.)
This result can also be obtained as follows: When i := 1, then j varies from 1 to 1 and
(*) is executed once; when i is assigned the value 2, then j varies from 1 to 2 and (*) is
executed twice; j varies from 1 to 3 when i is assigned the value 3, and (*) is executed three
times; in general, for 1 ≤ k ≤ n, when i := k, then j varies from 1 to k and (*) is executed
k times. In total, the variable counter is incremented [and the statement (*) is executed]
1 + 2 + 3 + ... + n times.
Consequently,
$$\sum_{i=1}^{n} i = 1 + 2 + 3 + \cdots + n = \binom{n+1}{2} = \frac{n(n+1)}{2}.$$
The derivation of this summation formula, obtained by counting the same result in two
different ways, constitutes a combinatorial proof.
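The two ways of counting in Example 1.40 can be compared directly for small n. The sketch below mirrors the program segment in Python; `run_segment` is a hypothetical name, and a numerical check of this kind illustrates the formula without replacing the combinatorial proof.

```python
from math import comb


def run_segment(n):
    """Execute the doubly nested loop and return the final value of counter."""
    counter = 0
    for i in range(1, n + 1):
        for j in range(1, i + 1):
            counter += 1
    return counter


if __name__ == "__main__":
    for n in (1, 2, 3, 10, 100):
        # Loop count, binomial coefficient, and closed form should agree.
        print(n, run_segment(n), comb(n + 1, 2), n * (n + 1) // 2)
```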
Our last example for this section introduces the idea of a run, a notion that arises in
statistics —in particular, in the detecting of trends in a statistical process.
EXAMPLE 1.41
The counter at Patti and Terri’s Bar has 15 bar stools. Upon entering the bar Darrell finds
the stools occupied as follows:
O O E O O O O E E E O O O E O,
where O indicates an occupied stool and E an empty one. (Here we are not concerned with
the occupants of the stools, just whether or not a stool is occupied.) In this case we say that
the occupancy of the 15 stools determines seven runs, as shown:
OO    E     OOOO    EEE    OOO    E     O
Run   Run   Run     Run    Run    Run   Run
In general, a run is a consecutive list of identical entries that are preceded and followed by
different entries or no entries at all.
A second way in which five E’s and 10 O’s can be arranged to provide seven runs is
E O O O E E O O E O O O O O E.
We want to find the total number of ways five E’s and 10 O’s can determine seven runs.
If the first run starts with an E, then there must be four runs of E’s and three runs of O’s.
Consequently, the last run must end with an E.
Let $x_1$ count the number of E’s in the first run, $x_2$ the number of O’s in the second run,
$x_3$ the number of E’s in the third run, ..., and $x_7$ the number of E’s in the seventh run. We
want to find the number of integer solutions for
$$x_1 + x_3 + x_5 + x_7 = 5, \quad x_1, x_3, x_5, x_7 > 0 \qquad (3)$$
and
$$x_2 + x_4 + x_6 = 10, \quad x_2, x_4, x_6 > 0. \qquad (4)$$
The number of integer solutions for Eq. (3) equals the number of integer solutions for
$$y_1 + y_3 + y_5 + y_7 = 1, \quad y_1, y_3, y_5, y_7 \ge 0.$$
This number is $\binom{4+1-1}{1} = \binom{4}{1} = 4$. Similarly, for Eq. (4), the number of solutions is
$\binom{3+7-1}{7} = \binom{9}{7} = 36$. Consequently, by the rule of product there are 4 · 36 = 144
arrangements of five E’s and 10 O’s that determine seven runs, the first run starting with E.
The seven runs may also have the first run starting with an O and the last run ending
with an O. So now let $w_1$ count the number of O’s in the first run, $w_2$ the number of E’s in
the second run, $w_3$ the number of O’s in the third run, ..., and $w_7$ the number of O’s in the
seventh run. Here we want the number of integer solutions for
$$w_1 + w_3 + w_5 + w_7 = 10, \quad w_1, w_3, w_5, w_7 > 0$$
and
$$w_2 + w_4 + w_6 = 5, \quad w_2, w_4, w_6 > 0.$$
Arguing as above, we find that the number of ways to arrange five E’s and 10 O’s, resulting
in seven runs where the first run starts with an O, is
$$\binom{4+6-1}{6}\binom{3+2-1}{2} = \binom{9}{6}\binom{4}{2} = (84)(6) = 504.$$
Consequently, by the rule of sum, the five E’s and 10 O’s can be arranged in 144 + 504 =
648 ways to produce seven runs.
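Since there are only C(15, 5) = 3003 arrangements of five E’s and 10 O’s, the result of Example 1.41 can be checked exhaustively. The following sketch is for illustration only; `run_count` and `arrangements_with_runs` are hypothetical names, and runs are counted with `itertools.groupby`, which groups consecutive equal entries.

```python
from itertools import combinations, groupby


def run_count(stools):
    """Number of runs: maximal blocks of consecutive identical entries."""
    return sum(1 for _ in groupby(stools))


def arrangements_with_runs(num_e, num_o, runs):
    """Count arrangements of num_e E's and num_o O's that determine exactly `runs` runs."""
    total = num_e + num_o
    count = 0
    for e_positions in combinations(range(total), num_e):
        stools = ["O"] * total
        for position in e_positions:
            stools[position] = "E"
        if run_count(stools) == runs:
            count += 1
    return count


if __name__ == "__main__":
    print(run_count("OOEOOOOEEEOOOEO"))
    print(arrangements_with_runs(5, 10, 7))
```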
EXERCISES 1.4
1. In how many ways can 10 (identical) dimes be distributed among five children if
(a) there are no restrictions? (b) each child gets at least one dime? (c) the oldest child
gets at least two dimes?
2. In how many ways can 15 (identical) candy bars be distributed among five children so
that the youngest gets only one or two of them?
3. Determine how many ways 20 coins can be selected from four large containers filled
with pennies, nickels, dimes, and quarters. (Each container is filled with only one type
of coin.)
4. A certain ice cream store has 31 flavors of ice cream available. In how many ways can
we order a dozen ice cream cones if (a) we do not want the same flavor more than once?
(b) a flavor may be ordered as many as 12 times? (c) a flavor may be ordered no more
than 11 times?
5. a) In how many ways can we select five coins from a collection of 10 consisting of one
penny, one nickel, one dime, one quarter, one half-dollar, and five (identical) Susan B.
Anthony dollars?
b) In how many ways can we select n objects from a collection of size 2n that consists
of n distinct and n identical objects?
6. Answer Example 1.32, where the 12 symbols being transmitted are four A’s, four B’s,
and four C’s.
7. Determine the number of integer solutions of
$$x_1 + x_2 + x_3 + x_4 = 32,$$
where
a) $x_i \ge 0$, 1 ≤ i ≤ 4
b) $x_i > 0$, 1 ≤ i ≤ 4
c) $x_1, x_2 \ge 5$, $x_3, x_4 \ge 7$
d) $x_i \ge 8$, 1 ≤ i ≤ 4
e) $x_i \ge -2$, 1 ≤ i ≤ 4
f) $x_1, x_2, x_3 > 0$, $0 < x_4 \le 25$
8. In how many ways can a teacher distribute eight chocolate donuts and seven jelly donuts
among three student helpers if each helper wants at least one donut of each kind?
9. Columba has two dozen each of n different colored beads. If she can select 20 beads
(with repetitions of colors allowed) in 230,230 ways, what is the value of n?
10. In how many ways can Lisa toss 100 (identical) dice so that at least three of each type
of face will be showing?
11. Two n-digit integers (leading zeros allowed) are considered equivalent if one is a
rearrangement of the other. (For example, 12033, 20331, and 01332 are considered
equivalent five-digit integers.) (a) How many five-digit integers are not equivalent?
(b) If the digits 1, 3, and 7 can appear at most once, how many nonequivalent five-digit
integers are there?
12. Determine the number of integer solutions for
$$x_1 + x_2 + x_3 + x_4 + x_5 < 40,$$
where
a) $x_i \ge 0$, 1 ≤ i ≤ 5
b) $x_i \ge -3$, 1 ≤ i ≤ 5
13. In how many ways can we distribute eight identical white balls into four distinct
containers so that (a) no container is left empty? (b) the fourth container has an odd
number of balls in it?
14. a) Find the coefficient of $v^2 w^4 x z$ in the expansion of $(3v + 2w + x + y + z)^8$.
b) How many distinct terms arise in the expansion in part (a)?
15. In how many ways can Beth place 24 different books on four shelves so that there is at
least one book on each shelf? (For any of these arrangements consider the books on each
shelf to be placed one next to the other, with the first book at the left of the shelf.)
16. For which positive integer n will the equations
(1) $x_1 + x_2 + x_3 + \cdots + x_{19} = n$, and
(2) $y_1 + y_2 + y_3 + \cdots + y_{64} = n$
have the same number of positive integer solutions?
17. How many ways are there to place 12 marbles of the same size in five distinct jars if
(a) the marbles are all black? (b) each marble is a different color?
18. a) How many nonnegative integer solutions are there to the pair of equations
$x_1 + x_2 + x_3 + \cdots + x_7 = 37$, $x_1 + x_2 + x_3 = 6$?
b) How many solutions in part (a) have $x_1, x_2, x_3 > 0$?
19. How many times is the print statement executed for the following program segment?
(Here, i, j, k, and m are integer variables.)
for i := 1 to 20 do
  for j := 1 to i do
    for k := 1 to j do
      for m := 1 to k do
        print (i * j) + (k * m)
20. In the following program segment, i, j, k, and counter are integer variables. Determine
the value that the variable counter will have after the segment is executed.
counter := 10
for i := 1 to 15 do
  for j := i to 15 do
    for k := j to 15 do
      counter := counter + 1
21. Find the value of sum after the given program segment is executed. (Here i, j, k,
increment, and sum are integer variables.)
increment := 0
sum := 0
for i := 1 to 10 do
  for j := 1 to i do
    for k := 1 to j do
      begin
        increment := increment + 1
        sum := sum + increment
      end
22. Consider the following program segment, where i, j, k, n, and counter are integer
variables and the value of n (a positive integer) is set prior to this segment.
counter := 0
for i := 1 to n do
  for j := 1 to i do
    for k := 1 to j do
      counter := counter + 1
We shall determine, in two different ways, the number of times the statement
counter := counter + 1
is executed. (This is also the value of counter after execution of the program segment.)
From the result in Example 1.39, we know that the statement is executed
$\binom{n+3-1}{3} = \binom{n+2}{3}$ times. For a fixed value of i, the for loops involving j and k result
in $\binom{i+1}{2}$ executions of the counter increment statement. Consequently,
$\binom{n+2}{3} = \sum_{i=1}^{n} \binom{i+1}{2}$. Use this result to obtain a summation formula for
$$1^2 + 2^2 + 3^2 + \cdots + n^2 = \sum_{i=1}^{n} i^2.$$
23. a) Given positive integers m, n with m ≥ n, show that the number of ways to distribute
m identical objects into n distinct containers with no container left empty is
C(m - 1, m - n) = C(m - 1, n - 1).
b) Show that the number of distributions in part (a) where each container holds at least
r objects (m ≥ nr) is C(m - 1 + (1 - r)n, n - 1).
24. Write a computer program (or develop an algorithm) to list the integer solutions for
a) $x_1 + x_2 + x_3 = 10$, $0 \le x_i$, 1 ≤ i ≤ 3
b) $x_1 + x_2 + x_3 + x_4 = 4$, $-2 \le x_i$, 1 ≤ i ≤ 4
25. Consider the $2^{19}$ compositions of 20. (a) How many have each summand even?
(b) How many have each summand a multiple of 4?
26. Let n, m, k be positive integers with n = mk. How many compositions of n have each
summand a multiple of k?
27. Frannie tosses a coin 12 times and gets five heads and seven tails. In how many ways
can these tosses result in (a) two runs of heads and one run of tails; (b) three runs;
(c) four runs; (d) five runs; (e) six runs; and (f) equal numbers of runs of heads and
runs of tails?
28. a) For n ≥ 4, consider the strings made up of n bits (that is, a total of n 0’s and 1’s).
In particular, consider those strings where there are (exactly) two occurrences of 01.
For example, if n = 6 we want to include strings such as 010010 and 100101, but not
101111 or 010101. How many such strings are there?
b) For n ≥ 6, how many strings of n 0’s and 1’s contain (exactly) three occurrences
of 01?
c) Provide a combinatorial proof for the following: For n ≥ 1,
$$2^n = \binom{n+1}{1} + \binom{n+1}{3} + \cdots + \begin{cases} \binom{n+1}{n}, & n \text{ odd} \\ \binom{n+1}{n+1}, & n \text{ even.} \end{cases}$$
1.5
The Catalan Numbers (Optional)
In this section a very prominent sequence of numbers is introduced. This sequence arises in
a wide variety of combinatorial situations. We'll begin by examining one specific instance
where it is found.
EXAMPLE 1.42
Let us start at the point (0, 0) in the xy-plane and consider two kinds of moves:
R: (x, y) → (x + 1, y)        U: (x, y) → (x, y + 1).
We want to know how we can move from (0, 0) to (5, 5) using such moves, one unit to
the right or one unit up. So we’ll need five R’s and five U’s. At this point we have a situation
like that in Example 1.14, so we know there are 10!/(5! 5!) = $\binom{10}{5}$ such paths. But now
we’ll add a twist! In going from (0, 0) to (5, 5) one may touch but never rise above the line
y = x. Consequently, we want to include paths such as those shown in parts (a) and (b) of
Fig. 1.9 but not the path shown in part (c).
The first thing that is evident is that each such arrangement of five R’s and five U’s must
start with an R (and end with a U). Then as we move across this type of arrangement—
going from left to right — the number of R’s at any point must equal or exceed the number
of U’s. Note how this happens in parts (a) and (b) of Fig. 1.9 but not in part (c). Now we
can solve the problem at hand if we can count the paths [like the one in part (c)] that go
from (0, 0) to (5, 5) but rise above the line y = x. Look again at the path in part (c) of
Fig. 1.9. Where does the situation there break down for the first time? After all, we start
with the requisite R — then follow it by a U. So far, so good! But then there is a second U
and, at this (first) time, the number of U’s exceeds the number of R’s.
Now let us consider the following transformation:
R, U, U, | U, R, R, R, U, U, R  ↔  R, U, U, | R, U, U, U, R, R, U.
What have we done here? For the path on the left-hand side of the transformation, we
located the first move (the second U) where the path rose above the line y = x. The moves
up to and including this move (the second U) remain as is, but the moves that follow are
interchanged — each U is replaced by an R and each R by a U. The result is the path on
the right-hand side of the transformation — an arrangement of four R’s and six U’s, as seen
in part (d) of Fig. 1.9. Part (e) of that figure provides another path to be avoided; part (f)
shows what happens when this path is transformed by the method described above. Now
suppose we start with an arrangement of six U’s and four R’s, say
R, U, R, R, U, U, U, | U, U, R.
[Figure 1.9: six lattice paths drawn against the line y = x. Parts (a), (b), (c), and (e) run from (0, 0) to (5, 5); parts (d) and (f) run from (0, 0) to (4, 6).
(a) R, U, R, R, U, R, R, U, U, U
(b) R, R, U, U, R, U, R, R, U, U
(c) R, U, U, U, R, R, R, U, U, R
(d) R, U, U, R, U, U, U, R, R, U
(e) U, U, R, U, R, R, R, U, R, U
(f) U, R, U, R, U, U, U, R, U, R]
Focus on the first place where the number of U’s exceeds the number of R’s. Here it is in
the seventh position, the location of the fourth U. This arrangement is now transformed
as follows: The moves up to and including the fourth U remain as they are; the last three
moves are interchanged — each U is replaced by an R, each R by a U. This results in the
arrangement
R, U, R, R, U, U, U, | R, R, U,
one of the bad arrangements (of five R’s and five U’s) we wish to avoid as we go from
(0, 0) to (5, 5). The correspondence established by these transformations gives us a way
to count the number of bad arrangements. We alternatively count the number of ways to
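arrange four R’s and six U’s; this is 10!/(4! 6!) = $\binom{10}{4}$. Consequently, the number of ways
to go from (0, 0) to (5, 5) without rising above the line y = x is
$$\binom{10}{5} - \binom{10}{4} = \frac{10!}{5!\,5!} - \frac{10!}{4!\,6!} = \frac{6(10)! - 5(10)!}{6!\,5!} = \frac{1}{6}\left(\frac{10!}{5!\,5!}\right) = \frac{1}{5+1}\binom{10}{5} = \frac{1}{5+1}\binom{2 \cdot 5}{5} = 42.$$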
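The reflection argument of Example 1.42 can be checked by brute force. The following is a minimal sketch, not part of the original text; the helper names `paths`, `first_violation`, and `reflect` are hypothetical and chosen only for illustration. It lists all arrangements of five R’s and five U’s, applies the transformation described above to every bad arrangement, and confirms that the images are exactly the arrangements of four R’s and six U’s.

```python
from itertools import combinations
from math import comb


def paths(n_r, n_u):
    """Yield every arrangement of n_r R's and n_u U's as a string."""
    total = n_r + n_u
    for r_positions in combinations(range(total), n_r):
        chosen = set(r_positions)
        yield "".join("R" if i in chosen else "U" for i in range(total))


def first_violation(path):
    """Index of the first move where the U's outnumber the R's, else None."""
    excess = 0  # (number of U's) - (number of R's) seen so far
    for i, move in enumerate(path):
        excess += 1 if move == "U" else -1
        if excess > 0:
            return i
    return None


def reflect(path):
    """Keep the moves up to the first violation; interchange R and U after it."""
    i = first_violation(path)
    swap = {"R": "U", "U": "R"}
    return path[: i + 1] + "".join(swap[m] for m in path[i + 1:])


all_paths = list(paths(5, 5))
good = [p for p in all_paths if first_violation(p) is None]
bad = [p for p in all_paths if first_violation(p) is not None]

# Images of the bad paths versus all arrangements of four R's and six U's.
images = {reflect(p) for p in bad}
target = set(paths(4, 6))
```

Under these assumptions, `images == target` holds and `len(good)` equals 42, in agreement with the computation above. The call `reflect("RUUURRRUUR")` reproduces the path in part (d) of Fig. 1.9.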
The above result generalizes as follows. For any integer n ≥ 0, the number of paths
(made up of n R’s and n U’s) going from (0, 0) to (n, n), without rising above the line
y = x, is
$$b_n = \binom{2n}{n} - \binom{2n}{n-1} = \frac{1}{n+1}\binom{2n}{n}, \quad n \ge 1, \qquad b_0 = 1.$$
The numbers $b_0, b_1, b_2, \ldots$ are called the Catalan numbers, after the Belgian mathematician
Eugène Charles Catalan (1814–1894), who used them in determining the number of ways to
parenthesize the product $x_1 x_2 x_3 x_4 \cdots x_n$. For instance, the five (= $b_3$) ways to parenthesize
$x_1 x_2 x_3 x_4$ are:
$(((x_1 x_2) x_3) x_4)$   $((x_1 (x_2 x_3)) x_4)$   $((x_1 x_2)(x_3 x_4))$   $(x_1 ((x_2 x_3) x_4))$   $(x_1 (x_2 (x_3 x_4)))$.
The first seven Catalan numbers are $b_0 = 1$, $b_1 = 1$, $b_2 = 2$, $b_3 = 5$, $b_4 = 14$, $b_5 = 42$, and
$b_6 = 132$.
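As an illustrative aside (not from the original text), the two forms of the formula for the Catalan numbers can be evaluated with exact integer arithmetic. The function names below are hypothetical.

```python
from math import comb


def catalan(n):
    """Closed form: C(2n, n) / (n + 1), which is always an integer."""
    return comb(2 * n, n) // (n + 1)


def catalan_by_difference(n):
    """Difference form: C(2n, n) - C(2n, n - 1) for n >= 1, with b_0 = 1."""
    return 1 if n == 0 else comb(2 * n, n) - comb(2 * n, n - 1)


first_seven = [catalan(n) for n in range(7)]
```

Here `first_seven` evaluates to the list 1, 1, 2, 5, 14, 42, 132 given above, and the two functions agree for every n tried.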
EXAMPLE 1.43
Here are some other situations where the Catalan numbers arise. Some of these examples
are very much like the result in Example 1.42. A change in vocabulary is often the only
difference.
a) In how many ways can one arrange three 1’s and three −1’s so that all six partial
sums (starting with the first summand) are nonnegative? There are five (= $b_3$) such
arrangements:
1, 1, 1, −1, −1, −1     1, 1, −1, 1, −1, −1     1, −1, 1, 1, −1, −1
1, 1, −1, −1, 1, −1     1, −1, 1, −1, 1, −1
In general, for n ≥ 0, one can arrange n 1’s and n −1’s, with all 2n partial sums
nonnegative, in $b_n$ ways.
b) Given four 1’s and four 0’s, there are 14 (= $b_4$) ways to list these eight symbols so
that in each list the number of 0’s never exceeds the number of 1’s (as a list is read
from left to right). The following shows these 14 lists:
10101010   11001010   11100010
10101100   11001100   11100100
10110010   11010010   11101000
10110100   11010100
10111000   11011000   11110000
For n ≥ 0, there are $b_n$ such lists of n 1’s and n 0’s.
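The lists in part (b) can be generated mechanically. The sketch below is an illustration only, with a hypothetical function name: it builds each list one symbol at a time, appending a 0 only when the 0’s written so far are fewer than the 1’s.

```python
def ballot_lists(n):
    """Return all lists of n 1's and n 0's (as strings) in which, reading
    left to right, the number of 0's never exceeds the number of 1's."""
    results = []

    def extend(prefix, ones, zeros):
        if ones == n and zeros == n:
            results.append(prefix)
            return
        if ones < n:          # another 1 is always allowed while any remain
            extend(prefix + "1", ones + 1, zeros)
        if zeros < ones:      # a 0 is allowed only if it keeps zeros <= ones
            extend(prefix + "0", ones, zeros + 1)

    extend("", 0, 0)
    return results
```

For n = 4 this returns the 14 lists shown above (in a different order), and for n = 3 it returns five lists. Replacing each 0 by −1 gives the arrangements of part (a).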
c) Table 1.10
(((ab)c)d)   (((abc   111000
((a(bc))d)   ((a(bc   110100
((ab)(cd))   ((ab(c   110010
(a((bc)d))   (a((bc   101100
(a(b(cd)))   (a(b(c   101010
Consider the first column in Table 1.10. Here we find five ways to parenthesize the
product abcd. The first of these is (((ab)c)d). Reading left to right, we list the three
occurrences of the left parenthesis “(” and the letters a, b, c, maintaining the order
in which these six symbols occur. This results in (((abc, the first expression in column 2
of Table 1.10. Likewise, ((a(bc))d) in column 1 corresponds to ((a(bc in column 2,
and so on, for the other three entries in each of columns 1 and 2. Now one
can also go backward, from column 2 to column 1. Take an expression in column 2
and append “d)” to the right end. For instance, ((ab(c becomes ((ab(cd). Reading
this new expression from left to right, we now insert a right parenthesis “)” whenever
a product of two results arises. So, for example, ((ab(cd) becomes
((ab)(cd)),
where the first “)” inserted is for the product of a and b, and the last is for the product of (ab) and (cd).
The correspondence between the entries in columns 2 and 3 is more immediate.
For an entry in column 2 replace each “(” by a “1” and each letter by a “0”. Reversing
this process, we replace each “1” by a “(”, the first 0 by a, the second by b, and the
third by c. This takes us from the entries in column 3 to those in column 2.
Now consider the correspondence between columns 1 and 3. (This correspondence
arises from the correspondence between columns 1 and 2 and the one between columns
2 and 3.) It shows us that the number of ways to parenthesize the product abcd equals
the number of ways to list three 1’s and three 0’s so that, as such a list is read from left
to right, the number of 1’s always equals or exceeds the number of 0’s. The number
of ways here is 5 (= $b_3$).
In general, one can parenthesize the product $x_1 x_2 x_3 \cdots x_n$ in $b_{n-1}$ ways.
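The correspondences among the three columns of Table 1.10 can be sketched in code. This is an illustration under stated assumptions, not the author’s own procedure: the function names are hypothetical, the letters of the product are assumed distinct, and the step “insert a right parenthesis whenever a product of two results arises” is carried out here with a stack.

```python
def to_column2(expr):
    """Column 1 -> column 2: keep each '(' and every letter but the last."""
    letters = [ch for ch in expr if ch.isalpha()]
    last = letters[-1]
    return "".join(ch for ch in expr
                   if ch == "(" or (ch.isalpha() and ch != last))


def to_bits(expr):
    """Column 1 -> column 3: '(' becomes 1, each kept letter becomes 0."""
    return "".join("1" if ch == "(" else "0" for ch in to_column2(expr))


def from_bits(bits, letters):
    """Column 3 -> column 1. `letters` has one more letter than bits has 0's."""
    supply = iter(letters)
    symbols = ["(" if b == "1" else next(supply) for b in bits]
    symbols.append(next(supply))  # the final letter, e.g. the "d" of "d)"
    stack = []
    for s in symbols:
        stack.append(s)
        # Two finished results sitting on an open "(" form a product: close it.
        while (len(stack) >= 3 and stack[-1] != "("
               and stack[-2] != "(" and stack[-3] == "("):
            right, left = stack.pop(), stack.pop()
            stack.pop()  # discard the matching "("
            stack.append("(" + left + right + ")")
    return stack[0]
```

For example, `to_bits("((ab)(cd))")` gives `"110010"`, and `from_bits("110010", "abcd")` gives `"((ab)(cd))"`, matching the third row of Table 1.10.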
d) Let us arrange the integers 1, 2, 3, 4, 5, 6 in two rows of three so that (1) the integers
increase in value as each row is read, from left to right, and (2) in any column the
smaller integer is on top. For example, one way to do this is
1 2 4
3 5 6
Now consider three 1’s and three 0’s. Arrange these six symbols in a list so that
the 1’s are in positions 1, 2, 4 (the top row) and the 0’s are in positions 3, 5, 6 (the
bottom row). The result is 110100. Reversing the process, start with another list, say
101100 (where the number of 0’s never exceeds the number of 1’s, as the list is read
from left to right). The 1’s are in positions 1, 3, 4 and the 0’s are in positions 2, 5, 6.
This corresponds to the arrangement
1 3 4
2 5 6
which satisfies conditions (1) and (2), as stated above. From this correspondence we
learn that the number of ways to arrange 1, 2, 3, 4, 5, 6, so that conditions (1) and (2)
are satisfied, is the number of ways to arrange three 1’s and three 0’s in a list so that
as the six symbols are read, from left to right, the number of 0’s never exceeds the
number of 1’s. Consequently, one can arrange 1, 2, 3, 4, 5, 6 and satisfy conditions (1)
and (2) in b3 (= 5) ways.
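The correspondence in part (d) is easy to express in code. The following sketch uses hypothetical helper names and is offered only as an illustration: positions of the 1’s form the top row, positions of the 0’s form the bottom row, and a brute-force count over all choices of top row confirms the total.

```python
from itertools import combinations


def list_to_rows(bits):
    """Positions (1-based) of the 1's give the top row, of the 0's the bottom."""
    top = [i for i, b in enumerate(bits, start=1) if b == "1"]
    bottom = [i for i, b in enumerate(bits, start=1) if b == "0"]
    return top, bottom


def rows_to_list(top, bottom):
    """Inverse of list_to_rows: write a 1 in each top-row position."""
    top_set = set(top)
    n = len(top) + len(bottom)
    return "".join("1" if i in top_set else "0" for i in range(1, n + 1))


def satisfies_conditions(top, bottom):
    """Condition (1): rows increase. Condition (2): smaller integer on top."""
    rows_increase = top == sorted(top) and bottom == sorted(bottom)
    columns_ok = all(t < b for t, b in zip(top, bottom))
    return rows_increase and columns_ok


# Count the valid arrangements of 1, ..., 6 in two rows of three.
count = sum(
    1
    for top in combinations(range(1, 7), 3)
    if satisfies_conditions(list(top), sorted(set(range(1, 7)) - set(top)))
)
```

With these definitions, `list_to_rows("110100")` returns the rows 1 2 4 and 3 5 6 of the first arrangement above, and `count` equals 5.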
In closing let us mention that the Catalan numbers will come up in other sections — in
particular, Section 5 of Chapter 10. Further examples can be found in reference [3] by
M. Gardner. For even more results about these numbers one should consult the references
for Chapter 10.
1. Verify that for each integer n ≥ 1,
$$\binom{2n}{n} - \binom{2n}{n-1} = \frac{1}{n+1}\binom{2n}{n}.$$
2. Determine the value of $b_7$, $b_8$, $b_9$, and $b_{10}$.
3. a) In how many ways can one travel in the xy-plane from (0, 0) to (3, 3) using the moves R: (x, y) → (x + 1, y) and U: (x, y) → (x, y + 1), if the path taken may touch but never fall below the line y = x? In how many ways from (0, 0) to (4, 4)?
b) Generalize the results in part (a).
c) What can one say about the first and last moves of the paths in parts (a) and (b)?
4. Consider the moves
R: (x, y) → (x + 1, y) and U: (x, y) → (x, y + 1),
as in Example 1.42. In how many ways can one go
a) from (0, 0) to (6, 6) and not rise above the line y = x?
b) from (2, 1) to (7, 6) and not rise above the line y = x − 1?
c) from (3, 8) to (10, 15) and not rise above the line y = x + 5?
5. Find the other three ways to arrange 1, 2, 3, 4, 5, 6 in two rows of three so that the conditions in part (d) of Example 1.43 are satisfied.
6. There are $b_4$ (= 14) ways to arrange 1, 2, 3, . . . , 8 in two rows of four so that (1) the integers increase in value as each row is read, from left to right, and (2) in any column the smaller integer is on top. Find, as in part (d) of Example 1.43,
a) the arrangements that correspond to each of the following.
i) 10110010   ii) 11001010   iii) 11101000
b) the lists of four 1’s and four 0’s that correspond to each of these arrangements of 1, 2, 3, . . . , 8.
i) 1 3 4 5     ii) 1 2 3 7     iii) 1 2 4 5
   2 6 7 8         4 5 6 8          3 6 7 8
7. In how many ways can one parenthesize the product abcdef?
8. There are 132 ways in which one can parenthesize the product abcdefg.
a) Determine, as in part (c) of Example 1.43, the list of five 1’s and five 0’s that corresponds to each of the following.
i) (((ab)c)(d(ef)))
ii) (a(b(c(d(ef)))))
iii) ((((ab)(cd))e)f)
b) Find, as in Example 1.43, the way to parenthesize abcdef that corresponds to each given list of five 1’s and five 0’s.
i) 1110010100
ii) 1100110010
iii) 1011100100
9. Consider drawing n semicircles on and above a horizontal line, with no two semicircles intersecting. In parts (a) and (b) of Fig. 1.10 we find the two ways this can be done for n = 2; the results for n = 3 are shown in parts (c)-(g).
[Figure 1.10: parts (a) and (b) show the two drawings for n = 2; parts (c)-(g) show the drawings for n = 3.]
i) How many different drawings are there for four semicircles?
ii) How many for any n ≥ 0? Explain why.
10. a) In how many ways can one go from (0, 0) to (7, 3) if the only moves permitted are R: (x, y) → (x + 1, y) and U: (x, y) → (x, y + 1), and the number of U’s may never exceed the number of R’s along the path taken?
b) Let m, n be positive integers with m > n. Answer the question posed in part (a), upon replacing 7 by m and 3 by n.
11. Twelve patrons, six each with a $5 bill and the other six each with a $10 bill, are the first to arrive at a movie theater, where the price of admission is five dollars. In how many ways can these 12 individuals (all loners) line up so that the number with a $5 bill is never exceeded by the number with a $10 bill (and, as a result, the ticket seller is always able to make any necessary change from the bills taken in from the first 11 of these 12 patrons)?
1.6
Summary and Historical Review
In this first chapter we introduced the fundamentals for counting combinations, permuta-
tions, and arrangements in a large variety of problems. The breakdown of problems into
components requiring the same or different formulas for their solutions provided a key
insight into the areas of discrete and combinatorial mathematics. This is somewhat similar
to the top-down approach for developing algorithms in a structured programming lan-
guage. Here one develops the algorithm for the solution of a difficult problem by first
considering major subproblems that need to be solved. These subproblems are then further
refined — subdivided into more easily workable programming tasks. Each level of refine-
ment improves on the clarity, precision, and thoroughness of the algorithm until it is readily
translatable into the code of the programming language.
Table 1.11 summarizes the major counting formulas we have developed so far. Here
we are dealing with a collection of n distinct objects. The formulas count the number of
ways to select, or order, with or without repetitions, r of these n objects. The summaries of
Chapters 5 and 9 include other such charts that evolve as we extend our investigations into
other counting methods.
Table 1.11
Order Is Relevant | Repetitions Are Allowed | Type of Result | Formula | Location in Text
Yes | No | Permutation | P(n, r) = n!/(n − r)!, 0 ≤ r ≤ n | Page 7
Yes | Yes | Arrangement | n^r, n, r ≥ 0 | Page 7
No | No | Combination | C(n, r) = n!/[r!(n − r)!] = $\binom{n}{r}$, 0 ≤ r ≤ n | Page 15
No | Yes | Combination with repetition | $\binom{n+r-1}{r}$, n, r ≥ 0 | Page 27
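The four formulas of Table 1.11 can be compared against direct enumeration. The sketch below is an illustration only; the function names are hypothetical, and it assumes n ≥ 1 and 0 ≤ r ≤ n so that every formula in the table applies.

```python
from itertools import (combinations, combinations_with_replacement,
                       permutations, product)
from math import comb, factorial


def table_1_11(n, r):
    """Evaluate the four counting formulas for r of n distinct objects."""
    return {
        "permutation": factorial(n) // factorial(n - r),
        "arrangement": n ** r,
        "combination": comb(n, r),
        "combination with repetition": comb(n + r - 1, r),
    }


def brute_force(n, r):
    """Count the same four kinds of selections by listing every one."""
    objects = range(n)
    return {
        "permutation": sum(1 for _ in permutations(objects, r)),
        "arrangement": sum(1 for _ in product(objects, repeat=r)),
        "combination": sum(1 for _ in combinations(objects, r)),
        "combination with repetition":
            sum(1 for _ in combinations_with_replacement(objects, r)),
    }
```

For n = 5 and r = 3 both functions give 60 permutations, 125 arrangements, 10 combinations, and 35 combinations with repetition.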
As we continue to investigate further principles of enumeration, as well as discrete
mathematical structures for applications in coding theory, enumeration, optimization, and
sorting schemes in computer science, we shall rely on the fundamental ideas introduced in
this chapter.
The notion of permutation can be found in the Hebrew work Sefer Yetzirah (The Book of
Creation), a manuscript written by a mystic sometime between 200 and 600. However, even
earlier, it is of interest to note that a result of Xenocrates of Chalcedon (396-314 B.C.) may
possibly contain “the first attempt on record to solve a difficult problem in permutations
and combinations.” For further details consult page 319 of the text by T. L. Heath [4],
as well as page 113 of the article by N. L. Biggs [1], a valuable source on the history
of enumeration. The first textbook dealing with some of the material we discussed in this
chapter was Ars Conjectandi by the Swiss mathematician Jakob Bernoulli (1654–1705). The
text was published posthumously in 1713 and contained a reprint of the first formal treatise
on probability. This treatise had been written in 1657 by Christiaan Huygens (1629-1695),
the Dutch physicist, mathematician, and astronomer who discovered the rings of Saturn.
The binomial theorem for n = 2 appears in the work of Euclid (300 B.C.), but it was not
until the sixteenth century that the term “binomial coefficient” was actually introduced by
Michel Stifel (1486–1567). In his Arithmetica Integra (1544) he gives the binomial coefficients
up to the order of n = 17. Blaise Pascal (1623-1662), in his research on probability,
published in the 1650s a treatise dealing with the relationships among binomial coefficients,
combinations, and polynomials. These results were used by Jakob Bernoulli in proving the
general form of the binomial theorem in a manner analogous to that presented in this chapter.
Actual use of the symbol $\binom{n}{r}$ did not begin until the nineteenth century, when it was
used by Andreas von Ettinghausen (1796-1878).
Blaise Pascal (1623-1662)
It was not until the twentieth century, however, that the advent of the computer made
possible the systematic analysis of processes and algorithms used to generate permutations
and combinations. We shall examine one such algorithm in Section 10.1.
The first comprehensive textbook dealing with topics in combinations and permutations
was written by W. A. Whitworth [10]. Also dealing with the material of this chapter are
Chapter 2 of D. I. Cohen [2], Chapter 1 of C. L. Liu [5], Chapter 2 of F. S. Roberts [6],
Chapter 4 of K. H. Rosen [7], Chapter 1 of H. J. Ryser [8], and Chapter 5 of A. Tucker [9].
REFERENCES
1. Biggs, Norman L. “The Roots of Combinatorics.” Historia Mathematica 6 (1979): pp. 109-
136.
2. Cohen, Daniel I. A. Basic Techniques of Combinatorial Theory. New York: Wiley, 1978.
3. Gardner, Martin. “Mathematical Games, Catalan Numbers: An Integer Sequence that Materi-
alizes in Unexpected Places.” Scientific American 234, no. 6 (June 1976): pp. 120-125.
4. Heath, Thomas Little. A History of Greek Mathematics, vol. 1. Reprint of the 1921 edition.
New York: Dover Publications, 1981.
5. Liu, C. L. Introduction to Combinatorial Mathematics. New York: McGraw-Hill, 1968.
6. Roberts, Fred S. Applied Combinatorics. Englewood Cliffs, N.J.: Prentice-Hall, 1984.
7. Rosen, Kenneth H. Discrete Mathematics and Its Applications, 5th ed. New York: McGraw-Hill, 2003.
8. Ryser, H. J. Combinatorial Mathematics. Published by the Mathematical Association of
America. New York: Wiley, 1963.
9. Tucker, Alan. Applied Combinatorics, 4th ed. New York: Wiley, 2002.
10. Whitworth, W. A. Choice and Chance. Reprint of the 1901 edition. New York: Hafner, 1965.
SUPPLEMENTARY EXERCISES
1. In the manufacture of a certain type of automobile, four kinds of major defects and seven kinds of minor defects can occur. For those situations in which defects do occur, in how many ways can there be twice as many minor defects as there are major ones?
2. A machine has nine different dials, each with five settings labeled 0, 1, 2, 3, and 4.
a) In how many ways can all the dials on the machine be set?
b) If the nine dials are arranged in a line at the top of the machine, how many of the machine settings have no two adjacent dials with the same setting?
3. Twelve points are placed on the circumference of a circle and all the chords connecting these points are drawn. What is the largest number of points of intersection for these chords?
4. A choir director must select six hymns for a Sunday church service. She has three hymn books, each containing 25 hymns (there are 75 different hymns in all). In how many ways can she select the hymns if she wishes to select (a) two hymns from each book? (b) at least one hymn from each book?
5. How many ways are there to place 25 different flags on 10 numbered flagpoles if the order of the flags on a flagpole is (a) not relevant? (b) relevant? (c) relevant and every flagpole flies at least one flag?
6. A penny is tossed 60 times yielding 45 heads and 15 tails. In how many ways could this have happened so that there were no consecutive tails?
7. There are 12 men at a dance. (a) In how many ways can eight of them be selected to form a cleanup crew? (b) How many ways are there to pair off eight women at the dance with eight of these 12 men?
8. In how many ways can the letters in WONDERING be arranged with exactly two consecutive vowels?
9. Dustin has a set of 180 distinct blocks. Each of these blocks is made of either wood or plastic and comes in one of three sizes (small, medium, large), five colors (red, white, blue, yellow, green), and six shapes (triangular, square, rectangular, hexagonal, octagonal, circular). How many of the blocks in this set differ from
a) the small red wooden square block in exactly one way? (For example, the small red plastic square block is one such block.)
b) the large blue plastic hexagonal block in exactly two ways? (For example, the small red plastic hexagonal block is one such block.)
10. Mr. and Mrs. Richardson want to name their new daughter so that her initials (first, middle, and last) will be in alphabetical order with no repeated initial. How many such triples of initials can occur under these circumstances?
11. In how many ways can the 11 identical horses on a carousel be painted so that three are brown, three are white, and five are black?
12. In how many ways can a teacher distribute 12 different science books among 16 students if (a) no student gets more than one book? (b) the oldest student gets two books but no other student gets more than one book?
13. Four numbers are selected from the following list of numbers: −5, −4, −3, −2, −1, 1, 2, 3, 4. (a) In how many ways can the selections be made so that the product of the four numbers is positive and (i) the numbers are distinct? (ii) each number may be selected as many as four times? (iii) each number may be selected at most three times? (b) Answer part (a) with the product of the four numbers negative.
14. Waterbury Hall, a university residence hall for men, is operated under the supervision of Mr. Kelly. The residence has three floors, each of which is divided into four sections. This coming fall Mr. Kelly will have 12 resident assistants (one for each of the 12 sections). Among these 12 assistants are the four senior assistants: Mr. DiRocco, Mr. Fairbanks, Mr. Hyland, and Mr. Thornhill. (The other eight assistants will be new this fall and are designated as junior assistants.) In how many ways can Mr. Kelly assign his 12 assistants if
a) there are no restrictions?
b) Mr. DiRocco and Mr. Fairbanks must both be assigned to the first floor?
c) Mr. Hyland and Mr. Thornhill must be assigned to different floors?
15. a) How many of the 9000 four-digit integers 1000, 1001, 1002, . . . , 9998, 9999 have four distinct digits that are either increasing (as in 1347 and 6789) or decreasing (as in 6421 and 8653)?
b) How many of the 9000 four-digit integers 1000, 1001, 1002, . . . , 9998, 9999 have four digits that are either nondecreasing (as in 1347, 1226, and 7778) or nonincreasing (as in 6421, 6622, and 9888)?
16. a) Find the coefficient of $x^2 y z^2$ in the expansion of $[(x/2) + y - 3z]^5$.
b) How many distinct terms are there in the complete expansion of $[(x/2) + y - 3z]^5$?
c) What is the sum of all coefficients in the complete expansion?
17. a) In how many ways can 10 people, denoted A, B, . . . , I, J, be seated about the rectangular table shown in Fig. 1.11, where Figs. 1.11(a) and 1.11(b) are considered the same but are considered different from Fig. 1.11(c)?
b) In how many of the arrangements of part (a) are A and B seated on longer sides of the table across from each other?
[Figure 1.11: three seatings, (a), (b), and (c), of the ten people A through J about a rectangular table.]
18. a) Determine the number of nonnegative integer solutions to the pair of equations
$x_1 + x_2 + x_3 = 6$,   $x_1 + x_2 + \cdots + x_5 = 15$,   $x_i \ge 0$, $1 \le i \le 5$.
b) Answer part (a) with the pair of equations replaced by the pair of inequalities
$x_1 + x_2 + x_3 \le 6$,   $x_1 + x_2 + \cdots + x_5 \le 15$,   $x_i \ge 0$, $1 \le i \le 5$.
19. For any given set in a tennis tournament, opponent A can beat opponent B in seven different ways. (At 6-6 they play a tie breaker.) The first opponent to win three sets wins the tournament. (a) In how many ways can scores be recorded with A winning in five sets? (b) In how many ways can scores be recorded with the tournament requiring at least four sets?
20. Given n distinct objects, determine in how many ways r of these objects can be arranged in a circle, where arrangements are considered the same if one can be obtained from the other by rotation.
21. For every positive integer n, show that
$$\binom{n}{0} + \binom{n}{2} + \binom{n}{4} + \cdots = \binom{n}{1} + \binom{n}{3} + \binom{n}{5} + \cdots$$
22. a) In how many ways can the letters in UNUSUAL be arranged?
b) For the arrangements in part (a), how many have all three U’s together?
c) How many of the arrangements in part (a) have no consecutive U’s?
23. Francesca has 20 different books but the shelf in her dormitory residence will hold only 12 of them.
a) In how many ways can Francesca line up 12 of these books on her bookshelf?
b) How many of the arrangements in part (a) include Francesca’s three books on tennis?
24. Determine the value of the integer variable counter after execution of the following program segment. (Here i, j, k, l, m, and n are integer variables. The variables r, s, and t are also integer variables; their values, where r ≥ 1, s ≥ 5, and t ≥ 7, have been set prior to this segment.)
counter := 10
for i := 1 to 12 do
  for j := 1 to r do
    counter := counter + 2
for k := 5 to s do
  for l := 3 to k do
    counter := counter + 4
for m := 3 to 12 do
  counter := counter + 6
for n := t downto 7 do
  counter := counter + 8
25. a) Find the number of ways to write 17 as a sum of 1’s and 2’s if order is relevant.
b) Answer part (a) for 18 in place of 17.
c) Generalize the results in parts (a) and (b) for n odd and for n even.
26. a) In how many ways can 17 be written as a sum of 2’s and 3’s if the order of the summands is (i) not relevant? (ii) relevant?
b) Answer part (a) for 18 in place of 17.
27. a) If n and r are positive integers with n ≥ r, how many solutions are there to
$x_1 + x_2 + \cdots + x_r = n$,
where each $x_i$ is a positive integer, for 1 ≤ i ≤ r?
b) In how many ways can a positive integer n be written as a sum of r positive integer summands (1 ≤ r ≤ n) if the order of the summands is relevant?
28. a) In how many ways can one travel in the xy-plane from (1, 2) to (5, 9) if each move is one of the following types:
(R): (x, y) → (x + 1, y);   (U): (x, y) → (x, y + 1)?
b) Answer part (a) if a third (diagonal) move
(D): (x, y) → (x + 1, y + 1)
is also possible.
29. a) In how many ways can a particle move in the xy-plane from the origin to the point (7, 4) if the moves that are allowed are of the form:
(R): (x, y) → (x + 1, y);   (U): (x, y) → (x, y + 1)?
b) How many of the paths in part (a) do not use the path from (2, 2) to (3, 2) to (4, 2) to (4, 3) shown in Fig. 1.12?
c) Answer parts (a) and (b) if a third type of move
(D): (x, y) → (x + 1, y + 1)
is also allowed.
[Figure 1.12: the grid from the origin to (7, 4), showing the path from (2, 2) to (3, 2) to (4, 2) to (4, 3).]
30. Due to their outstanding academic records, Donna and Katalin are the finalists for the outstanding physics student (in their college graduating class). A committee of 14 faculty members will each select one of the candidates to be the winner and place his or her choice (checked off on a ballot) into the ballot box. Suppose that Katalin receives nine votes and Donna receives five. In how many ways can the ballots be selected, one at a time, from the ballot box so that there are always more votes in favor of Katalin? [This is a special case of a general problem called, appropriately, the ballot problem. This problem was solved by Joseph Louis François Bertrand (1822-1900).]
31. Consider the 8 × 5 grid shown in Fig. 1.13. How many different rectangles (with integer-coordinate corners) does this grid contain? [For example, there is a rectangle (square) with corners (1, 1), (2, 1), (2, 2), (1, 2), a second rectangle with corners (3, 2), (4, 2), (4, 4), (3, 4), and a third with corners (5, 0), (7, 0), (7, 3), (5, 3).]
[Figure 1.13: the 8 × 5 grid, with x running from 0 to 8 and y from 0 to 5.]
32. As head of quality control, Silvia examined 15 motors, one at a time, and found six defective (D) motors and nine in good (G) working condition. If she listed each finding (of D or G) after examining each individual motor, in how many ways could Silvia’s list start with a run of three G’s and have six runs in total?
33. In order to graduate on schedule, Hunter must take (and pass) four mathematics electives during his final six quarters. If he may select these electives from a list of 12 (that are offered every quarter) and he does not want to take more than one of these electives in any given quarter, in how many ways can he select and schedule these four electives?
34. In how many ways can a family of four (mother, father, and two children) be seated at a round table, with eight other people, so that the parents are seated next to each other and there is one child on a side of each parent? (Two seatings are considered the same if one can be rotated to look like the other.)
2
Fundamentals of Logic
In the first chapter we derived a summation formula in Example 1.40 (Section 1.4). We
obtained this formula by counting the same collection of objects (the statements that were
executed in a certain program segment) in two different ways and then equating the results.
Consequently, we say that the formula was established by a combinatorial proof. This is
one of many different techniques for arriving at a proof.
In this chapter we take a close look at what constitutes a valid argument and a more
conventional proof. When a mathematician wishes to provide a proof for a given situation,
he or she must use a system of logic. This is also true when a computer scientist develops
the algorithms needed for a program or system of programs. The logic of mathematics is
applied to decide whether one statement follows from, or is a logical consequence of, one
or more other statements.
Some of the rules that govern this process are described in this chapter. We shall use these
rules in proofs (provided in the text and required in the exercises) throughout subsequent
chapters. However, at no time can we hope to arrive at a point at which we can apply the
rules in an automatic fashion. As in applying the counting ideas discussed in Chapter 1,
we should always analyze and seek to understand the situation given. This often calls for
attributes we cannot learn in a book, such as insight and creativity. Merely trying to apply
formulas or invoke rules will not get us very far either in proving results (such as theorems)
or in doing enumeration problems.
2.1
Basic Connectives and Truth Tables
In the development of any mathematical theory, assertions are made in the form of sen-
tences. Such verbal or written assertions, called statements (or propositions), are declarative
sentences that are either true or false, but not both. For example, the following are statements,
and we use the lowercase letters of the alphabet (such as p, q, and r) to represent
these statements.
p: Combinatorics is a required course for sophomores.
q: Margaret Mitchell wrote Gone with the Wind.
r: 2 + 3 = 5.
On the other hand, we do not regard sentences such as the exclamation
“What a beautiful evening!”
or the command
“Get up and do your exercises.”
as statements since they do not have truth values (true or false).
The preceding statements represented by the letters p, q, and r are considered to be
primitive statements, for there is really no way to break them down into anything simpler.
New statements can be obtained from existing ones in two ways.
1) Transform a given statement p into the statement ¬p, which denotes its negation and
is read “Not p.”
For the statement p above, ¬p is the statement “Combinatorics is not a required
course for sophomores.” (We do not consider the negation of a primitive statement
to be a primitive statement.)
2) Combine two or more statements into a compound statement, using the following
logical connectives.
a) Conjunction: The conjunction of the statements p, q is denoted by p ∧ q, which
is read “p and q.” In our example the compound statement p ∧ q is read “Combinatorics
is a required course for sophomores, and Margaret Mitchell wrote Gone
with the Wind.”
b) Disjunction: The expression p ∨ q denotes the disjunction of the statements p, q
and is read “p or q.” Hence “Combinatorics is a required course for sophomores,
or Margaret Mitchell wrote Gone with the Wind” is the verbal translation for
p ∨ q, when p, q are as above. We use the word “or” in the inclusive sense here.
Consequently, p ∨ q is true if one or the other of p, q is true or if both of the
statements p, q are true. In English we sometimes write “and/or” to point this out.
The exclusive “or” is denoted by p ⊻ q. The compound statement p ⊻ q is true if
one or the other of p, q is true but not both of the statements p, q are true. One
way to express p ⊻ q for the example here is “Combinatorics is a required course
for sophomores, or Margaret Mitchell wrote Gone with the Wind, but not both.”
c) Implication: We say that “p implies q” and write p → q to designate the statement,
which is the implication of q by p. Alternatively, we can also say
(i) “If p, then q.”   (ii) “p is sufficient for q.”
(iii) “p is a sufficient condition for q.”   (iv) “q is necessary for p.”
(v) “q is a necessary condition for p.”   (vi) “p only if q.”
A verbal translation of p → q for our example is “If combinatorics is a required
course for sophomores, then Margaret Mitchell wrote Gone with the Wind.” The
statement p is called the hypothesis of the implication; q is called the conclusion.
When statements are combined in this manner, there need not be any causal
relationship between the statements for the implication to be true.
d) Biconditional: Last, the biconditional of two statements p, q is denoted by p ↔ q,
which is read “p if and only if q,” or “p is necessary and sufficient for q.” For
our p, q, “Combinatorics is a required course for sophomores if and only if
Margaret Mitchell wrote Gone with the Wind” conveys the meaning of p ↔ q.
We sometimes abbreviate “p if and only if q” as “p iff q.”
Throughout our discussion on logic we must realize that a sentence such as
“The number x is an integer.”
is not a statement because its truth value (true or false) cannot be determined until a numerical
value is assigned for x. If x were assigned the value 7, the result would be a true
statement. Assigning x a value such as 1/2, √2, or π, however, would make the resulting
statement false. (We shall encounter this type of situation again in Sections 2.4 and 2.5 of
this chapter.)
In the foregoing discussion, we mentioned the circumstances under which the compound
statements p ∨ q, p ⊻ q are considered true, on the basis of the truth of their components
p, q. This idea of the truth or falsity of a compound statement being dependent only on the
truth values of its components is worth further investigation. Tables 2.1 and 2.2 summarize
the truth and falsity of the negation and the different kinds of compound statements on the
basis of the truth values of their components. In constructing such truth tables, we write
“0” for false and “1” for true.
Table 2.1

p | ¬p
0 |  1
1 |  0

Table 2.2

p  q | p ∧ q | p ∨ q | p ⊻ q | p → q | p ↔ q
0  0 |   0   |   0   |   0   |   1   |   1
0  1 |   0   |   1   |   1   |   1   |   0
1  0 |   0   |   1   |   1   |   0   |   0
1  1 |   1   |   1   |   0   |   1   |   1
The four possible truth assignments for p, g can be listed in any order. For later work,
the particular order presented here will prove useful.
We see that the columns of truth values for p and ¬p are the opposite of each other. The
statement p ∧ q is true only when both p, q are true, whereas p ∨ q is false only when both
the component statements p, q are false. As we noted before, p ⊻ q is true when exactly
one of p, q is true.
For the implication p → q, the result is true in all cases except where p is true and q
is false. We do not want a true statement to lead us into believing something that is false.
However, we regard as true a statement such as “If 2 + 3 = 6, then 2 + 4 = 7,” even though
the statements “2 + 3 = 6” and “2 + 4 = 7” are both false.
Finally, the biconditional p ↔ q is true when the statements p, q have the same truth
value and is false otherwise.
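The entries of Tables 2.1 and 2.2 can also be generated mechanically. The following is a minimal Python sketch, assuming truth values are represented by the integers 0 and 1 as in the tables. The function names (neg, conj, disj, xor, implies, iff, table_2_2) are illustrative choices and are not notation from the text.

```python
from itertools import product

# Truth values are the integers 0 (false) and 1 (true), as in Tables 2.1 and 2.2.

def neg(p):
    return 1 - p                  # negation of p

def conj(p, q):
    return p & q                  # conjunction: 1 only when both are 1

def disj(p, q):
    return p | q                  # inclusive or: 0 only when both are 0

def xor(p, q):
    return p ^ q                  # exclusive or: 1 when exactly one is 1

def implies(p, q):
    return 0 if (p == 1 and q == 0) else 1   # false only for p = 1, q = 0

def iff(p, q):
    return 1 if p == q else 0     # 1 when p, q have the same truth value

def table_2_2():
    """Rows of Table 2.2, listed in the order 00, 01, 10, 11."""
    return [(p, q, conj(p, q), disj(p, q), xor(p, q), implies(p, q), iff(p, q))
            for p, q in product((0, 1), repeat=2)]

for row in table_2_2():
    print(*row)
```

Each printed row lists p, q and then the five compound columns in the same left-to-right order as Table 2.2.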
Now that we have been introduced to certain concepts, let us investigate a little further
some of these initial ideas about connectives. Our first two examples should prove useful
for such an investigation.
EXAMPLE 2.1 Let s, t, and u denote the following primitive statements:
s: Phyllis goes out for a walk.
t: The moon is out.
u: It is snowing.
The following English sentences provide possible translations for the given (symbolic)
compound statements.
a) (t ∧ ¬u) → s: If the moon is out and it is not snowing, then Phyllis goes out for a
walk.
b) t → (¬u → s): If the moon is out, then if it is not snowing Phyllis goes out for a
walk. [So ¬u → s is understood to mean (¬u) → s as opposed to ¬(u → s).]
c) ¬(s ↔ (u ∨ t)): It is not the case that Phyllis goes out for a walk if and only if it is
snowing or the moon is out.
Now we will work in reverse order and examine the logical (or symbolic) notation for
three given English sentences:
d) “Phyllis will go out walking if and only if the moon is out.” Here the words “if
and only if” indicate that we are dealing with a biconditional. In symbolic form this
becomes s ↔ t.
e) “If it is snowing and the moon is not out, then Phyllis will not go out for a walk.”
This compound statement is an implication where the hypothesis is also a compound
statement. One may express this statement in symbolic form as (u ∧ ¬t) → ¬s.
f) “It is snowing but Phyllis will still go out for a walk.” Now we come across a new
connective — namely, but. In our study of logic we shall follow the convention that
the connectives but and and convey the same meaning. Consequently, this sentence
may be represented as u ∧ s.
Now let us return to the results in Table 2.2, particularly the sixth column. For if this is
one’s first encounter with the truth table for the implication p → q, then it may be somewhat
difficult to accept the stated entries — especially the results in the first two rows (where p has
the truth value 0). The following example should help make these truth value assignments
easier to grasp.
EXAMPLE 2.2 Consider the following scenario. It is almost the week before Christmas and Penny will be
attending several parties that week. Ever conscious of her weight, she plans not to weigh
herself until the day after Christmas. Considering what those parties may do to her waistline
by then, she makes the following resolution for the December 26 outcome: “If I weigh more
than 120 pounds, then I shall enroll in an exercise class.”
Here we let p and q denote the (primitive) statements
p: I weigh more than 120 pounds.
q: I shall enroll in an exercise class.
Then Penny’s statement (implication) is given by p → q.
We shall consider the truth values of this particular example of p → q for the rows of
Table 2.2. Consider first the easier cases in rows 4 and 3.
• Row 4: p and q both have the truth value 1. On December 26 Penny finds that she
weighs more than 120 pounds and promptly enrolls in an exercise class, just as she said
she would. Here we consider p → q to be true and assign it the truth value 1.
• Row 3: p has the truth value 1, q has the truth value 0. Now that December 26 has
arrived, Penny finds her weight to be over 120 pounds, but she makes no attempt to enroll
in an exercise class. In this case we feel that Penny has broken her resolution; in other
words, the implication p → q is false (and has the truth value 0).
The cases in rows 1 and 2 may not immediately agree with our intuition, but the example
should make these results a little easier to accept.
• Row 1: p and q both have the truth value 0. Here Penny finds that on December 26
her weight is 120 pounds or less and she does not enroll in an exercise class. She has not
violated her resolution; we take her statement p → q to be true and assign it the truth
value 1.
• Row 2: p has the truth value 0, q has the truth value 1. This last case finds Penny
weighing 120 pounds or less on December 26 but still enrolling in an exercise class.
Perhaps her weight is 119 or 120 pounds and she feels this is still too high. Or maybe
she wants to join an exercise class because she thinks it will be good for her health. No
matter what the reason, she has not gone against her resolution p → q. Once again, we
accept this compound statement as true, assigning it the truth value 1.
Our next example discusses a related notion: the decision (or selection) structure in
computer programming.
EXAMPLE 2.3 In computer science the if-then and if-then-else decision structures arise (in various formats)
in high-level programming languages such as Java and C++. The hypothesis p is often
a relational expression such as x > 2. This expression then becomes a (logical) statement
that has the truth value 0 or 1, depending on the value of the variable x at that point in
the program. The conclusion q is usually an “executable statement.” (So q is not one of
the logical statements that we have been discussing.) When dealing with “if p then q,” in
this context, the computer executes q only on the condition that p is true. For p false, the
computer goes to the next instruction in the program sequence. For the decision structure
“if p then q else r,” q is executed when p is true and r is executed when p is false.
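The two decision structures just described might be sketched in Python as follows. This is only an illustration under stated assumptions: the condition x > 2 comes from the text, while the function names if_then and if_then_else and the strings standing in for the executable statements q and r are hypothetical.

```python
def if_then(x):
    """'if p then q': q is executed only when the condition p (here x > 2) is true."""
    log = []
    if x > 2:                        # p: a relational expression, true or false
        log.append("q executed")     # q: an executable statement, not a logical one
    log.append("next instruction")   # reached whether or not p was true
    return log

def if_then_else(x):
    """'if p then q else r': q is executed when p is true, r when p is false."""
    if x > 2:
        return "q executed"
    else:
        return "r executed"

print(if_then(5), if_then(1))
print(if_then_else(5), if_then_else(1))
```

Note that for a false condition the if-then structure does nothing at all, which differs from the logical implication p → q, where a false hypothesis makes the whole statement true.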
Before continuing, a word of caution: Be careful when using the symbols → and ↔. The
implication and the biconditional are not the same, as evidenced by the last two columns
of Table 2.2.
In our everyday language, however, we often find situations where an implication is used
when the intention actually calls for a biconditional. For example, consider the following
implications that a certain parent might direct to his or her child.
s → t: If you do your homework, then you will get to watch the baseball game.
t → s: You will get to watch the baseball game only if you do your homework.
• Case 1: The implication s → t. When the parent says to the child, “If you do your
homework, then you will get to watch the baseball game,” he or she is trying a positive
approach by emphasizing the enjoyment in watching the baseball game.
• Case 2: The implication t → s. Here we find the negative approach and the parent who
warns the child in saying, “You will get to watch the baseball game only if you do your
homework.” This parent places the emphasis on the punishment (lack of enjoyment) to
be incurred.
In either case, the parent probably wants his or her implication, be it s → t or t → s,
to be understood as the biconditional s ↔ t. For in case 1 the parent wants to hint at the
punishment while promising the enjoyment; in case 2, where the punishment has been
used (perhaps, to threaten), if the child does in fact do the homework, then that child will
definitely be given the opportunity to enjoy watching the baseball game.
In scientific writing one must make every effort to be unambiguous: when an implication
is given, it ordinarily cannot, and should not, be interpreted as a biconditional.
Definitions are a notable exception, which we shall discuss in Section 2.5.
Before we continue let us take a step back. When we summarized the material that
gave us Tables 2.1 and 2.2, we may not have stressed enough that the results were for any
statements p, q, not just primitive statements p, q. Examples 2.4 through 2.6 should help
to reinforce this.
EXAMPLE 2.4 Let us examine the truth table for the compound statement “Margaret Mitchell wrote Gone
with the Wind, and if 2 + 3 ≠ 5, then combinatorics is a required course for sophomores.”
In symbolic notation this statement is written as q ∧ (¬r → p), where p, q, and r represent
the primitive statements introduced at the start of this section. The last column of Table 2.3
contains the truth values for this result. We obtained these truth values by using the fact
that the conjunction of any two statements is true if and only if both statements are true.
This is what we said earlier in Table 2.2, and now one of our statements, namely the
implication ¬r → p, is definitely a compound statement, not a primitive one. Columns
4, 5, and 6 in this table show how we build the truth table up by considering smaller parts
of the compound statement and by using the results from Tables 2.1 and 2.2.
Table 2.3

p  q  r | ¬r | ¬r → p | q ∧ (¬r → p)
0  0  0 |  1 |    0   |      0
0  0  1 |  0 |    1   |      0
0  1  0 |  1 |    0   |      0
0  1  1 |  0 |    1   |      1
1  0  0 |  1 |    1   |      0
1  0  1 |  0 |    1   |      0
1  1  0 |  1 |    1   |      1
1  1  1 |  0 |    1   |      1
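The column-by-column construction of Table 2.3 can be mirrored in a short Python sketch. It assumes 0/1 truth values as in the table; the names implies and table_2_3 are illustrative only.

```python
from itertools import product

def implies(a, b):
    """Truth value of a → b: false only when a = 1 and b = 0 (Table 2.2)."""
    return 0 if (a == 1 and b == 0) else 1

def table_2_3():
    """Build each row from smaller parts, as in columns 4, 5, and 6 of Table 2.3."""
    rows = []
    for p, q, r in product((0, 1), repeat=3):
        not_r = 1 - r               # column 4: ¬r
        inner = implies(not_r, p)   # column 5: ¬r → p
        whole = q & inner           # column 6: q ∧ (¬r → p)
        rows.append((p, q, r, not_r, inner, whole))
    return rows

for row in table_2_3():
    print(*row)
```

Computing the intermediate columns first is the same strategy the text uses: the final column depends only on q and on the already computed value of ¬r → p.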
EXAMPLE 2.5 In Table 2.4 we develop the truth tables for the compound statements p ∨ (q ∧ r) (column 5)
and (p ∨ q) ∧ r (column 7).
Table 2.4

p  q  r | q ∧ r | p ∨ (q ∧ r) | p ∨ q | (p ∨ q) ∧ r
0  0  0 |   0   |      0      |   0   |      0
0  0  1 |   0   |      0      |   0   |      0
0  1  0 |   0   |      0      |   1   |      0
0  1  1 |   1   |      1      |   1   |      1
1  0  0 |   0   |      1      |   1   |      0
1  0  1 |   0   |      1      |   1   |      1
1  1  0 |   0   |      1      |   1   |      0
1  1  1 |   1   |      1      |   1   |      1
Because the truth values in columns 5 and 7 differ (in rows 5 and 7), we must avoid
writing a compound statement such as p ∨ q ∧ r. Without parentheses to indicate which of
the connectives ∨ and ∧ should be applied first, we have no idea whether we are dealing
with p ∨ (q ∧ r) or (p ∨ q) ∧ r.
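The disagreement between the two groupings can be located by a brute-force check. The sketch below assumes the row order of Table 2.4 (000 through 111) and 0/1 truth values; the function name differing_rows is hypothetical.

```python
from itertools import product

def differing_rows():
    """Row numbers (1-based, in the order of Table 2.4) where the two groupings disagree."""
    rows = []
    for number, (p, q, r) in enumerate(product((0, 1), repeat=3), start=1):
        left = p | (q & r)       # p ∨ (q ∧ r)
        right = (p | q) & r      # (p ∨ q) ∧ r
        if left != right:
            rows.append(number)
    return rows

print(differing_rows())
```

The parentheses in the code play the same role as in the text: each expression states explicitly which connective is applied first.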
Our last example for this section illustrates two special types of statements.
EXAMPLE 2.6 The results in columns 4 and 7 of Table 2.5 reveal that the statement p → (p ∨ q) is true and
that the statement p ∧ (¬p ∧ q) is false for all truth value assignments for the component
statements p, q.
Table 2.5

p  q | p ∨ q | p → (p ∨ q) | ¬p | ¬p ∧ q | p ∧ (¬p ∧ q)
0  0 |   0   |      1      |  1 |    0   |      0
0  1 |   1   |      1      |  1 |    1   |      0
1  0 |   1   |      1      |  0 |    0   |      0
1  1 |   1   |      1      |  0 |    0   |      0
Definition 2.1 A compound statement is called a tautology if it is true for all truth value assignments for
its component statements. If a compound statement is false for all such assignments, then
it is called a contradiction.
Throughout this chapter we shall use the symbol T₀ to denote any tautology and the
symbol F₀ to denote any contradiction.
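Definition 2.1 suggests a direct test: evaluate the statement under every truth value assignment. The following Python sketch does this for the two statements of Example 2.6. It assumes a statement is modeled as a function of n arguments taking the values 0 and 1; the helper names (truth_column, is_tautology, is_contradiction) are illustrative.

```python
from itertools import product

def truth_column(statement, n):
    """Truth values of `statement` over all 2**n assignments to n primitive statements."""
    return [statement(*values) for values in product((0, 1), repeat=n)]

def is_tautology(statement, n):
    return all(value == 1 for value in truth_column(statement, n))

def is_contradiction(statement, n):
    return all(value == 0 for value in truth_column(statement, n))

def implies(a, b):
    return 0 if (a == 1 and b == 0) else 1

first = lambda p, q: implies(p, p | q)       # p → (p ∨ q)
second = lambda p, q: p & ((1 - p) & q)      # p ∧ (¬p ∧ q)

print(is_tautology(first, 2), is_contradiction(second, 2))
```

A statement such as p → q is neither: is_tautology and is_contradiction both return False for it.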
We can use the ideas of tautology and implication to describe what we mean by a valid
argument. This will be of primary interest to us in Section 2.3, and it will help us develop
needed skills for proving mathematical theorems. In general, an argument starts with a list
of given statements called premises and a statement called the conclusion of the argument.
We examine these premises, say p₁, p₂, p₃, ..., pₙ, and try to show that the conclusion
q follows logically from these given statements; that is, we try to show that if each of
p₁, p₂, p₃, ..., pₙ is a true statement, then the statement q is also true. One way to do so
is to examine the implication
(p₁ ∧ p₂ ∧ p₃ ∧ ··· ∧ pₙ)† → q,
where the hypothesis is the conjunction of the n premises. If any one of p₁, p₂, p₃, ..., pₙ is
false, then no matter what truth value q has, the implication (p₁ ∧ p₂ ∧ p₃ ∧ ··· ∧ pₙ) → q
is true. Consequently, if we start with the premises p₁, p₂, p₃, ..., pₙ, each with truth
value 1, and find that under these circumstances q also has the value 1, then the implication
(p₁ ∧ p₂ ∧ p₃ ∧ ··· ∧ pₙ) → q
is a tautology and we have a valid argument.
†At this point we have dealt only with the conjunction of two statements, so we must point out that the
conjunction p₁ ∧ p₂ ∧ p₃ ∧ ··· ∧ pₙ of n statements is true if and only if each pᵢ, 1 ≤ i ≤ n, is true. We shall
deal with this generalized conjunction in detail in Example 4.16 of Section 4.2.
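The test for a valid argument described above can be sketched in Python. This is a brute-force illustration, assuming premises and the conclusion are modeled as 0/1-valued functions of the same n primitive statements; is_valid and the two sample arguments are hypothetical names and choices, not taken from the text.

```python
from itertools import product

def implies(a, b):
    return 0 if (a == 1 and b == 0) else 1

def is_valid(premises, conclusion, n):
    """True when (p1 ∧ p2 ∧ ... ∧ pk) → q is true for every assignment to n primitives."""
    for values in product((0, 1), repeat=n):
        # Generalized conjunction: true only when every premise is true.
        hypothesis = all(premise(*values) for premise in premises)
        if hypothesis and not conclusion(*values):
            return False      # all premises true, conclusion false
    return True

# Premises p and p → q, conclusion q.
valid = is_valid([lambda p, q: p, lambda p, q: implies(p, q)], lambda p, q: q, 2)
# Premises q and p → q, conclusion p (fails for p = 0, q = 1).
invalid = is_valid([lambda p, q: q, lambda p, q: implies(p, q)], lambda p, q: p, 2)
print(valid, invalid)
```

Only the assignments in which every premise is true need to be inspected, since a false premise makes the implication true regardless of the conclusion.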
EXERCISES 2.1
1. Determine whether each of the following sentences is a statement.
a) In 2003 George W. Bush was the president of the United States.
b) x + 3 is a positive integer.
c) Fifteen is an even number.
d) If Jennifer is late for the party, then her cousin Zachary will be quite angry.
e) What time is it?
f) As of June 30, 2003, Christine Marie Evert had won the French Open a record seven times.
2. Identify the primitive statements in Exercise 1.
3. Let p, q be primitive statements for which the implication p → q is false. Determine the truth values for each of the following.
a) p ∧ q   b) ¬p ∨ q   c) q → p   d) ¬q → ¬p
4. Let p, q, r, s denote the following statements:
p: I finish writing my computer program before lunch.
q: I shall play tennis in the afternoon.
r: The sun is shining.
s: The humidity is low.
Write the following in symbolic form.
a) If the sun is shining, I shall play tennis this afternoon.
b) Finishing the writing of my computer program before lunch is necessary for my playing tennis this afternoon.
c) Low humidity and sunshine are sufficient for me to play tennis this afternoon.
5. Let p, q, r denote the following statements about a particular triangle ABC.
p: Triangle ABC is isosceles.
q: Triangle ABC is equilateral.
r: Triangle ABC is equiangular.
Translate each of the following into an English sentence.
a) q → p   b) ¬p → ¬q   c) q ↔ r   d) p ∧ ¬q   e) r → p
6. Determine the truth value of each of the following implications.
a) If 3 + 4 = 12, then 3 + 2 = 6.
b) If 3 + 3 = 6, then 3 + 4 = 9.
c) If Thomas Jefferson was the third president of the United States, then 2 + 3 = 5.
7. Rewrite each of the following statements as an implication in the if-then form.
a) Practicing her serve daily is a sufficient condition for Darci to have a good chance of winning the tennis tournament.
b) Fix my air conditioner or I won’t pay the rent.
c) Mary will be allowed on Larry’s motorcycle only if she wears her helmet.
8. Construct a truth table for each of the following compound statements, where p, q, r denote primitive statements.
a) ¬(p ∨ ¬q) → ¬p   b) p → (q → r)
c) (p → q) → r   d) (p → q) → (q → p)
e) [p ∧ (p → q)] → q   f) (p ∧ q) → p
g) q ↔ (¬p ∨ ¬q)
h) [(p → q) ∧ (q → r)] → (p → r)
9. Which of the compound statements in Exercise 8 are tautologies?
10. Verify that [p → (q → r)] → [(p → q) → (p → r)] is a tautology.
11. a) How many rows are needed for the truth table of the compound statement (p ∨ ¬q) ↔ [(¬r ∧ s) → t], where p, q, r, s, and t are primitive statements?
b) Let p₁, p₂, ..., pₙ denote n primitive statements. Let p be a compound statement that contains at least one occurrence each of pᵢ, for 1 ≤ i ≤ n, and p contains no other primitive statement. How many rows are needed to construct the truth table for p?
12. Determine all truth value assignments, if any, for the primitive statements p, q, r, s, t that make each of the following compound statements false.
a) [(p ∧ q) ∧ r] → (s ∨ t)
b) [p ∧ (q ∧ r)] → (s ⊻ t)
13. If statement q has the truth value 1, determine all truth value assignments for the primitive statements, p, r, and s for which the truth value of the statement
(q → [(¬p ∨ r) ∧ ¬s]) ∧ [¬s → (¬r ∧ q)]
is 1.
14. At the start of a program (written in pseudocode) the integer variable n is assigned the value 7. Determine the value of n after each of the following successive statements is encountered during the execution of this program. [Here the value of n following the execution of the statement in part (a) becomes the value of n for the statement in part (b), and so on, through the statement in part (d). For positive integers a, b, ⌊a/b⌋ returns the integer part of the quotient; for example, ⌊6/2⌋ = 3, ⌊7/2⌋ = 3, ⌊2/5⌋ = 0, and ⌊8/3⌋ = 2.]
a) if n > 5 then n := n + 2
b) if ((n + 2 = 8) or (n − 3 = 6)) then
      n := 2 * n + 1
c) if ((n − 3 = 16) and (⌊n/6⌋ = 1)) then
      n := n + 3
d) if ((n ≠ 21) and (n − 7 = 15)) then
      n := n − 4
15. The integer variables m and n are assigned the values 3 and 8, respectively, during the execution of a program (written in pseudocode). Each of the following successive statements is then encountered during program execution. [Here the values of m, n following the execution of the statement in part (a) become the values of m, n for the statement in part (b), and so on, through the statement in part (e).] What are the values of m, n after each of these statements is encountered?
a) if n − m = 5 then n := n − 2
b) if ((2 * m = n) and (⌊n/4⌋ = 1)) then
      n := 4 * m − 3
c) if ((n < 8) or (⌊m/2⌋ = 2)) then n := 2 * m
      else m := 2 * n
d) if ((m < 20) and (⌊n/6⌋ = 1)) then
      m := m − n − 5
e) if ((n = 2 * m) or (⌊n/2⌋ = 5)) then
      m := m + 2
16. In the following program segment i, j, m, and n are integer variables. The values of m and n are supplied by the user earlier in the execution of the total program.
for i := 1 to m do
   for j := 1 to n do
      if i ≠ j then
         print i + j
How many times is the print statement in the segment executed when (a) m = 10, n = 10; (b) m = 20, n = 20; (c) m = 10, n = 20; (d) m = 20, n = 10?
17. After baking a pie for the two nieces and two nephews who are visiting her, Aunt Nellie leaves the pie on her kitchen table to cool. Then she drives to the mall to close her boutique for the day. Upon her return she finds that someone has eaten one-quarter of the pie. Since no one was in her house that day (except for the four visitors), Aunt Nellie questions each niece and nephew about who ate the piece of pie. The four “suspects” tell her the following:
Charles: Kelly ate the piece of pie.
Dawn: I did not eat the piece of pie.
Kelly: Tyler ate the pie.
Tyler: Kelly lied when she said I ate the pie.
If only one of these four statements is true and only one of the four committed this heinous crime, who is the vile culprit that Aunt Nellie will have to punish severely?
2.2 Logical Equivalence: The Laws of Logic
In all areas of mathematics we need to know when the entities we are studying are equal or
essentially the same. For example, in arithmetic and algebra we know that two nonzero real
numbers are equal when they have the same magnitude and algebraic sign. Hence, for two
nonzero real numbers x, y, we have x = y if |x| = |y| and xy > 0, and conversely (that is,
if x = y, then |x| = |y| and xy > 0). When we deal with triangles in geometry, the notion
of congruence arises. Here triangle ABC and triangle DEF are congruent if, for instance,
they have equal corresponding sides, that is, the length of side AB = the length of side
DE, the length of side BC = the length of side EF, and the length of side CA = the length
of side FD.
Our study of logic is often referred to as the algebra of propositions (as opposed to the
algebra of real numbers). In this algebra we shall use the truth tables of the statements,
or propositions, to develop an idea of when two such entities are essentially the same. We
begin with an example.
EXAMPLE 2.7 For primitive statements p and q, Table 2.6 provides the truth tables for the compound
statements ¬p ∨ q and p → q. Here we see that the corresponding truth tables for the two
statements ¬p ∨ q and p → q are exactly the same.
Table 2.6

p  q | ¬p | ¬p ∨ q | p → q
0  0 |  1 |    1   |   1
0  1 |  1 |    1   |   1
1  0 |  0 |    0   |   0
1  1 |  0 |    1   |   1
This situation leads us to the following idea.
Definition 2.2 Two statements s₁, s₂ are said to be logically equivalent, and we write s₁ ⇔ s₂, when the
statement s₁ is true (respectively, false) if and only if the statement s₂ is true (respectively,
false).
Note that when s₁ ⇔ s₂ the statements s₁ and s₂ provide the same truth tables because
s₁, s₂ have the same truth values for all choices of truth values for their primitive components.
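Definition 2.2 can be checked by comparing truth tables row by row. The sketch below is one way to do that in Python, assuming statements are modeled as 0/1-valued functions of n primitive statements; logically_equivalent is an illustrative name, and implies is defined straight from the truth table so that the comparison with ¬p ∨ q is not circular.

```python
from itertools import product

def logically_equivalent(s1, s2, n):
    """True when s1 and s2 agree on every assignment to their n primitive statements."""
    return all(s1(*values) == s2(*values) for values in product((0, 1), repeat=n))

def implies(p, q):
    """p → q, defined directly from its truth table: false only for p = 1, q = 0."""
    return 0 if (p == 1 and q == 0) else 1

def not_p_or_q(p, q):
    return (1 - p) | q            # ¬p ∨ q

print(logically_equivalent(not_p_or_q, implies, 2))                    # Example 2.7
print(logically_equivalent(implies, lambda p, q: implies(q, p), 2))    # p → q versus q → p
```

The second comparison fails because p → q and q → p differ when exactly one of p, q is true.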
As a result of this concept we see that we can express the connective for the implication (of
primitive statements) in terms of negation and disjunction; that is, (p → q) ⇔ ¬p ∨ q.
In the same manner, from the result in Table 2.7 we have (p ↔ q) ⇔ (p → q) ∧ (q → p),
and this helps validate the use of the term biconditional. Using the logical equivalence from
Table 2.6, we find that we can also write (p ↔ q) ⇔ (¬p ∨ q) ∧ (¬q ∨ p). Consequently,
if we so choose, we can eliminate the connectives → and ↔ from compound statements.
Table 2.7

p  q | p → q | q → p | (p → q) ∧ (q → p) | p ↔ q
0  0 |   1   |   1   |         1         |   1
0  1 |   1   |   0   |         0         |   0
1  0 |   0   |   1   |         0         |   0
1  1 |   1   |   1   |         1         |   1
Examining Table 2.8, we find that negation, along with the connectives ∧ and ∨, are all
we need to replace the exclusive or connective, ⊻. In fact, we may even eliminate either ∧
or ∨. However, for the related applications we want to study later in the text, we shall need
both ∧ and ∨ as well as negation.
Table 2.8

p  q | p ⊻ q | p ∨ q | p ∧ q | ¬(p ∧ q) | (p ∨ q) ∧ ¬(p ∧ q)
0  0 |   0   |   0   |   0   |     1    |         0
0  1 |   1   |   1   |   0   |     1    |         1
1  0 |   1   |   1   |   0   |     1    |         1
1  1 |   0   |   1   |   1   |     0    |         0
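The replacement of the exclusive or recorded in Table 2.8 can be confirmed with a few lines of Python. This is a sketch with 0/1 truth values; both function names are hypothetical.

```python
from itertools import product

def xor_from_table(p, q):
    """Exclusive or as described in the text: true when exactly one of p, q is true."""
    return 1 if p + q == 1 else 0

def xor_rewritten(p, q):
    """(p ∨ q) ∧ ¬(p ∧ q), which uses only ∨, ∧, and negation."""
    return (p | q) & (1 - (p & q))

agree = all(xor_from_table(p, q) == xor_rewritten(p, q)
            for p, q in product((0, 1), repeat=2))
print(agree)
```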
We now use the idea of logical equivalence to examine some of the important properties
that hold for the algebra of propositions.
For all real numbers a, b, we know that −(a + b) = (−a) + (−b). Is there a comparable
result for primitive statements p, q?
EXAMPLE 2.8 In Table 2.9 we have constructed the truth tables for the statements ¬(p ∧ q), ¬p ∨ ¬q,
¬(p ∨ q), and ¬p ∧ ¬q, where p, q are primitive statements. Columns 4 and 7 reveal
that ¬(p ∧ q) ⇔ ¬p ∨ ¬q; columns 9 and 10 reveal that ¬(p ∨ q) ⇔ ¬p ∧ ¬q. These
results are known as DeMorgan’s Laws. They are similar to the familiar law for real numbers,
−(a + b) = (−a) + (−b),
already noted, which shows the negative of a sum to be equal to the sum of the negatives.
Here, however, a crucial difference emerges: The negation of the conjunction of two
primitive statements p, q results in the disjunction of their negations ¬p, ¬q, whereas
the negation of the disjunction of these same statements p, q is logically equivalent to the
conjunction of their negations ¬p, ¬q.
Table 2.9

p  q | p ∧ q | ¬(p ∧ q) | ¬p | ¬q | ¬p ∨ ¬q | p ∨ q | ¬(p ∨ q) | ¬p ∧ ¬q
0  0 |   0   |     1    |  1 |  1 |    1    |   0   |     1    |    1
0  1 |   0   |     1    |  1 |  0 |    1    |   1   |     0    |    0
1  0 |   0   |     1    |  0 |  1 |    1    |   1   |     0    |    0
1  1 |   1   |     0    |  0 |  0 |    0    |   1   |     0    |    0
Although p, q were primitive statements in the preceding example, we shall soon learn
that DeMorgan’s Laws hold for any two arbitrary statements.
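DeMorgan’s Laws, and the contrast with the arithmetic law, can be checked exhaustively. The Python sketch below assumes 0/1 truth values; the function names are illustrative. The second function tests the tempting but incorrect analogue ¬(p ∧ q) ⇔ ¬p ∧ ¬q, which would mirror −(a + b) = (−a) + (−b) exactly.

```python
from itertools import product

def neg(a):
    return 1 - a

def demorgan_holds():
    """Check both of DeMorgan's Laws on all four assignments to p, q."""
    for p, q in product((0, 1), repeat=2):
        if neg(p & q) != (neg(p) | neg(q)):   # ¬(p ∧ q) versus ¬p ∨ ¬q
            return False
        if neg(p | q) != (neg(p) & neg(q)):   # ¬(p ∨ q) versus ¬p ∧ ¬q
            return False
    return True

def naive_analogue_holds():
    """Check the incorrect rule ¬(p ∧ q) ⇔ ¬p ∧ ¬q, which keeps the connective unchanged."""
    return all(neg(p & q) == (neg(p) & neg(q)) for p, q in product((0, 1), repeat=2))

print(demorgan_holds(), naive_analogue_holds())
```

The naive rule already fails for p = 0, q = 1, where ¬(p ∧ q) is 1 but ¬p ∧ ¬q is 0; this is the "crucial difference" noted in Example 2.8.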
In the arithmetic of real numbers, the operations of addition and multiplication are both
involved in the principle called the Distributive Law of Multiplication over Addition: For
all real numbers a, b, c,
a × (b + c) = (a × b) + (a × c).
The next example shows that there is a similar law for primitive statements. There is also
a second related law (for primitive statements) that has no counterpart in the arithmetic of
real numbers.
EXAMPLE 2.9 Table 2.10 contains the truth tables for the statements p ∧ (q ∨ r), (p ∧ q) ∨ (p ∧ r),
p ∨ (q ∧ r), and (p ∨ q) ∧ (p ∨ r). From the table it follows that for all primitive statements
p, q, and r,
p ∧ (q ∨ r) ⇔ (p ∧ q) ∨ (p ∧ r)    The Distributive Law of ∧ over ∨
p ∨ (q ∧ r) ⇔ (p ∨ q) ∧ (p ∨ r)    The Distributive Law of ∨ over ∧
The second distributive law has no counterpart in the arithmetic of real numbers. That
is, it is not true for all real numbers a, b, and c that the following holds: a + (b × c) =
(a + b) × (a + c). For a = 2, b = 3, and c = 5, for instance, a + (b × c) = 17 but
(a + b) × (a + c) = 35.
Table 2.10

p  q  r | p ∧ (q ∨ r) | (p ∧ q) ∨ (p ∧ r) | p ∨ (q ∧ r) | (p ∨ q) ∧ (p ∨ r)
0  0  0 |      0      |         0         |      0      |         0
0  0  1 |      0      |         0         |      0      |         0
0  1  0 |      0      |         0         |      0      |         0
0  1  1 |      0      |         0         |      1      |         1
1  0  0 |      0      |         0         |      1      |         1
1  0  1 |      1      |         1         |      1      |         1
1  1  0 |      1      |         1         |      1      |         1
1  1  1 |      1      |         1         |      1      |         1
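Both distributive laws of Example 2.9, together with the arithmetic counterexample a = 2, b = 3, c = 5, can be verified with a short Python sketch. It assumes 0/1 truth values; distributive_laws_hold is an illustrative name.

```python
from itertools import product

def distributive_laws_hold():
    """Check both distributive laws on all eight assignments to p, q, r."""
    for p, q, r in product((0, 1), repeat=3):
        if (p & (q | r)) != ((p & q) | (p & r)):   # ∧ over ∨
            return False
        if (p | (q & r)) != ((p | q) & (p | r)):   # ∨ over ∧
            return False
    return True

# The arithmetic analogue of the second law fails for real numbers.
a, b, c = 2, 3, 5
lhs = a + (b * c)          # a + (b × c)
rhs = (a + b) * (a + c)    # (a + b) × (a + c)

print(distributive_laws_hold(), lhs, rhs)
```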
Before going any further, we note that, in general, if s₁, s₂ are statements and s₁ ↔ s₂
is a tautology, then s₁, s₂ must have the same corresponding truth values (that is, for each
assignment of truth values to the primitive statements in s₁ and s₂, s₁ is true if and only
if s₂ is true and s₁ is false if and only if s₂ is false) and s₁ ⇔ s₂. When s₁ and s₂ are
logically equivalent statements (that is, s₁ ⇔ s₂), then the compound statement s₁ ↔ s₂ is
a tautology. Under these circumstances it is also true that ¬s₁ ⇔ ¬s₂, and ¬s₁ ↔ ¬s₂ is
a tautology.
If s₁, s₂, and s₃ are statements where s₁ ⇔ s₂ and s₂ ⇔ s₃, then s₁ ⇔ s₃. When two
statements s₁ and s₂ are not logically equivalent, we may write s₁ ⇎ s₂ to designate this
situation.
Using the concepts of logical equivalence, tautology, and contradiction, we state the
following list of laws for the algebra of propositions.
The Laws of Logic
For any primitive statements p, q, r, any tautology T₀, and any contradiction F₀,
1) ¬¬p ⇔ p                                Law of Double Negation
2) ¬(p ∨ q) ⇔ ¬p ∧ ¬q                     DeMorgan’s Laws
   ¬(p ∧ q) ⇔ ¬p ∨ ¬q
3) p ∨ q ⇔ q ∨ p                          Commutative Laws
   p ∧ q ⇔ q ∧ p
4) p ∨ (q ∨ r) ⇔ (p ∨ q) ∨ r†             Associative Laws
   p ∧ (q ∧ r) ⇔ (p ∧ q) ∧ r
5) p ∨ (q ∧ r) ⇔ (p ∨ q) ∧ (p ∨ r)        Distributive Laws
   p ∧ (q ∨ r) ⇔ (p ∧ q) ∨ (p ∧ r)
6) p ∨ p ⇔ p                              Idempotent Laws
   p ∧ p ⇔ p
7) p ∨ F₀ ⇔ p                             Identity Laws
   p ∧ T₀ ⇔ p
8) p ∨ ¬p ⇔ T₀                            Inverse Laws
   p ∧ ¬p ⇔ F₀
9) p ∨ T₀ ⇔ T₀                            Domination Laws
   p ∧ F₀ ⇔ F₀
10) p ∨ (p ∧ q) ⇔ p                       Absorption Laws
    p ∧ (p ∨ q) ⇔ p
†We note that because of the Associative Laws, there is no ambiguity in statements of the form p ∨ q ∨ r or
p ∧ q ∧ r.
We now turn our attention to proving all of these properties. In so doing we realize that
we could simply construct the truth tables and compare the results for the corresponding
truth values in each case —as we did in Examples 2.8 and 2.9. However, before we start
writing, let us take one more look at this list of 19 laws, which, aside from the Law of
Double Negation, fall naturally into pairs. This pairing idea will help us after we examine
the following concept.
Definition 2.3 Let s be a statement. If s contains no logical connectives other than ∧ and ∨, then the dual
of s, denoted s^d, is the statement obtained from s by replacing each occurrence of ∧ and ∨
by ∨ and ∧, respectively, and each occurrence of T₀ and F₀ by F₀ and T₀, respectively.
If p is any primitive statement, then p^d is the same as p; that is, the dual of a primitive
statement is simply the same primitive statement. And (¬p)^d is the same as ¬p. The
statements p ∨ ¬p and p ∧ ¬p are duals of each other whenever p is primitive, and so
are the statements p ∨ T₀ and p ∧ F₀.
Given the primitive statements p, q, r and the compound statement
s: (p ∧ ¬q) ∨ (r ∧ T₀),
we find that the dual of s is
s^d: (p ∨ ¬q) ∧ (r ∨ F₀).
(Note that ¬q is unchanged as we go from s to s^d.)
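The construction of a dual is purely syntactic, so it can be sketched as a recursive rewrite. The Python below is an illustration under several assumptions that are not in the text: statements are encoded as nested tuples such as ("and", x, y), ("or", x, y), and ("not", x); primitive statements are strings; the strings "T0" and "F0" stand for T₀ and F₀; and a negation is kept in place while the dual is taken of what it negates (for a negated primitive this leaves ¬p unchanged, as stated above).

```python
def dual(s):
    """Return the dual of a statement built from primitives, T0, F0, not, and, or."""
    if s == "T0":
        return "F0"
    if s == "F0":
        return "T0"
    if isinstance(s, str):
        return s                                  # a primitive statement is its own dual
    op = s[0]
    if op == "not":
        return ("not", dual(s[1]))                # negation symbol itself is not replaced
    if op == "and":
        return ("or", dual(s[1]), dual(s[2]))     # each ∧ becomes ∨
    if op == "or":
        return ("and", dual(s[1]), dual(s[2]))    # each ∨ becomes ∧
    raise ValueError("unsupported connective: " + str(op))

# s: (p ∧ ¬q) ∨ (r ∧ T0)
s = ("or", ("and", "p", ("not", "q")), ("and", "r", "T0"))
print(dual(s))
```

Applying dual twice returns the original statement, which matches the remark that the laws "fall naturally into pairs."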
We now state and use a theorem without proving it. However, in Chapter 15 we shall
justify the result that appears here.
THEOREM 2.1 The Principle of Duality. Let s and t be statements that contain no logical connectives other
than ∧ and ∨. If s ⇔ t, then s^d ⇔ t^d.
As a result, laws 2 through 10 in our list can be established by proving one of the laws
in each pair and then invoking this principle.
We also find that it is possible to derive many other logical equivalences. For example,
if q, r, s are primitive statements, the results in columns 5 and 7 of Table 2.11 show us that
[(r ∧ s) → q] ⇔ ¬(r ∧ s) ∨ q
or that [(r ∧ s) → q] ↔ [¬(r ∧ s) ∨ q] is a tautology. However, instead of always constructing
more (and, unfortunately, larger) truth tables it might be a good idea to recall from
Example 2.7 that for primitive statements p, q, the compound statement
(p → q) ↔ (¬p ∨ q)
Table 2.11

q  r  s | r ∧ s | (r ∧ s) → q | ¬(r ∧ s) | ¬(r ∧ s) ∨ q
0  0  0 |   0   |      1      |     1    |       1
0  0  1 |   0   |      1      |     1    |       1
0  1  0 |   0   |      1      |     1    |       1
0  1  1 |   1   |      0      |     0    |       0
1  0  0 |   0   |      1      |     1    |       1
1  0  1 |   0   |      1      |     1    |       1
1  1  0 |   0   |      1      |     1    |       1
1  1  1 |   1   |      1      |     0    |       1
is a tautology. If we were to replace each occurrence of this primitive statement p by the
compound statement r ∧ s, then we would obtain the earlier tautology
[(r ∧ s) → q] ↔ [¬(r ∧ s) ∨ q].
What has happened here illustrates the first of the following two substitution rules:
1) Suppose that the compound statement P is a tautology. If p is a primitive statement
that appears in P and we replace each occurrence of p by the same statement q, then
the resulting compound statement P₁ is also a tautology.
2) Let P be a compound statement where p is an arbitrary statement that appears in
P, and let q be a statement such that q ⇔ p. Suppose that in P we replace one or
more occurrences of p by q. Then this replacement yields the compound statement
P₁. Under these circumstances P₁ ⇔ P.
These rules are further illustrated in the following two examples.
EXAMPLE 2.10 a) From the first of DeMorgan’s Laws we know that for all primitive statements p, q,
the compound statement
P: ¬(p ∨ q) ↔ (¬p ∧ ¬q)
is a tautology. When we replace each occurrence of p by r ∧ s, it follows from the
first substitution rule that
P₁: ¬[(r ∧ s) ∨ q] ↔ [¬(r ∧ s) ∧ ¬q]
is also a tautology. Extending this result one step further, we may replace each occurrence
of q by t → u. The same substitution rule now yields the tautology
P₂: ¬[(r ∧ s) ∨ (t → u)] ↔ [¬(r ∧ s) ∧ ¬(t → u)],
and hence, by the remarks following shortly after Example 2.9, the logical equivalence
¬[(r ∧ s) ∨ (t → u)] ⇔ [¬(r ∧ s) ∧ ¬(t → u)].
b) For primitive statements p, q, we learn from the last column of Table 2.12 that the
compound statement [p ∧ (p → q)] → q is a tautology. Consequently, if r, s, t, u are
any statements, then by the first substitution rule we obtain the new tautology
[(r → s) ∧ [(r → s) → (¬t ∨ u)]] → (¬t ∨ u)
when we replace each occurrence of p by r → s and each occurrence of q by ¬t ∨ u.
Table 2.12

p  q | p → q | p ∧ (p → q) | [p ∧ (p → q)] → q
0  0 |   1   |      0      |         1
0  1 |   1   |      0      |         1
1  0 |   0   |      0      |         1
1  1 |   1   |      1      |         1
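The first substitution rule, as used in part (b) of Example 2.10, can be illustrated by brute force. The Python sketch below assumes 0/1 truth values and, for the check, treats r, s, t, u as primitive statements; the function names base and substituted are hypothetical.

```python
from itertools import product

def implies(a, b):
    return 0 if (a == 1 and b == 0) else 1

def base(p, q):
    """[p ∧ (p → q)] → q, the tautology of Table 2.12."""
    return implies(p & implies(p, q), q)

def substituted(r, s, t, u):
    """Replace each p by r → s and each q by ¬t ∨ u in the statement above."""
    return base(implies(r, s), (1 - t) | u)

base_is_tautology = all(base(p, q) == 1 for p, q in product((0, 1), repeat=2))
substituted_is_tautology = all(substituted(*values) == 1
                               for values in product((0, 1), repeat=4))
print(base_is_tautology, substituted_is_tautology)
```

Writing the substitution as a call to base makes the reason for the rule visible: whatever values r → s and ¬t ∨ u take, they are just one of the four rows already checked in Table 2.12.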
EXAMPLE 2.11 a) For an application of the second substitution rule, let P denote the compound statement
(p → q) → r. Because (p → q) ⇔ ¬p ∨ q (as shown in Example 2.7 and Table 2.6),
if P₁ denotes the compound statement (¬p ∨ q) → r, then P₁ ⇔ P. (We also find
that [(p → q) → r] ↔ [(¬p ∨ q) → r] is a tautology.)
b) Now let P represent the compound statement (actually a tautology) p → (p ∨ q).
Since ¬¬p ⇔ p, the compound statement P₁: p → (¬¬p ∨ q) is derived from P
by replacing only the second occurrence (but not the first occurrence) of p by ¬¬p.
The second substitution rule still implies that P₁ ⇔ P. [Note that P₂: ¬¬p →
(¬¬p ∨ q), derived by replacing both occurrences of p by ¬¬p, is also logically
equivalent to P.]
Our next example demonstrates how we can use the idea of logical equivalence together
with the laws of logic and the substitution rules.
EXAMPLE 2.12 Negate and simplify the compound statement $(p \lor q) \to r$.
We organize our explanation as follows:
1) $(p \lor q) \to r \Leftrightarrow \neg(p \lor q) \lor r$ [by the first substitution rule because $(s \to t) \leftrightarrow (\neg s \lor t)$ is a tautology for primitive statements s, t].
2) Negating the statements in step (1), we have $\neg[(p \lor q) \to r] \Leftrightarrow \neg[\neg(p \lor q) \lor r]$.
3) From the first of DeMorgan’s Laws and the first substitution rule, $\neg[\neg(p \lor q) \lor r] \Leftrightarrow \neg\neg(p \lor q) \land \neg r$.
4) The Law of Double Negation and the second substitution rule now give us $\neg\neg(p \lor q) \land \neg r \Leftrightarrow (p \lor q) \land \neg r$.
From steps (1) through (4) we have $\neg[(p \lor q) \to r] \Leftrightarrow (p \lor q) \land \neg r$.
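As a check on the result of Example 2.12, the sketch below compares truth values over all eight assignments. It is an illustration only: `equivalent`, `negated`, `simplified`, and `tempting` are names we introduce here, and `tempting` encodes a plausible-looking but incorrect "negation" for contrast.

```python
from itertools import product

def implies(a, b):
    """Truth value of a -> b."""
    return (not a) or b

def equivalent(f, g, n):
    """True if f and g agree on all 2**n truth assignments."""
    return all(bool(f(*v)) == bool(g(*v))
               for v in product([False, True], repeat=n))

negated = lambda p, q, r: not implies(p or q, r)     # not[(p or q) -> r]
simplified = lambda p, q, r: (p or q) and (not r)    # (p or q) and not r
tempting = lambda p, q, r: implies(p or q, not r)    # (p or q) -> not r (wrong)

print(equivalent(negated, simplified, 3))  # True
print(equivalent(negated, tempting, 3))    # False
```

The second comparison fails already when p, q, r are all false: the implication $(p \lor q) \to \neg r$ is then true, whereas the negation of a true implication is false.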
When we wanted to write the negation of an implication, as in Example 2.12, we found
that the concept of logical equivalence played a key role — in conjunction with the laws of
logic and the substitution rules. This idea is important enough to warrant a second look.
EXAMPLE 2.13 Let p, q denote the primitive statements
    p: Joan goes to Lake George.    q: Mary pays for Joan’s shopping spree.
and consider the implication
    $p \to q$: If Joan goes to Lake George, then Mary will pay for Joan’s shopping spree.
Here we want to write the negation of $p \to q$ in a way other than simply $\neg(p \to q)$. We want to avoid writing the negation as “It is not the case that if Joan goes to Lake George, then Mary will pay for Joan’s shopping spree.”
To accomplish this we consider the following. Since $p \to q \Leftrightarrow \neg p \lor q$, it follows that $\neg(p \to q) \Leftrightarrow \neg(\neg p \lor q)$. Then by DeMorgan’s Law we have $\neg(\neg p \lor q) \Leftrightarrow \neg\neg p \land \neg q$, and from the Law of Double Negation and the second substitution rule it follows that $\neg\neg p \land \neg q \Leftrightarrow p \land \neg q$. Consequently,
    $\neg(p \to q) \Leftrightarrow \neg(\neg p \lor q) \Leftrightarrow \neg\neg p \land \neg q \Leftrightarrow p \land \neg q$,
and we may write the negation of $p \to q$ in this case as
    $\neg(p \to q)$: Joan goes to Lake George, but Mary does not pay for Joan’s shopping spree.
(Note: The negation of an if-then statement does not begin with the word if. It is not another
implication.)
EXAMPLE 2.14 In Definition 2.3 the dual $s^d$ of a statement s was defined only for statements involving negation and the basic connectives $\land$ and $\lor$. How does one determine the dual of a statement such as s: $p \to q$, where p, q are primitive?
Because $(p \to q) \Leftrightarrow \neg p \lor q$, $s^d$ is logically equivalent to the statement $(\neg p \lor q)^d$, which is $\neg p \land q$.
The implication $p \to q$ and certain statements related to it are now examined in the following example.
EXAMPLE 2.15 Table 2.13 gives the truth tables for the statements $p \to q$, $\neg q \to \neg p$, $q \to p$, and $\neg p \to \neg q$. The third and fourth columns of the table reveal that
    $(p \to q) \Leftrightarrow (\neg q \to \neg p)$.
Table 2.13
 p | q | $p \to q$ | $\neg q \to \neg p$ | $q \to p$ | $\neg p \to \neg q$
---+---+-----------+---------------------+-----------+--------------------
 0 | 0 |     1     |          1          |     1     |          1
 0 | 1 |     1     |          1          |     0     |          0
 1 | 0 |     0     |          0          |     1     |          1
 1 | 1 |     1     |          1          |     1     |          1
The statement $\neg q \to \neg p$ is called the contrapositive of the implication $p \to q$. Columns 5 and 6 of the table show that
    $(q \to p) \Leftrightarrow (\neg p \to \neg q)$.
The statement $q \to p$ is called the converse of $p \to q$; $\neg p \to \neg q$ is called the inverse of $p \to q$. We also see from Table 2.13 that
    $(p \to q) \not\Leftrightarrow (q \to p)$ and $(\neg p \to \neg q) \not\Leftrightarrow (\neg q \to \neg p)$.
Consequently, we must keep the implication and its converse straight. The fact that a certain implication $p \to q$ is true (in particular, as in row 2 of the table) does not require that the converse $q \to p$ also be true. However, it does necessitate the truth of the contrapositive $\neg q \to \neg p$.
Let us consider a specific example where p, g represent the statements
p: Jeff is concerned about his cholesterol (HDL and LDL) levels.
q: Jeff walks at least two miles three times a week.
Then we obtain
• (The implication: $p \to q$). If Jeff is concerned about his cholesterol levels, then he will walk at least two miles three times a week.
• (The contrapositive: $\neg q \to \neg p$). If Jeff does not walk at least two miles three times a week, then he is not concerned about his cholesterol levels.
• (The converse: $q \to p$). If Jeff walks at least two miles three times a week, then he is concerned about his cholesterol levels.
• (The inverse: $\neg p \to \neg q$). If Jeff is not concerned about his cholesterol levels, then he will not walk at least two miles three times a week.
If p is true and q is false, then the implication $p \to q$ and the contrapositive $\neg q \to \neg p$ are false, while the converse $q \to p$ and the inverse $\neg p \to \neg q$ are true. For the case where p is false and q is true, the implication $p \to q$ and the contrapositive $\neg q \to \neg p$ are now true, while the converse $q \to p$ and the inverse $\neg p \to \neg q$ are false. When p, q are both true or both false, then the implication is true, as are the contrapositive, converse, and inverse.
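The four related statements can be tabulated directly. The sketch below is ours, not the text's: it rebuilds the rows of Table 2.13 in Python, with `implies` a helper name we assume, so that the agreement of implication with contrapositive (and of converse with inverse) can be read off column by column.

```python
from itertools import product

def implies(a, b):
    """Truth value of a -> b."""
    return (not a) or b

rows = []
for p, q in product([0, 1], repeat=2):
    rows.append((p, q,
                 int(implies(p, q)),           # implication      p -> q
                 int(implies(not q, not p)),   # contrapositive  ~q -> ~p
                 int(implies(q, p)),           # converse         q -> p
                 int(implies(not p, not q))))  # inverse         ~p -> ~q

for row in rows:
    print(*row)
```

Columns 3 and 4 of the printed rows coincide, as do columns 5 and 6, while columns 3 and 5 differ in the rows where exactly one of p, q is true.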
We turn now to two examples involving the simplification of compound statements. For
simplicity, we shall list the major laws of logic being used, but we shall not mention any
applications of our two substitution rules.
EXAMPLE 2.16 For primitive statements p, q, is there any simpler way to express the compound statement $(p \lor q) \land \neg(\neg p \land q)$; that is, can we find a simpler statement that is logically equivalent to the one given?
Here one finds that
    $(p \lor q) \land \neg(\neg p \land q)$                        Reasons
    $\Leftrightarrow (p \lor q) \land (\neg\neg p \lor \neg q)$    DeMorgan’s Law
    $\Leftrightarrow (p \lor q) \land (p \lor \neg q)$             Law of Double Negation
    $\Leftrightarrow p \lor (q \land \neg q)$                      Distributive Law of $\lor$ over $\land$
    $\Leftrightarrow p \lor F_0$                                   Inverse Law
    $\Leftrightarrow p$                                            Identity Law
Consequently, we see that
    $(p \lor q) \land \neg(\neg p \land q) \Leftrightarrow p$,
so we can express the given compound statement by the simpler logically equivalent statement p.
EXAMPLE 2.17 Consider the compound statement
    $\neg[\neg[(p \lor q) \land r] \lor \neg q]$,
where p, q, r are primitive statements. This statement contains four occurrences of primitive statements, three negation symbols, and three connectives.
From the laws of logic it follows that
    $\neg[\neg[(p \lor q) \land r] \lor \neg q]$                              Reasons
    $\Leftrightarrow \neg\neg[(p \lor q) \land r] \land \neg\neg q$           DeMorgan’s Law
    $\Leftrightarrow [(p \lor q) \land r] \land q$                            Law of Double Negation
    $\Leftrightarrow (p \lor q) \land (r \land q)$                            Associative Law of $\land$
    $\Leftrightarrow (p \lor q) \land (q \land r)$                            Commutative Law of $\land$
    $\Leftrightarrow [(p \lor q) \land q] \land r$                            Associative Law of $\land$
    $\Leftrightarrow q \land r$                                               Absorption Law (as well as the Commutative Laws for $\land$ and $\lor$)
Consequently, the original statement
    $\neg[\neg[(p \lor q) \land r] \lor \neg q]$
is logically equivalent to the much simpler statement
    $q \land r$,
where we find only two primitive statements, no negation symbols, and only one connective.
Note further that from Example 2.7 we have
    $\neg[[(p \lor q) \land r] \to \neg q] \Leftrightarrow \neg[\neg[(p \lor q) \land r] \lor \neg q]$,
so it follows that
    $\neg[[(p \lor q) \land r] \to \neg q] \Leftrightarrow q \land r$.
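A chain of equivalences such as the one in Example 2.17 can be audited line by line. The following Python sketch is our own illustration (the list name `chain` and the helper `equivalent` are assumptions): it encodes each line of the derivation and confirms that consecutive lines have the same truth table.

```python
from itertools import product

def equivalent(f, g, n=3):
    """True if f and g agree on all 2**n truth assignments."""
    return all(bool(f(*v)) == bool(g(*v))
               for v in product([False, True], repeat=n))

# Successive lines of the derivation in Example 2.17.
chain = [
    lambda p, q, r: not ((not ((p or q) and r)) or (not q)),
    lambda p, q, r: (not (not ((p or q) and r))) and (not (not q)),
    lambda p, q, r: ((p or q) and r) and q,
    lambda p, q, r: (p or q) and (r and q),
    lambda p, q, r: (p or q) and (q and r),
    lambda p, q, r: ((p or q) and q) and r,
    lambda p, q, r: q and r,
]

# Compare each line with the next one.
steps_ok = [equivalent(a, b) for a, b in zip(chain, chain[1:])]
print(steps_ok)
```

Such a check confirms only that no step changes the truth table; it does not confirm that the law cited for a step is the one actually used.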
We close this section with an application on how the ideas in Examples 2.16 and 2.17 can
be used in simplifying switching networks.
EXAMPLE 2.18 A switching network is made up of wires and switches connecting two terminals $T_1$ and $T_2$. In such a network, each switch is either open (0), so that no current flows through it, or closed (1), so that current does flow through it.
In Fig. 2.1(a) we have a network with one switch. Each of parts (b) and (c) contains two (independent) switches.
[Figure 2.1: three networks between terminals $T_1$ and $T_2$. (a) a single switch p; (b) switches p and q in parallel; (c) switches p and q in series.]
For the network in part (b), current flows from $T_1$ to $T_2$ if either of the switches p, q is closed. We call this a parallel network and represent it by $p \lor q$. The network in part (c) requires that each of the switches p, q be closed in order for current to flow from $T_1$ to $T_2$. Here the switches are in series; this network is represented by $p \land q$.
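The switches in a network need not act independently of each other. Consider the network shown in Fig. 2.2(a). Here the switches labeled t and $\neg t$ are not independent. We have coupled these two switches so that t is open (closed) if and only if $\neg t$ is simultaneously closed (open). The same is true for the switches at q, $\neg q$. (Also, for example, the three switches labeled p are not independent.)
[Figure 2.2: (a) the original network with nine switches; (b) the simplified network with four switches.]
This network is represented by the statement $(p \lor q \lor r) \land (p \lor t \lor \neg q) \land (p \lor \neg t \lor r)$. Using the laws of logic, we may simplify this statement as follows.
    $(p \lor q \lor r) \land (p \lor t \lor \neg q) \land (p \lor \neg t \lor r)$                                     Reasons
    $\Leftrightarrow p \lor [(q \lor r) \land (t \lor \neg q) \land (\neg t \lor r)]$                                 Distributive Law of $\lor$ over $\land$
    $\Leftrightarrow p \lor [(q \lor r) \land (\neg t \lor r) \land (t \lor \neg q)]$                                 Commutative Law of $\land$
    $\Leftrightarrow p \lor [((q \land \neg t) \lor r) \land (t \lor \neg q)]$                                        Distributive Law of $\lor$ over $\land$
    $\Leftrightarrow p \lor [((q \land \neg t) \lor r) \land (\neg\neg t \lor \neg q)]$                               Law of Double Negation
    $\Leftrightarrow p \lor [((q \land \neg t) \lor r) \land \neg(\neg t \land q)]$                                   DeMorgan’s Law
    $\Leftrightarrow p \lor [\neg(\neg t \land q) \land ((\neg t \land q) \lor r)]$                                   Commutative Law of $\land$ (twice)
    $\Leftrightarrow p \lor [(\neg(\neg t \land q) \land (\neg t \land q)) \lor (\neg(\neg t \land q) \land r)]$      Distributive Law of $\land$ over $\lor$
    $\Leftrightarrow p \lor [F_0 \lor (\neg(\neg t \land q) \land r)]$                                                $\neg s \land s \Leftrightarrow F_0$, for any statement s
    $\Leftrightarrow p \lor [\neg(\neg t \land q) \land r]$                                                           $F_0$ is the identity for $\lor$
    $\Leftrightarrow p \lor [r \land \neg(\neg t \land q)]$                                                           Commutative Law of $\land$
    $\Leftrightarrow p \lor [r \land (t \lor \neg q)]$                                                                DeMorgan’s Law and the Law of Double Negation
Hence $(p \lor q \lor r) \land (p \lor t \lor \neg q) \land (p \lor \neg t \lor r) \Leftrightarrow p \lor [r \land (t \lor \neg q)]$, and the network shown in Fig. 2.2(b) is equivalent to the original network in the sense that current flows from $T_1$ to $T_2$ in network (a) exactly when it does so in network (b). But network (b) has only four switches, five fewer than network (a).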
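The equivalence of the two networks can be confirmed by trying all sixteen switch settings. In this sketch (ours, with the assumed function names `network_a` and `network_b`), a closed switch is `True`, a series connection is `and`, and a parallel connection is `or`.

```python
from itertools import product

def network_a(p, q, r, t):
    # Fig. 2.2(a): three parallel groups wired in series, nine switches in all.
    return (p or q or r) and (p or t or (not q)) and (p or (not t) or r)

def network_b(p, q, r, t):
    # Fig. 2.2(b): the simplified network with four switches.
    return p or (r and (t or (not q)))

# Current flows in (a) exactly when it flows in (b), for every switch setting.
same = all(bool(network_a(*s)) == bool(network_b(*s))
           for s in product([False, True], repeat=4))
print(same)  # True
```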
EXERCISES 2.2
1. Let p, q, r denote primitive statements.
   a) Use truth tables to verify the following logical equivalences.
      i) $p \to (q \land r) \Leftrightarrow (p \to q) \land (p \to r)$
      ii) $[(p \lor q) \to r] \Leftrightarrow [(p \to r) \land (q \to r)]$
      iii) $[p \to (q \lor r)] \Leftrightarrow [\neg r \to (p \to q)]$
   b) Use the substitution rules to show that
      $[p \to (q \lor r)] \Leftrightarrow [(p \land \neg q) \to r]$.
2. Verify the first Absorption Law by means of a truth table.
3. Use the substitution rules to verify that each of the following is a tautology. (Here p, q, and r are primitive statements.)
   a) $[p \lor (q \land r)] \lor \neg[p \lor (q \land r)]$
   b) $[(p \lor q) \to r] \leftrightarrow [\neg r \to \neg(p \lor q)]$
4. For primitive statements p, q, r, and s, simplify the compound statement
      $[[[(p \land q) \land r] \lor [(p \land q) \land \neg r]] \lor \neg q] \to s$.
5. Negate and express each of the following statements in smooth English.
   a) Kelsey will get a good education if she puts her studies before her interest in cheerleading.
   b) Norma is doing her homework, and Karen is practicing her piano lessons.
   c) If Harold passes his C++ course and finishes his data structures project, then he will graduate at the end of the semester.
6. Negate each of the following and simplify the resulting statement.
   a) $p \land (q \lor r) \land (\neg p \lor \neg q \lor r)$
   b) $(p \land q) \to r$
   c) $p \to (\neg q \land r)$
   d) $p \lor q \lor (\neg p \land \neg q \land r)$
7. a) If p, q are primitive statements, prove that
      $(\neg p \lor q) \land (p \land (p \land q)) \Leftrightarrow (p \land q)$.
   b) Write the dual of the logical equivalence in part (a).
8. Write the dual for (a) $q \to p$, (b) $p \to (q \land r)$, (c) $p \leftrightarrow q$, and (d) $p \veebar q$, where p, q, and r are primitive statements.
9. Write the converse, inverse, and contrapositive of each of the following implications. For each implication, determine its truth value as well as the truth values of its corresponding converse, inverse, and contrapositive.
   a) If $0 + 0 = 0$, then $1 + 1 = 1$.
   b) If $-1 < 3$ and $3 + 7 = 10$, then $\sin(3\pi/2) = -1$.
10. Determine whether each of the following is true or false. Here p, q are arbitrary statements.
   a) An equivalent way to express the converse of “p is sufficient for q” is “p is necessary for q.”
   b) An equivalent way to express the inverse of “p is necessary for q” is “$\neg q$ is sufficient for $\neg p$.”
   c) An equivalent way to express the contrapositive of “p is necessary for q” is “$\neg q$ is necessary for $\neg p$.”
11. Let p, q, and r denote primitive statements. Find a form of the contrapositive of $p \to (q \to r)$ with (a) only one occurrence of the connective $\to$; (b) no occurrences of the connective $\to$.
12. Show that for primitive statements p, q,
      $p \veebar q \Leftrightarrow [(p \land \neg q) \lor (\neg p \land q)] \Leftrightarrow \neg(p \leftrightarrow q)$.
13. Verify that $[(p \leftrightarrow q) \land (q \leftrightarrow r) \land (r \leftrightarrow p)] \Leftrightarrow [(p \to q) \land (q \to r) \land (r \to p)]$, for primitive statements p, q, and r.
14. For primitive statements p, q,
   a) verify that $p \to [q \to (p \land q)]$ is a tautology.
   b) verify that $(p \lor q) \to [q \to q]$ is a tautology by using the result from part (a) along with the substitution rules and the laws of logic.
   c) is $(p \lor q) \to [q \to (p \land q)]$ a tautology?
15. Define the connective “Nand” or “Not ... and ...” by $(p \uparrow q) \Leftrightarrow \neg(p \land q)$, for any statements p, q. Represent the following using only this connective.
   a) $\neg p$   b) $p \lor q$   c) $p \land q$   d) $p \to q$   e) $p \leftrightarrow q$
16. The connective “Nor” or “Not ... or ...” is defined for any statements p, q by $(p \downarrow q) \Leftrightarrow \neg(p \lor q)$. Represent the statements in parts (a) through (e) of Exercise 15, using only this connective.
17. For any statements p, q, prove that
   a) $\neg(p \downarrow q) \Leftrightarrow (\neg p \uparrow \neg q)$
   b) $\neg(p \uparrow q) \Leftrightarrow (\neg p \downarrow \neg q)$
18. Give the reasons for each step in the following simplifications of compound statements.
   a) $[(p \lor q) \land (p \lor \neg q)] \lor q$                     Reasons
      $\Leftrightarrow [p \lor (q \land \neg q)] \lor q$
      $\Leftrightarrow (p \lor F_0) \lor q$
      $\Leftrightarrow p \lor q$
   b) $(p \to q) \land [\neg q \land (r \lor \neg q)]$                Reasons
      $\Leftrightarrow (p \to q) \land \neg q$
      $\Leftrightarrow (\neg p \lor q) \land \neg q$
      $\Leftrightarrow \neg q \land (\neg p \lor q)$
      $\Leftrightarrow (\neg q \land \neg p) \lor (\neg q \land q)$
      $\Leftrightarrow (\neg q \land \neg p) \lor F_0$
      $\Leftrightarrow \neg q \land \neg p$
      $\Leftrightarrow \neg(q \lor p)$
19. Provide the steps and reasons, as in Exercise 18, to establish the following logical equivalences.
   a) $p \lor [p \land (p \lor q)] \Leftrightarrow p$
   b) $p \lor q \lor (\neg p \land \neg q \land r) \Leftrightarrow p \lor q \lor r$
   c) $[(\neg p \lor \neg q) \to (p \land q \land r)] \Leftrightarrow p \land q$
20. Simplify each of the networks shown in Fig. 2.3.
[Figure 2.3: switching networks for Exercise 20; the diagrams are not reproduced here.]
2.3 Logical Implication: Rules of Inference
At the end of Section 2.1 we mentioned the notion of a valid argument. Now we will begin
a formal study of what we shall mean by an argument and when such an argument is valid.
This in turn will help us when we investigate how to prove theorems throughout the text.
We start by considering the general form of an argument, one we wish to show is valid.
So let us consider the implication
    $(p_1 \land p_2 \land p_3 \land \cdots \land p_n) \to q$.
Here n is a positive integer, the statements $p_1, p_2, p_3, \ldots, p_n$ are called the premises of the argument, and the statement q is the conclusion for the argument.
The preceding argument is called valid if whenever each of the premises $p_1, p_2, p_3, \ldots, p_n$ is true, then the conclusion q is likewise true. [Note that if any one of $p_1, p_2, p_3, \ldots, p_n$ is false, then the hypothesis $p_1 \land p_2 \land p_3 \land \cdots \land p_n$ is false and the implication $(p_1 \land p_2 \land p_3 \land \cdots \land p_n) \to q$ is automatically true, regardless of the truth value of q.] Consequently, one way to establish the validity of a given argument is to show that the statement $(p_1 \land p_2 \land p_3 \land \cdots \land p_n) \to q$ is a tautology.
The following examples illustrate this particular approach.
EXAMPLE 2.19 Let p, q, r denote the primitive statements given as
    p: Roger studies.
    q: Roger plays racketball.
    r: Roger passes discrete mathematics.
Now let $p_1$, $p_2$, $p_3$ denote the premises
    $p_1$: If Roger studies, then he will pass discrete mathematics.
    $p_2$: If Roger doesn’t play racketball, then he’ll study.
    $p_3$: Roger failed discrete mathematics.
We want to determine whether the argument
    $(p_1 \land p_2 \land p_3) \to q$
is valid. To do so, we rewrite $p_1$, $p_2$, $p_3$ as
    $p_1$: $p \to r$    $p_2$: $\neg q \to p$    $p_3$: $\neg r$
and examine the truth table for the implication
    $[(p \to r) \land (\neg q \to p) \land \neg r] \to q$
given in Table 2.14. Because the final column in Table 2.14 contains all 1’s, the implication is a tautology. Hence we can say that $(p_1 \land p_2 \land p_3) \to q$ is a valid argument.
Table 2.14
 p | q | r | $p_1$: $p \to r$ | $p_2$: $\neg q \to p$ | $p_3$: $\neg r$ | $(p_1 \land p_2 \land p_3) \to q$, i.e. $[(p \to r) \land (\neg q \to p) \land \neg r] \to q$
---+---+---+------------------+-----------------------+-----------------+-----------------------------------
 0 | 0 | 0 |        1         |           0           |        1        |                 1
 0 | 0 | 1 |        1         |           0           |        0        |                 1
 0 | 1 | 0 |        1         |           1           |        1        |                 1
 0 | 1 | 1 |        1         |           1           |        0        |                 1
 1 | 0 | 0 |        0         |           1           |        1        |                 1
 1 | 0 | 1 |        1         |           1           |        0        |                 1
 1 | 1 | 0 |        0         |           1           |        1        |                 1
 1 | 1 | 1 |        1         |           1           |        0        |                 1
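The truth-table test for validity used in Example 2.19 is easy to automate. The sketch below is an illustration under our own naming (`is_valid` and `premises` are assumptions, not notation from the text): an argument is declared valid when the implication from the conjunction of the premises to the conclusion is true in every row.

```python
from itertools import product

def implies(a, b):
    """Truth value of a -> b."""
    return (not a) or b

def is_valid(premises, conclusion, n):
    """Valid iff (p1 and ... and pk) -> q is true in every one of the 2**n rows."""
    return all(
        implies(all(prem(*row) for prem in premises), conclusion(*row))
        for row in product([False, True], repeat=n)
    )

premises = [
    lambda p, q, r: implies(p, r),      # p1: p -> r
    lambda p, q, r: implies(not q, p),  # p2: not q -> p
    lambda p, q, r: not r,              # p3: not r
]
conclusion = lambda p, q, r: q

print(is_valid(premises, conclusion, 3))      # True
print(is_valid(premises[:2], conclusion, 3))  # False
```

The second call drops the premise $\neg r$; the assignment p true, q false, r true then satisfies the two remaining premises while q is false, so the shortened argument is not valid.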
EXAMPLE 2.20 Let us now consider the truth table in Table 2.15. The results in the last column of this table show that for any primitive statements p, r, and s, the implication
    $[p \land ((p \land r) \to s)] \to (r \to s)$
is a tautology.
Table 2.15
 p | r | s | $p \land r$ | $p_2$: $(p \land r) \to s$ | q: $r \to s$ | $(p_1 \land p_2) \to q$, i.e. $[p \land ((p \land r) \to s)] \to (r \to s)$
---+---+---+-------------+----------------------------+--------------+-------------------------
 0 | 0 | 0 |      0      |             1              |      1       |            1
 0 | 0 | 1 |      0      |             1              |      1       |            1
 0 | 1 | 0 |      0      |             1              |      0       |            1
 0 | 1 | 1 |      0      |             1              |      1       |            1
 1 | 0 | 0 |      0      |             1              |      1       |            1
 1 | 0 | 1 |      0      |             1              |      1       |            1
 1 | 1 | 0 |      1      |             0              |      0       |            1
 1 | 1 | 1 |      1      |             1              |      1       |            1
(The first column, p, is also the premise $p_1$.)
Consequently, for premises
    $p_1$: $p$    $p_2$: $(p \land r) \to s$
and conclusion q: $(r \to s)$, we know that $(p_1 \land p_2) \to q$ is a valid argument, and we may say that the truth of the conclusion q is deduced or inferred from the truth of the premises $p_1$, $p_2$.
The idea presented in the preceding two examples leads to the following.
Definition 2.4 If p, q are arbitrary statements such that $p \to q$ is a tautology, then we say that p logically implies q and we write $p \Rightarrow q$ to denote this situation.
When p, q are statements and $p \Rightarrow q$, the implication $p \to q$ is a tautology and we refer to $p \to q$ as a logical implication. Note that we can avoid dealing with the idea of a tautology here by saying that $p \Rightarrow q$ (that is, p logically implies q) if q is true whenever p is true.
In Example 2.6 we found that for primitive statements p, q, the implication $p \to (p \lor q)$ is a tautology. In this case, therefore, we can say that p logically implies $p \lor q$ and write $p \Rightarrow (p \lor q)$. Furthermore, because of the first substitution rule, we also find that $p \Rightarrow (p \lor q)$ for any statements p, q; that is, $p \to (p \lor q)$ is a tautology for any statements p, q, whether or not they are primitive statements.
Let p, q be arbitrary statements.
1) If $p \Leftrightarrow q$, then the statement $p \leftrightarrow q$ is a tautology, so the statements p, q have the same (corresponding) truth values. Under these conditions the statements $p \to q$, $q \to p$ are tautologies, and we have $p \Rightarrow q$ and $q \Rightarrow p$.
2) Conversely, suppose that $p \Rightarrow q$ and $q \Rightarrow p$. The logical implication $p \to q$ tells us that we never have statement p with the truth value 1 and statement q with the truth value 0. But could we have q with the truth value 1 and p with the truth value 0? If this occurred, we could not have the logical implication $q \to p$. Therefore, when $p \Rightarrow q$ and $q \Rightarrow p$, the statements p, q have the same (corresponding) truth values and $p \Leftrightarrow q$.
Finally, the notation $p \not\Rightarrow q$ is used to indicate that $p \to q$ is not a tautology, so the given implication (namely, $p \to q$) is not a logical implication.
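EXAMPLE 2.21 From the results in Example 2.8 (Table 2.9) and the first substitution rule, we know that for statements p, q,
    $\neg(p \land q) \Leftrightarrow \neg p \lor \neg q$.
Consequently,
    $\neg(p \land q) \Rightarrow (\neg p \lor \neg q)$ and $(\neg p \lor \neg q) \Rightarrow \neg(p \land q)$
for all statements p, q. Alternatively, because each of the implications
    $\neg(p \land q) \to (\neg p \lor \neg q)$ and $(\neg p \lor \neg q) \to \neg(p \land q)$
is a tautology, we may also write
    $[\neg(p \land q) \to (\neg p \lor \neg q)] \Leftrightarrow T_0$ and $[(\neg p \lor \neg q) \to \neg(p \land q)] \Leftrightarrow T_0$.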
Returning now to our study of techniques for establishing the validity of an argument, we
must take a careful look at the size of Tables 2.14 and 2.15. Each table has eight rows. For
Table 2.14 we were able to express the three premises p;, p2, and p3, and the conclusion
q, in terms of the three primitive statements p, g, and r. A similar situation arose for the
argument we analyzed in Table 2.15, where we had only two premises. But if we were
confronted, for example, with establishing whether
    $[(p \to r) \land (r \to s) \land (t \lor \neg s) \land (\neg t \lor u) \land \neg u] \to \neg p$
is a logical implication (or presents a valid argument), the needed table would require
$2^5 = 32$ rows. As the number of premises gets larger and our truth tables grow to 64, 128,
256, or more rows, this first technique for establishing the validity of an argument rapidly
loses its appeal.
Furthermore, looking at Table 2.14 once again, we realize that in order to establish
whether
    $[(p \to r) \land (\neg q \to p) \land \neg r] \to q$
is a valid argument, we need to consider only those rows of the table where each of the three
premises $p \to r$, $\neg q \to p$, and $\neg r$ has the truth value 1. (Remember that if the hypothesis,
consisting of the conjunction of all of the premises, is false, then the implication is true
regardless of the truth value of the conclusion.) This happens only in the third row, so a
good deal of Table 2.14 is not really necessary. (It is not always the case that only one row
has all of the premises true. Note that in Table 2.15 we would be concerned with the results
in rows 5, 6, and 8.)
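The rows that matter can be listed mechanically. In the sketch below (ours; `critical_rows` is a name we introduce, and rows are numbered from 1 in the same order as the tables), only the rows in which every premise is true are reported.

```python
from itertools import product

def implies(a, b):
    """Truth value of a -> b."""
    return (not a) or b

def critical_rows(premises, n):
    """1-based indices of truth-table rows in which every premise is true."""
    rows = product([0, 1], repeat=n)  # row order 000, 001, 010, ... as in the tables
    return [i for i, row in enumerate(rows, start=1)
            if all(prem(*row) for prem in premises)]

table_2_14 = [lambda p, q, r: implies(p, r),      # p -> r
              lambda p, q, r: implies(not q, p),  # not q -> p
              lambda p, q, r: not r]              # not r
table_2_15 = [lambda p, r, s: p,                  # p
              lambda p, r, s: implies(p and r, s)]  # (p and r) -> s

print(critical_rows(table_2_14, 3))  # [3]
print(critical_rows(table_2_15, 3))  # [5, 6, 8]
```

To decide validity it is enough to inspect the conclusion in these rows alone.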
Consequently, what these observations are telling us is that we can possibly eliminate a
great deal of the effort put into constructing the truth tables in Table 2.14 and Table 2.15. And
since we want to avoid even larger tables, we are persuaded to develop a list of techniques
called rules of inference that will help us as follows:
1) Using these techniques will enable us to consider only the cases wherein all the
premises are true. Hence we consider the conclusion only for those rows of a truth
table wherein each premise has the truth value 1— and we do not construct the truth
table.
2) The rules of inference are fundamental in the development of a step-by-step validation
of how the conclusion q logically follows from the premises $p_1, p_2, p_3, \ldots, p_n$ in
an implication of the form
    $(p_1 \land p_2 \land p_3 \land \cdots \land p_n) \to q$.
Such a development will establish the validity of the given argument, for it will show
how the truth of the conclusion can be deduced from the truth of the premises.
Each rule of inference arises from a logical implication. In some cases, the logical
implication is stated without proof. (However, several of these proofs will be dealt with in
the Section Exercises.)
Many rules of inference arise in the study of logic. We concentrate on those that we need
to help us validate the arguments that arise in our study of logic. These rules will also help
us later when we turn to methods for proving theorems throughout the remainder of the
text. Table 2.19 (on p. 78) summarizes the rules we shall now start to investigate.
EXAMPLE 2.22 For a first example we consider the rule of inference called Modus Ponens, or the Rule of Detachment. (Modus Ponens comes from Latin and may be translated as “the method of affirming.”) In symbolic form this rule is expressed by the logical implication
    $[p \land (p \to q)] \to q$,
which is verified in Table 2.16, where we find that the fourth row is the only one where both of the premises p and $p \to q$ (and the conclusion q) are true.
Table 2.16
 p | q | $p \to q$ | $p \land (p \to q)$ | $[p \land (p \to q)] \to q$
---+---+-----------+---------------------+----------------------------
 0 | 0 |     1     |          0          |             1
 0 | 1 |     1     |          0          |             1
 1 | 0 |     0     |          0          |             1
 1 | 1 |     1     |          1          |             1
The actual rule will be written in the tabular form
    $p$
    $p \to q$
    ----------
    $\therefore q$
where the three dots ($\therefore$) stand for the word “therefore,” indicating that q is the conclusion for the premises p and $p \to q$, which appear above the horizontal line.
This rule arises when we argue that if (1) p is true, and (2) $p \to q$ is true (or $p \Rightarrow q$), then the conclusion q must also be true. (After all, if q were false and p were true, then we could not have $p \to q$ true.)
The following valid arguments show us how to apply the Rule of Detachment.
a) 1) Lydia wins a ten-million-dollar lottery.                                        $p$
   2) If Lydia wins a ten-million-dollar lottery, then Kay will quit her job.         $p \to q$
   3) Therefore Kay will quit her job.                                                $\therefore q$
b) 1) If Allison vacations in Paris, then she will have to win a scholarship.         $p \to q$
   2) Allison is vacationing in Paris.                                                $p$
   3) Therefore Allison won a scholarship.                                            $\therefore q$
Before closing the discussion on our first rule of inference let us make one final observation. The two examples in (a) and (b) might suggest that the valid argument $[p \land (p \to q)] \to q$ is appropriate only for primitive statements p, q. However, since $[p \land (p \to q)] \to q$ is a tautology for primitive statements p, q, it follows from the first substitution rule that (all occurrences of) p or q may be replaced by compound statements, and the resulting implication will also be a tautology. Consequently, if r, s, t, and u are primitive statements, then
    $r \lor s$
    $(r \lor s) \to (\neg t \land u)$
    ----------
    $\therefore \neg t \land u$
is a valid argument, by the Rule of Detachment, just as $[(r \lor s) \land [(r \lor s) \to (\neg t \land u)]] \to (\neg t \land u)$ is a tautology.
A similar situation — in which we can apply the first substitution rule — occurs for each
of the rules of inference we shall study. However, we shall not mention this so explicitly
with these other rules of inference.
EXAMPLE 2.23 A second rule of inference is given by the logical implication
    $[(p \to q) \land (q \to r)] \to (p \to r)$,
where p, q, and r are any statements. In tabular form it is written
    $p \to q$
    $q \to r$
    ----------
    $\therefore p \to r$
This rule, which is referred to as the Law of the Syllogism, arises in many arguments. For example, we may use it as follows:
1) If the integer 35244 is divisible by 396, then the integer 35244 is divisible by 66.                $p \to q$
2) If the integer 35244 is divisible by 66, then the integer 35244 is divisible by 3.                  $q \to r$
3) Therefore, if the integer 35244 is divisible by 396, then the integer 35244 is divisible by 3.      $\therefore p \to r$
The next example involves a slightly longer argument that uses the rules of inference
developed in Examples 2.22 and 2.23. In fact, we find here that there may be more than one
way to establish the validity of an argument.
EXAMPLE 2.24 Consider the following argument.
1) Rita is baking a cake.
2) If Rita is baking a cake, then she is not practicing her flute.
3) If Rita is not practicing her flute, then her father will not buy her a car.
4) Therefore Rita’s father will not buy her a car.
Concentrating on the forms of the statements in the preceding argument, we may write the argument as
    (*)   $p$
          $p \to \neg q$
          $\neg q \to \neg r$
          ----------
          $\therefore \neg r$
Now we need no longer worry about what the statements actually stand for. Our objective is to use the two rules of inference that we have studied so far in order to deduce the truth of the statement $\neg r$ from the truth of the three premises p, $p \to \neg q$, and $\neg q \to \neg r$.
We establish the validity of the argument as follows:
    Steps                          Reasons
    1) $p \to \neg q$              Premise
    2) $\neg q \to \neg r$         Premise
    3) $p \to \neg r$              This follows from steps (1) and (2) and the Law of the Syllogism
    4) $p$                         Premise
    5) $\therefore \neg r$         This follows from steps (4) and (3) and the Rule of Detachment
Before continuing with a third rule of inference we shall show that the argument presented at (*) can be validated in a second way. Here our “reasons” will be shortened to the form we shall use for the rest of the section. However, we shall always list whatever is needed to demonstrate how each step in an argument comes about, or follows, from prior steps.
A second way to validate the argument follows.
    Steps                          Reasons
    1) $p$                         Premise
    2) $p \to \neg q$              Premise
    3) $\neg q$                    Steps (1) and (2) and the Rule of Detachment
    4) $\neg q \to \neg r$         Premise
    5) $\therefore \neg r$         Steps (3) and (4) and the Rule of Detachment
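The second validation, which uses nothing but the Rule of Detachment, can be mimicked by a small loop. This is only a sketch under our own conventions: statements are plain strings such as `"~q"` (our ad hoc spelling of $\neg q$), implications are `(antecedent, consequent)` pairs, and `detach` is a hypothetical helper, not a procedure from the text.

```python
def detach(facts, implications):
    """Repeatedly apply the Rule of Detachment: from a and a -> b, infer b."""
    known = set(facts)
    steps = []          # records each (a, b) used, in the order it was applied
    changed = True
    while changed:
        changed = False
        for a, b in implications:
            if a in known and b not in known:
                known.add(b)
                steps.append((a, b))
                changed = True
    return known, steps

# Premises of the argument (*): p, p -> ~q, ~q -> ~r.
known, steps = detach({"p"}, [("p", "~q"), ("~q", "~r")])
print(steps)
print("~r" in known)
```

The recorded steps correspond to steps (3) and (5) of the second validation. The loop knows no other rule of inference, so it cannot reproduce the first validation, which relies on the Law of the Syllogism.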
EXAMPLE 2.25 The rule of inference called Modus Tollens is given by
    $p \to q$
    $\neg q$
    ----------
    $\therefore \neg p$
This follows from the logical implication $[(p \to q) \land \neg q] \to \neg p$. Modus Tollens comes from Latin and can be translated as “method of denying.” This is appropriate because we deny the conclusion, q, so as to prove $\neg p$. (Note that we can also obtain this rule from the one for Modus Ponens by using the fact that $p \to q \Leftrightarrow \neg q \to \neg p$.)
The following exemplifies the use of Modus Tollens in making a valid inference:
1) If Connie is elected president of Phi Delta sorority, then Helen will pledge that sorority.     $p \to q$
2) Helen did not pledge Phi Delta sorority.                                                         $\neg q$
3) Therefore Connie was not elected president of Phi Delta sorority.                                $\therefore \neg p$
And now we shall use Modus Tollens to show that the following argument is valid (for primitive statements p, r, s, t, and u).
    $p \to r$
    $r \to s$
    $t \lor \neg s$
    $\neg t \lor u$
    $\neg u$
    ----------
    $\therefore \neg p$
Both Modus Tollens and the Law of the Syllogism come into play, along with the logical equivalence we developed in Example 2.7.
    Steps                          Reasons
    1) $p \to r$, $r \to s$        Premises
    2) $p \to s$                   Step (1) and the Law of the Syllogism
    3) $t \lor \neg s$             Premise
    4) $\neg s \lor t$             Step (3) and the Commutative Law of $\lor$
    5) $s \to t$                   Step (4) and the fact that $\neg s \lor t \Leftrightarrow s \to t$
    6) $p \to t$                   Steps (2) and (5) and the Law of the Syllogism
    7) $\neg t \lor u$             Premise
    8) $t \to u$                   Step (7) and the fact that $\neg t \lor u \Leftrightarrow t \to u$
    9) $p \to u$                   Steps (6) and (8) and the Law of the Syllogism
    10) $\neg u$                   Premise
    11) $\therefore \neg p$        Steps (9) and (10) and Modus Tollens
Before continuing with another rule of inference let us summarize what we have just accomplished (and not accomplished). The preceding argument shows that
    $[(p \to r) \land (r \to s) \land (t \lor \neg s) \land (\neg t \lor u) \land \neg u] \Rightarrow \neg p$.
We have not used the laws of logic, as in Section 2.2, to express the statement
    $(p \to r) \land (r \to s) \land (t \lor \neg s) \land (\neg t \lor u) \land \neg u$
as a simpler logically equivalent statement. Note that
    $[(p \to r) \land (r \to s) \land (t \lor \neg s) \land (\neg t \lor u) \land \neg u] \not\Leftrightarrow \neg p$.
For when p has the truth value 0 and u has the truth value 1, the truth value of $\neg p$ is 1 while that of $\neg u$ and $(p \to r) \land (r \to s) \land (t \lor \neg s) \land (\neg t \lor u) \land \neg u$ is 0.
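The distinction between logical implication and logical equivalence drawn here can be seen by running through all 32 rows. The sketch below is ours (the names `hypothesis`, `logically_implies`, and `witness` are assumptions): it confirms that the conjunction of the five premises logically implies $\neg p$, finds that the two statements are not equivalent, and reports a row in which they differ.

```python
from itertools import product

def implies(a, b):
    """Truth value of a -> b."""
    return (not a) or b

def hypothesis(p, r, s, t, u):
    # (p -> r) and (r -> s) and (t or not s) and (not t or u) and not u
    return (implies(p, r) and implies(r, s) and (t or not s)
            and ((not t) or u) and (not u))

rows = list(product([False, True], repeat=5))  # 2**5 = 32 assignments (p, r, s, t, u)

logically_implies = all(implies(hypothesis(*v), not v[0]) for v in rows)
equivalent = all(hypothesis(*v) == (not v[0]) for v in rows)
witness = next(v for v in rows if hypothesis(*v) != (not v[0]))

print(len(rows), logically_implies, equivalent)
print(witness)
```

The reported row has p false and u true, the situation described in the text: $\neg p$ is true while the hypothesis, which contains $\neg u$, is false.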
Let us once more examine a tabular form for each of the two related rules of inference, Modus Ponens and Modus Tollens.
    Modus Ponens:   $p \to q$             Modus Tollens:   $p \to q$
                    $p$                                    $\neg q$
                    ----------                             ----------
                    $\therefore q$                         $\therefore \neg p$
The reason we wish to do this is that there are other tabular forms that may arise; these are similar in appearance but present invalid arguments, where each of the premises is true but the conclusion is false.
a) Consider the following argument:
   1) If Margaret Thatcher is the president of the United States, then she is at least 35 years old.     $p \to q$
   2) Margaret Thatcher is at least 35 years old.                                                         $q$
   3) Therefore Margaret Thatcher is the president of the United States.                                  $\therefore p$
   Here we find that $[(p \to q) \land q] \to p$ is not a tautology. For if we consider the truth value assignments p: 0 and q: 1, then each of the premises $p \to q$ and q is true while the conclusion p is false. This invalid argument results from the fallacy (error in reasoning) where we try to argue by the converse; that is, while $[(p \to q) \land p] \Rightarrow q$, it is not the case that $[(p \to q) \land q] \Rightarrow p$.
b) A second argument where the conclusion doesn’t necessarily follow from the premises may be given by:
   1) If $2 + 3 = 6$, then $2 + 4 = 6$.     $p \to q$
   2) $2 + 3 \neq 6$.                       $\neg p$
   3) Therefore $2 + 4 \neq 6$.             $\therefore \neg q$
   In this case we find that $[(p \to q) \land \neg p] \to \neg q$ is not a tautology. Once again the truth value assignments p: 0 and q: 1 show us that the premises $p \to q$ and $\neg p$ can both be true while the conclusion $\neg q$ is false. The fallacy behind this invalid argument arises from our attempt to argue by the inverse, for although $[(p \to q) \land \neg q] \Rightarrow \neg p$, it does not follow that $[(p \to q) \land \neg p] \Rightarrow \neg q$.
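An argument form is invalid exactly when some assignment makes every premise true and the conclusion false. The sketch below, which is our illustration with the assumed helper `counterexamples`, searches for such assignments for the two valid rules and the two fallacies just discussed.

```python
from itertools import product

def implies(a, b):
    """Truth value of a -> b."""
    return (not a) or b

def counterexamples(premises, conclusion):
    """Assignments (p, q) making every premise true and the conclusion false."""
    return [(p, q) for p, q in product([0, 1], repeat=2)
            if all(prem(p, q) for prem in premises) and not conclusion(p, q)]

cond = lambda p, q: implies(p, q)  # the shared premise p -> q

modus_ponens = counterexamples([cond, lambda p, q: p], lambda p, q: q)
modus_tollens = counterexamples([cond, lambda p, q: not q], lambda p, q: not p)
converse_fallacy = counterexamples([cond, lambda p, q: q], lambda p, q: p)
inverse_fallacy = counterexamples([cond, lambda p, q: not p], lambda p, q: not q)

print(modus_ponens, modus_tollens)        # [] []
print(converse_fallacy, inverse_fallacy)  # [(0, 1)] [(0, 1)]
```

Both fallacies are refuted by the same assignment, p: 0 and q: 1, the one used in parts (a) and (b).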
Before proceeding further we now mention a rather simple but important rule of infer-
ence.
EXAMPLE 2.26 The following rule of inference arises from the observation that if p, q are true statements, then $p \land q$ is a true statement.
Now suppose that statements p, q occur in the development of an argument. These statements may be (given) premises or results that are derived from premises and/or from results developed earlier in the argument. Then under these circumstances the two statements p, q can be combined into their conjunction $p \land q$, and this new statement can be used in later steps as the argument continues.
We call this rule the Rule of Conjunction and write it in tabular form as
    $p$
    $q$
    ----------
    $\therefore p \land q$
As we proceed further with our study of rules of inference, we find another fairly simple
but important rule.
EXAMPLE 2.27 The following rule of inference (one we may feel just illustrates good old common sense) is called the Rule of Disjunctive Syllogism. This rule comes about from the logical implication
    $[(p \lor q) \land \neg p] \to q$,
which we can derive from Modus Ponens by observing that $p \lor q \Leftrightarrow \neg p \to q$.
In tabular form we write
    $p \lor q$
    $\neg p$
    ----------
    $\therefore q$
This rule of inference arises when there are exactly two possibilities to consider and we are able to eliminate one of them as being true. Then the other possibility has to be true. The following illustrates one such application of this rule.
1) Bart’s wallet is in his back pocket or it is on his desk.     $p \lor q$
2) Bart’s wallet is not in his back pocket.                      $\neg p$
3) Therefore Bart’s wallet is on his desk.                       $\therefore q$
At this point we have examined five rules of inference. But before we try to validate any
more arguments like the one (with 11 steps) in Example 2.25, we shall look at one more
of these rules. This one underlies a method of proof that is sometimes confused with the
contrapositive method (or proof) given in Modus Tollens. The confusion arises because
both methods involve the negation of a statement. However, we will soon realize that these
are two distinct methods. (Toward the end of Section 2.5 we shall compare and contrast
these two methods once again.)
EXAMPLE 2.28 Let p denote an arbitrary statement, and $F_0$ a contradiction. The results in column 5 of Table 2.17 show that the implication $(\neg p \to F_0) \to p$ is a tautology, and this provides us with the rule of inference called the Rule of Contradiction. In tabular form this rule is written as
    $\neg p \to F_0$
    ----------
    $\therefore p$
Table 2.17
 p | $\neg p$ | $F_0$ | $\neg p \to F_0$ | $(\neg p \to F_0) \to p$
---+----------+-------+------------------+-------------------------
 1 |    0     |   0   |        1         |            1
 0 |    1     |   0   |        0         |            1
This rule tells us that if p is a statement and $\neg p \to F_0$ is true, then $\neg p$ must be false because $F_0$ is false. So then we have p true.
The Rule of Contradiction is the basis of a method for establishing the validity of an
argument — namely, the method of Proof by Contradiction, or Reductio ad Absurdum. The
idea behind the method of Proof by Contradiction is to establish a statement (namely, the
conclusion of an argument) by showing that, if this statement were false, then we would
be able to deduce an impossible consequence. The use of this method arises in certain
arguments which we shall now describe.
In general, when we want to establish the validity of the argument
    $(p_1 \land p_2 \land \cdots \land p_n) \to q$,
we can establish the validity of the logically equivalent argument
    $(p_1 \land p_2 \land \cdots \land p_n \land \neg q) \to F_0$.
[This follows from the tautology in column 7 of Table 2.18 and the first substitution rule,
where we replace the primitive statement $p$ by the statement $(p_1 \land p_2 \land \cdots \land p_n)$.†]
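Table 2.18
$p$ | $q$ | $F_0$ | $p \land \neg q$ | $(p \land \neg q) \to F_0$ | $p \to q$ | $(p \to q) \leftrightarrow [(p \land \neg q) \to F_0]$
0 | 0 | 0 | 0 | 1 | 1 | 1
0 | 1 | 0 | 0 | 1 | 1 | 1
1 | 0 | 0 | 1 | 0 | 0 | 1
1 | 1 | 0 | 0 | 1 | 1 | 1
†In Section 4.2 we shall provide the reason why we know that for any statements $p_1, p_2, \ldots, p_n$, and $q$, it
follows that $(p_1 \land p_2 \land \cdots \land p_n) \land \neg q \Leftrightarrow p_1 \land p_2 \land \cdots \land p_n \land \neg q$.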
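As a cross-check on Table 2.18, the following Python sketch compares the column for $p \to q$ with the column for $(p \land \neg q) \to F_0$ row by row. The names `implies`, `F0`, and `rows` are illustrative assumptions, and the contradiction is again modelled as the constant False.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Truth value of the implication a -> b."""
    return (not a) or b

F0 = False  # assumption: the contradiction F_0 always has truth value 0

rows = []
for p, q in product((False, True), repeat=2):
    direct = implies(p, q)                # p -> q
    indirect = implies(p and not q, F0)   # (p and not q) -> F0
    rows.append((int(p), int(q), int(direct), int(indirect)))

# The two columns agree in every row, so the biconditional is a tautology.
columns_agree = all(direct == indirect for _, _, direct, indirect in rows)
print(rows, columns_agree)
```

Because the two columns agree for every assignment, proving $(p \land \neg q) \to F_0$ is as good as proving $p \to q$.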
When we apply the method of Proof by Contradiction, we first assume that what we are
trying to validate (or prove) is actually false. Then we use this assumption as an additional
premise in order to produce a contradiction (or impossible situation) of the form $s \land \neg s$, for
some statement s. Once we have derived this contradiction we may then conclude that the
statement we were given was in fact true — and this validates the argument (or completes
the proof).
We shall turn to the method of Proof by Contradiction when it is (or appears to be) easier
to use $\neg q$ in conjunction with the premises $p_1, p_2, \ldots, p_n$ in order to deduce a contradiction
than it is to deduce the conclusion $q$ directly from the premises $p_1, p_2, \ldots, p_n$. The method
of Proof by Contradiction will be used in some of the later examples for this section—
namely, Examples 2.32 and 2.35. We shall also find it frequently reappearing in other
chapters in the text.
Now that we have examined six rules of inference, we summarize these rules and intro-
duce several others in Table 2.19 (on the following page).
The next five examples will present valid arguments. In so doing, these examples will
show us how to apply the rules listed in Table 2.19 in conjunction with other results, such
as the laws of logic.
EXAMPLE 2.29 Our first example demonstrates the validity of the argument
    $p \to r$
    $\neg p \to q$
    $q \to s$
    $\therefore \neg r \to s$
Steps                       Reasons
1) $p \to r$                Premise
2) $\neg r \to \neg p$      Step (1) and $(p \to r) \Leftrightarrow (\neg r \to \neg p)$
3) $\neg p \to q$           Premise
4) $\neg r \to q$           Steps (2) and (3) and the Law of the Syllogism
5) $q \to s$                Premise
6) $\therefore \neg r \to s$    Steps (4) and (5) and the Law of the Syllogism
A second way to validate the given argument proceeds as follows.
Steps                       Reasons
1) $p \to r$                Premise
2) $q \to s$                Premise
3) $\neg p \to q$           Premise
4) $p \lor q$               Step (3) and $(\neg p \to q) \Leftrightarrow (\neg\neg p \lor q) \Leftrightarrow (p \lor q)$, where the
                            second logical equivalence follows by the Law of Double Negation
5) $r \lor s$               Steps (1), (2), and (4) and the Rule of the Constructive Dilemma
6) $\therefore \neg r \to s$    Step (5) and $(r \lor s) \Leftrightarrow (\neg\neg r \lor s) \Leftrightarrow (\neg r \to s)$, where the Law of
                            Double Negation is used in the first logical equivalence
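Both derivations for Example 2.29 can be double-checked by brute force, using the definition of validity directly: no assignment may make every premise true and the conclusion false. The sketch below is only an illustration; the function name `is_valid` and the encoding of statements as Python lambdas are our own assumptions, not the text's method.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Truth value of the implication a -> b."""
    return (not a) or b

def is_valid(premises, conclusion, n_vars: int) -> bool:
    """Return True when no truth assignment makes all premises true and the conclusion false."""
    for values in product((False, True), repeat=n_vars):
        if all(prem(*values) for prem in premises) and not conclusion(*values):
            return False
    return True

# Example 2.29: p -> r, (not p) -> q, q -> s, therefore (not r) -> s.
premises = [
    lambda p, q, r, s: implies(p, r),
    lambda p, q, r, s: implies(not p, q),
    lambda p, q, r, s: implies(q, s),
]
conclusion = lambda p, q, r, s: implies(not r, s)

example_2_29_valid = is_valid(premises, conclusion, 4)
print(example_2_29_valid)
```

A truth-table search of this kind examines all $2^4 = 16$ rows; the rules of inference reach the same verdict without writing the table out.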
The next example is somewhat more involved.
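Table 2.19
Rule of Inference | Related Logical Implication | Name of Rule
1) $p$; $p \to q$; $\therefore q$ | $[p \land (p \to q)] \to q$ | Rule of Detachment (Modus Ponens)
2) $p \to q$; $q \to r$; $\therefore p \to r$ | $[(p \to q) \land (q \to r)] \to (p \to r)$ | Law of the Syllogism
3) $p \to q$; $\neg q$; $\therefore \neg p$ | $[(p \to q) \land \neg q] \to \neg p$ | Modus Tollens
4) $p$; $q$; $\therefore p \land q$ | | Rule of Conjunction
5) $p \lor q$; $\neg p$; $\therefore q$ | $[(p \lor q) \land \neg p] \to q$ | Rule of Disjunctive Syllogism
6) $\neg p \to F_0$; $\therefore p$ | $(\neg p \to F_0) \to p$ | Rule of Contradiction
7) $p \land q$; $\therefore p$ | $(p \land q) \to p$ | Rule of Conjunctive Simplification
8) $p$; $\therefore p \lor q$ | $p \to (p \lor q)$ | Rule of Disjunctive Amplification
9) $p \land q$; $p \to (q \to r)$; $\therefore r$ | $[(p \land q) \land [p \to (q \to r)]] \to r$ | Rule of Conditional Proof
10) $p \to r$; $q \to r$; $\therefore (p \lor q) \to r$ | $[(p \to r) \land (q \to r)] \to [(p \lor q) \to r]$ | Rule for Proof by Cases
11) $p \to q$; $r \to s$; $p \lor r$; $\therefore q \lor s$ | $[(p \to q) \land (r \to s) \land (p \lor r)] \to (q \lor s)$ | Rule of the Constructive Dilemma
12) $p \to q$; $r \to s$; $\neg q \lor \neg s$; $\therefore \neg p \lor \neg r$ | $[(p \to q) \land (r \to s) \land (\neg q \lor \neg s)] \to (\neg p \lor \neg r)$ | Rule of the Destructive Dilemma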
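Each related logical implication in Table 2.19 is a tautology, and this can be confirmed by exhaustive evaluation. The Python sketch below is an illustration only: the dictionary `rules`, the helper `implies`, and the modelling of $F_0$ as False are assumptions of the sketch. The Rule of Conjunction is omitted because the table lists no implication for it.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Truth value of the implication a -> b."""
    return (not a) or b

F0 = False  # assumption: the contradiction F_0 always has truth value 0

# Each entry evaluates one related logical implication from Table 2.19.
rules = {
    "Modus Ponens": lambda p, q, r, s: implies(p and implies(p, q), q),
    "Law of the Syllogism": lambda p, q, r, s: implies(implies(p, q) and implies(q, r), implies(p, r)),
    "Modus Tollens": lambda p, q, r, s: implies(implies(p, q) and not q, not p),
    "Disjunctive Syllogism": lambda p, q, r, s: implies((p or q) and not p, q),
    "Contradiction": lambda p, q, r, s: implies(implies(not p, F0), p),
    "Conjunctive Simplification": lambda p, q, r, s: implies(p and q, p),
    "Disjunctive Amplification": lambda p, q, r, s: implies(p, p or q),
    "Conditional Proof": lambda p, q, r, s: implies((p and q) and implies(p, implies(q, r)), r),
    "Proof by Cases": lambda p, q, r, s: implies(implies(p, r) and implies(q, r), implies(p or q, r)),
    "Constructive Dilemma": lambda p, q, r, s: implies(
        implies(p, q) and implies(r, s) and (p or r), q or s),
    "Destructive Dilemma": lambda p, q, r, s: implies(
        implies(p, q) and implies(r, s) and ((not q) or (not s)), (not p) or (not r)),
}

# A rule fails only if some assignment of p, q, r, s makes its implication false.
failures = [
    name for name, formula in rules.items()
    if not all(formula(*values) for values in product((False, True), repeat=4))
]
print(failures)
```

An empty `failures` list means every listed implication holds in all 16 rows.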
EXAMPLE 2.30 Establish the validity of the argument
    $p \to q$
    $q \to (r \land s)$
    $\neg r \lor (\neg t \lor u)$
    $p \land t$
    $\therefore u$
Steps                               Reasons
1) $p \to q$                        Premise
2) $q \to (r \land s)$              Premise
3) $p \to (r \land s)$              Steps (1) and (2) and the Law of the Syllogism
4) $p \land t$                      Premise
5) $p$                              Step (4) and the Rule of Conjunctive Simplification
6) $r \land s$                      Steps (5) and (3) and the Rule of Detachment
7) $r$                              Step (6) and the Rule of Conjunctive Simplification
8) $\neg r \lor (\neg t \lor u)$    Premise
9) $\neg(r \land t) \lor u$         Step (8), the Associative Law of $\lor$, and DeMorgan’s Laws
10) $t$                             Step (4) and the Rule of Conjunctive Simplification
11) $r \land t$                     Steps (7) and (10) and the Rule of Conjunction
12) $\therefore u$                  Steps (9) and (11), the Law of Double Negation, and the
                                    Rule of Disjunctive Syllogism
EXAMPLE 2.31 This example will provide a way to show that the following argument is valid.
If the band could not play rock music or the refreshments were not delivered
on time, then the New Year’s party would have been canceled and Alicia would
have been angry. If the party were canceled, then refunds would have had to be
made. No refunds were made.
Therefore the band could play rock music.
First we convert the given argument into symbolic form by using the following statement
assignments:
    $p$: The band could play rock music.
    $q$: The refreshments were delivered on time.
    $r$: The New Year’s party was canceled.
    $s$: Alicia was angry.
    $t$: Refunds had to be made.
The argument above now becomes
    $(\neg p \lor \neg q) \to (r \land s)$
    $r \to t$
    $\neg t$
    $\therefore p$
We can establish the validity of this argument as follows.
Steps                                        Reasons
1) $r \to t$                                 Premise
2) $\neg t$                                  Premise
3) $\neg r$                                  Steps (1) and (2) and Modus Tollens
4) $\neg r \lor \neg s$                      Step (3) and the Rule of Disjunctive Amplification
5) $\neg(r \land s)$                         Step (4) and DeMorgan’s Laws
6) $(\neg p \lor \neg q) \to (r \land s)$    Premise
7) $\neg(\neg p \lor \neg q)$                Steps (6) and (5) and Modus Tollens
8) $p \land q$                               Step (7), DeMorgan’s Laws, and the Law of Double
                                             Negation
9) $\therefore p$                            Step (8) and the Rule of Conjunctive Simplification
EXAMPLE 2.32 In this instance we shall use the method of Proof by Contradiction. Consider the argument
    $\neg p \leftrightarrow q$
    $q \to r$
    $\neg r$
    $\therefore p$
To establish the validity for this argument, we assume the negation $\neg p$ of the conclusion $p$
as another premise. The objective now is to use these four premises to derive a contradiction
$F_0$. Our derivation follows.
Steps                                              Reasons
1) $\neg p \leftrightarrow q$                      Premise
2) $(\neg p \to q) \land (q \to \neg p)$           Step (1) and $(\neg p \leftrightarrow q) \Leftrightarrow [(\neg p \to q) \land (q \to \neg p)]$
3) $\neg p \to q$                                  Step (2) and the Rule of Conjunctive Simplification
4) $q \to r$                                       Premise
5) $\neg p \to r$                                  Steps (3) and (4) and the Law of the Syllogism
6) $\neg p$                                        Premise (the one assumed)
7) $r$                                             Steps (5) and (6) and the Rule of Detachment
8) $\neg r$                                        Premise
9) $r \land \neg r$ $(\Leftrightarrow F_0)$        Steps (7) and (8) and the Rule of Conjunction
10) $\therefore p$                                 Steps (6) and (9) and the method of Proof by
                                                   Contradiction
If we examine further what has happened here, we find that
    $[(\neg p \leftrightarrow q) \land (q \to r) \land \neg r \land \neg p] \Rightarrow F_0$.
This requires the truth value of $[(\neg p \leftrightarrow q) \land (q \to r) \land \neg r \land \neg p]$ to be 0. Because
$\neg p \leftrightarrow q$, $q \to r$, and $\neg r$ are the given premises, each of these statements has the truth
value 1. Consequently, for $[(\neg p \leftrightarrow q) \land (q \to r) \land \neg r \land \neg p]$ to have the truth value 0, the
statement $\neg p$ must have the truth value 0. Therefore $p$ has the truth value 1, and the
conclusion $p$ of the argument is true.
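The reasoning of Example 2.32 can be mirrored by a small search: the three given premises are satisfiable, but adding the assumed premise $\neg p$ leaves no satisfying assignment. The Python sketch below is an illustration under our own naming (`premises_hold`, `models_of_premises`); the biconditional $\neg p \leftrightarrow q$ is encoded as equality of truth values.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Truth value of the implication a -> b."""
    return (not a) or b

def premises_hold(p: bool, q: bool, r: bool) -> bool:
    """The three given premises: (not p) <-> q, q -> r, not r."""
    return ((not p) == q) and implies(q, r) and (not r)

# Assignments (p, q, r) that make all three given premises true.
models_of_premises = [v for v in product((False, True), repeat=3) if premises_hold(*v)]

# Adding the assumed premise "not p" keeps only assignments with p false.
models_with_negated_conclusion = [v for v in models_of_premises if not v[0]]

print(models_of_premises)
print(models_with_negated_conclusion)
```

The only assignment satisfying the given premises has $p$ true, so the four premises together cannot all hold, which is the contradiction the derivation produces.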
Before we consider our next example, we need to examine columns 5 and 7 of Table
2.20. These identical columns tell us that for primitive statements $p$, $q$, and $r$,
    $[p \to (q \to r)] \Leftrightarrow [(p \land q) \to r]$.
Using the first substitution rule, let us replace each occurrence of $p$ by the compound
statement $(p_1 \land p_2 \land \cdots \land p_n)$. Then we obtain the new result†
    $[(p_1 \land p_2 \land \cdots \land p_n) \to (q \to r)] \Leftrightarrow [(p_1 \land p_2 \land \cdots \land p_n \land q) \to r]$.
†In Section 4.2 we shall present a formal proof of why
    $(p_1 \land p_2 \land \cdots \land p_n) \land q \Leftrightarrow p_1 \land p_2 \land \cdots \land p_n \land q$.
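Table 2.20
$p$ | $q$ | $r$ | $p \land q$ | $(p \land q) \to r$ | $q \to r$ | $p \to (q \to r)$
0 | 0 | 0 | 0 | 1 | 1 | 1
0 | 0 | 1 | 0 | 1 | 1 | 1
0 | 1 | 0 | 0 | 1 | 0 | 1
0 | 1 | 1 | 0 | 1 | 1 | 1
1 | 0 | 0 | 0 | 1 | 1 | 1
1 | 0 | 1 | 0 | 1 | 1 | 1
1 | 1 | 0 | 1 | 0 | 0 | 0
1 | 1 | 1 | 1 | 1 | 1 | 1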
This result tells us that if we wish to establish the validity of the argument (*) we may be
able to do so by establishing the validity of the corresponding argument (**).
    (*)  $p_1$                     (**)  $p_1$
         $p_2$                           $p_2$
         $\vdots$                        $\vdots$
         $p_n$                           $p_n$
         $\therefore q \to r$            $q$
                                         $\therefore r$
After all, suppose we want to show that $q \to r$ has the truth value 1, when each of
$p_1, p_2, \ldots, p_n$ does. If the truth value for $q$ is 0, then there is nothing left to do, since
the truth value for $q \to r$ is 1. Hence the real problem is to show that $q \to r$ has truth
value 1 when each of $p_1, p_2, \ldots, p_n$, and $q$ does; that is, we need to show that when
$p_1, p_2, \ldots, p_n, q$ each have truth value 1, then the truth value of $r$ is 1.
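The equivalence that justifies moving the hypothesis $q$ into the premises, $[p \to (q \to r)] \Leftrightarrow [(p \land q) \to r]$, can be confirmed over all eight assignments. This Python sketch is an illustration only; the names `implies` and `mismatches` are our own.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Truth value of the implication a -> b."""
    return (not a) or b

# Collect every assignment on which the two statements disagree.
mismatches = [
    (p, q, r)
    for p, q, r in product((False, True), repeat=3)
    if implies(p, implies(q, r)) != implies(p and q, r)
]
print(mismatches)
```

No assignment separates the two forms, in agreement with the identical columns 5 and 7 of Table 2.20.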
We demonstrate this principle in the next example.
EXAMPLE 2.33 In order to establish the validity of the argument
    (*)  $u \to r$
         $(r \land s) \to (p \lor t)$
         $q \to (u \land s)$
         $\neg t$
         $\therefore q \to p$
we consider the corresponding argument
    (**) $u \to r$
         $(r \land s) \to (p \lor t)$
         $q \to (u \land s)$
         $\neg t$
         $q$
         $\therefore p$
[Note that $q$ is the hypothesis of the conclusion $q \to p$ for argument (*) and that it becomes
another premise for argument (**) where the conclusion is $p$.]
To validate the argument (**) we proceed as follows.
Steps                               Reasons
1) $q$                              Premise
2) $q \to (u \land s)$              Premise
3) $u \land s$                      Steps (1) and (2) and the Rule of Detachment
4) $u$                              Step (3) and the Rule of Conjunctive Simplification
5) $u \to r$                        Premise
6) $r$                              Steps (4) and (5) and the Rule of Detachment
7) $s$                              Step (3) and the Rule of Conjunctive Simplification
8) $r \land s$                      Steps (6) and (7) and the Rule of Conjunction
9) $(r \land s) \to (p \lor t)$     Premise
10) $p \lor t$                      Steps (8) and (9) and the Rule of Detachment
11) $\neg t$                        Premise
12) $\therefore p$                  Steps (10) and (11) and the Rule of Disjunctive Syllogism
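We now know that for argument (**)
    $[(u \to r) \land [(r \land s) \to (p \lor t)] \land [q \to (u \land s)] \land \neg t \land q] \Rightarrow p$,
and for argument (*) it follows that
    $[(u \to r) \land [(r \land s) \to (p \lor t)] \land [q \to (u \land s)] \land \neg t] \Rightarrow (q \to p)$.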
Examples 2.29 through 2.33 have given us some idea of how to establish the validity
of an argument. Following Example 2.25 we discussed two situations indicating when an
argument is invalid — namely, when we try to argue by the converse or the inverse. So now
it is time for us to learn a little more about how to determine when an argument is invalid.
Given an argument
    $p_1$
    $p_2$
    $p_3$
    $\vdots$
    $p_n$
    $\therefore q$
we say that the argument is invalid if it is possible for each of the premises $p_1, p_2, p_3, \ldots,$
$p_n$ to be true (with truth value 1), while the conclusion $q$ is false (with truth value 0).
The next example illustrates an indirect method whereby we may be able to show that
an argument we feel is invalid (perhaps because we cannot find a way to show that it is
valid) actually is invalid.
EXAMPLE 2.34 Consider the primitive statements $p$, $q$, $r$, $s$, and $t$ and the argument
    $p$
    $p \lor q$
    $q \to (r \to s)$
    $t \to r$
    $\therefore \neg s \to \neg t$
To show that this is an invalid argument, we need one assignment of truth values for each
of the statements $p$, $q$, $r$, $s$, and $t$ such that the conclusion $\neg s \to \neg t$ is false (has the truth
value 0) while the four premises are all true (have the truth value 1). The only time the
conclusion $\neg s \to \neg t$ is false is when $\neg s$ is true and $\neg t$ is false. This implies that the truth
value for $s$ is 0 and that the truth value for $t$ is 1.
Because $p$ is one of the premises, its truth value must be 1. For the premise $p \lor q$ to
have the truth value 1, $q$ may be either true (1) or false (0). So let us consider the premise
$t \to r$ where we know that $t$ is true. If $t \to r$ is to be true, then $r$ must be true (have the
truth value 1). Now with $r$ true (1) and $s$ false (0), it follows that $r \to s$ is false (0), and that
the truth value of the premise $q \to (r \to s)$ will be 1 only when $q$ is false (0).
Consequently, under the truth value assignments
    $p$: 1    $q$: 0    $r$: 1    $s$: 0    $t$: 1,
the four premises
    $p$    $p \lor q$    $q \to (r \to s)$    $t \to r$
all have the truth value 1, while the conclusion
    $\neg s \to \neg t$
has the truth value 0. In this case we have shown the given argument to be invalid.
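The hand search for a counterexample in Example 2.34 can be automated by testing all $2^5 = 32$ assignments. The following Python sketch is an illustration only; the list name `counterexamples` and the dictionary encoding of an assignment are our own assumptions.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Truth value of the implication a -> b."""
    return (not a) or b

names = ("p", "q", "r", "s", "t")
counterexamples = []
for values in product((False, True), repeat=5):
    p, q, r, s, t = values
    premises = (p, p or q, implies(q, implies(r, s)), implies(t, r))
    conclusion = implies(not s, not t)
    # A counterexample makes every premise true and the conclusion false.
    if all(premises) and not conclusion:
        counterexamples.append(dict(zip(names, map(int, values))))

print(counterexamples)
```

The search returns a single assignment, the one found above by reasoning backward from a false conclusion, so for this argument the counterexample is in fact unique.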
The truth value assignments $p$: 1, $q$: 0, $r$: 1, $s$: 0, and $t$: 1 of Example 2.34 provide one
case that disproves what we thought might have been a valid argument. We should now
start to realize that in trying to show that an implication of the form
    $(p_1 \land p_2 \land p_3 \land \cdots \land p_n) \to q$
presents a valid argument, we need to consider all cases where the premises $p_1, p_2, p_3, \ldots,$
$p_n$ are true. [Each such case is an assignment of truth values for the primitive statements
(that make up the premises) where $p_1, p_2, p_3, \ldots, p_n$ are true.] In order to do so, namely,
to cover the cases without writing out the truth table, we have been using the rules of
inference together with the laws of logic and other logical equivalences. To cover all the
necessary cases, we cannot use one specific example (or case) as a means of establishing
the validity of the argument (for all possible cases). However, whenever we wish to show
that an implication (of the preceding form) is not a tautology, all we need to find is one
case for which the implication is false— that is, one case in which all the premises are true
but the conclusion is false. This one case provides a counterexample for the argument and
shows it to be invalid.
Let us consider a second example wherein we try the indirect approach of Example 2.34.
EXAMPLE 2.35 What can we say about the validity or invalidity of the following argument? (Here $p$, $q$, $r$,
and $s$ denote primitive statements.)
    $p \to q$
    $q \to s$
    $r \to \neg s$
    $\neg p \veebar r$
    $\therefore \neg p$
Can the conclusion $\neg p$ be false while the four premises are all true? The conclusion $\neg p$
is false when $p$ has the truth value 1. So for the premise $p \to q$ to be true, the truth value
of $q$ must be 1. From the truth of the premise $q \to s$, the truth of $q$ forces the truth of
$s$. Consequently, at this point we have statements $p$, $q$, and $s$ all with the truth value 1.
Continuing with the premise $r \to \neg s$, we find that because $s$ has the truth value 1, the truth
value of $r$ must be 0. Hence $r$ is false. But with $\neg p$ false and the premise $\neg p \veebar r$ true, we
also have $r$ true. Therefore we find that $p \Rightarrow (\neg r \land r)$.
We have failed in our attempt to find a counterexample to the validity of the given
argument. However, this failure has shown us that the given argument is valid, and the
validity follows by using the method of Proof by Contradiction.
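The failed search of Example 2.35 can be repeated exhaustively. In this Python sketch (an illustration with our own names), the exclusive or $\neg p \veebar r$ is encoded as inequality of truth values, and a counterexample would need all four premises true together with $p$ true, that is, with the conclusion $\neg p$ false.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Truth value of the implication a -> b."""
    return (not a) or b

counterexamples = [
    (p, q, r, s)
    for p, q, r, s in product((False, True), repeat=4)
    if implies(p, q)            # p -> q
    and implies(q, s)           # q -> s
    and implies(r, not s)       # r -> not s
    and ((not p) != r)          # exclusive or of (not p) and r
    and p                       # conclusion "not p" is false
]
print(counterexamples)
```

An empty result over all 16 assignments confirms that no counterexample exists, so the argument is valid.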
This introduction to the rules of inference has been far from exhaustive. Several of the
books cited among the references listed near the end of this chapter offer additional material
for the reader who wishes to pursue this topic further. In Section 2.5 we shall apply the ideas
developed in this section to statements of a more mathematical nature. For we shall want to
learn how to develop a proof for a theorem. And then in Chapter 4 another very important
proof technique called mathematical induction will be added to our arsenal of weapons for
proving mathematical theorems. First, however, the reader should carefully complete the
exercises for this section.
EXERCISES 2.3
1. The following are three valid arguments. Establish the validity of each by means of a truth table. In each case, determine which rows of the table are crucial for assessing the validity of the argument and which rows can be ignored.
a) $[p \land (p \to q) \land r] \to [(p \lor q) \to r]$
b) $[[(p \land q) \to r] \land \neg q \land (p \to \neg r)] \to (\neg p \lor \neg q)$
c) $[[p \lor (q \lor r)] \land \neg q] \to (p \lor r)$
2. Use truth tables to verify that each of the following is a logical implication.
a) $[(p \to q) \land (q \to r)] \to (p \to r)$
b) $[(p \to q) \land \neg q] \to \neg p$
c) $[(p \lor q) \land \neg p] \to q$
d) $[(p \to r) \land (q \to r)] \to [(p \lor q) \to r]$
3. Verify that each of the following is a logical implication by showing that it is impossible for the conclusion to have the truth value 0 while the hypothesis has the truth value 1.
a) $(p \land q) \to p$
b) $p \to (p \lor q)$
c) $[(p \lor q) \land \neg p] \to q$
d) $[(p \to q) \land (r \to s) \land (p \lor r)] \to (q \lor s)$
e) $[(p \to q) \land (r \to s) \land (\neg q \lor \neg s)] \to (\neg p \lor \neg r)$
4. For each of the following pairs of statements, use Modus Ponens or Modus Tollens to fill in the blank line so that a valid argument is presented.
a) If Janice has trouble starting her car, then her daughter Angela will check Janice’s spark plugs.
Janice had trouble starting her car.
$\therefore$ ______
b) If Brady solved the first problem correctly, then the answer he obtained is 137.
Brady’s answer to the first problem is not 137.
$\therefore$ ______
c) If this is a repeat-until loop, then the body of this loop is executed at least once.
______
$\therefore$ The body of the loop is executed at least once.
d) If Tim plays basketball in the afternoon, then he will not watch television in the evening.
______
$\therefore$ Tim didn’t play basketball in the afternoon.
5. Consider each of the following arguments. If the argument is valid, identify the rule of inference that establishes its validity. If not, indicate whether the error is due to an attempt to argue by the converse or by the inverse.
a) Andrea can program in C++, and she can program in Java.
Therefore Andrea can program in C++.
b) A sufficient condition for Bubbles to win the golf tournament is that her opponent Meg not sink a birdie on the last hole.
Bubbles won the golf tournament.
Therefore Bubbles’ opponent Meg did not sink a birdie on the last hole.
c) If Ron’s computer program is correct, then he’ll be able to complete his computer science assignment in at most two hours.
It takes Ron over two hours to complete his computer science assignment.
Therefore Ron’s computer program is not correct.
d) Eileen’s car keys are in her purse, or they are on the kitchen table.
Eileen’s car keys are not on the kitchen table.
Therefore Eileen’s car keys are in her purse.
e) If interest rates fall, then the stock market will rise.
Interest rates are not falling.
Therefore the stock market will not rise.
6. For primitive statements $p$, $q$, and $r$, let $P$ denote the statement
    $[p \land (q \land r)] \lor \neg[p \lor (q \land r)]$,
while $P_1$ denotes the statement
    $[p \land (q \lor r)] \lor \neg[p \lor (q \lor r)]$.
a) Use the rules of inference to show that $q \land r \Rightarrow q \lor r$.
b) Is it true that $P \Rightarrow P_1$?
7. Give the reason(s) for each step needed to show that the following argument is valid.
    $[p \land (p \to q) \land (s \lor r) \land (r \to \neg q)] \to (s \lor t)$
Steps                       Reasons
1) $p$
2) $p \to q$
3) $q$
4) $r \to \neg q$
5) $q \to \neg r$
6) $\neg r$
7) $s \lor r$
8) $s$
9) $\therefore s \lor t$
8. Give the reasons for the steps verifying the following argument.
    $(\neg p \lor q) \to r$
    $r \to (s \lor t)$
    $\neg s \land \neg u$
    $\neg u \to \neg t$
    $\therefore p$
Steps                                   Reasons
1) $\neg s \land \neg u$
2) $\neg u$
3) $\neg u \to \neg t$
4) $\neg t$
5) $\neg s$
6) $\neg s \land \neg t$
7) $r \to (s \lor t)$
8) $\neg(s \lor t) \to \neg r$
9) $(\neg s \land \neg t) \to \neg r$
10) $\neg r$
11) $(\neg p \lor q) \to r$
12) $\neg r \to \neg(\neg p \lor q)$
13) $\neg r \to (p \land \neg q)$
14) $p \land \neg q$
15) $\therefore p$
9. a) Give the reasons for the steps given to validate the argument
    $[(p \to q) \land (\neg r \lor s) \land (p \lor r)] \to (\neg q \to s)$.
Steps                       Reasons
1) $\neg(\neg q \to s)$
2) $\neg q \land \neg s$
3) $\neg s$
4) $\neg r \lor s$
5) $\neg r$
6) $p \to q$
7) $\neg q$
8) $\neg p$
9) $p \lor r$
10) $r$
11) $\neg r \land r$
12) $\therefore \neg q \to s$
b) Give a direct proof for the result in part (a).
c) Give a direct proof for the result in Example 2.32.
10. Establish the validity of the following arguments.
a) $[(p \land \neg q) \land r] \to [(p \land r) \lor q]$
b) $[p \land (p \to q) \land (\neg q \lor r)] \to r$
c) $p \to q$; $\neg q$; $\neg r$; $\therefore \neg(p \lor r)$
d) $p \to q$; $r \to \neg q$; $r$; $\therefore \neg p$
e) $p \to (q \to r)$; $\neg q \to \neg p$; $p$; $\therefore r$
f) $p \land q$; $p \to (r \land q)$; $r \to (s \lor t)$; $\neg s$; $\therefore t$
g) $p \to (q \to r)$; $p \lor s$; $t \to q$; $\neg s$; $\therefore \neg r \to \neg t$
h) $p \lor q$; $\neg p \lor r$; $\neg r$; $\therefore q$
11. Show that each of the following arguments is invalid by providing a counterexample, that is, an assignment of truth values for the given primitive statements $p$, $q$, $r$, and $s$ such that all premises are true (have the truth value 1) while the conclusion is false (has the truth value 0).
a) $[(p \land \neg q) \land [p \to (q \to r)]] \to \neg r$
b) $[[(p \land q) \to r] \land (\neg q \lor r)] \to p$
c) $p \leftrightarrow q$; $q \to r$; $r \lor \neg s$; $\neg s \to q$; $\therefore s$
d) $p$; $p \to r$; $p \to (q \lor \neg r)$; $\neg q \lor \neg s$; $\therefore s$
12. Write each of the following arguments in symbolic form. Then establish the validity of the argument or give a counterexample to show that it is invalid.
a) If Rochelle gets the supervisor’s position and works hard, then she’ll get a raise. If she gets the raise, then she’ll buy a new car. She has not purchased a new car. Therefore either Rochelle did not get the supervisor’s position or she did not work hard.
b) If Dominic goes to the racetrack, then Helen will be mad. If Ralph plays cards all night, then Carmela will be mad. If either Helen or Carmela gets mad, then Veronica (their attorney) will be notified. Veronica has not heard from either of these two clients. Consequently, Dominic didn’t make it to the racetrack and Ralph didn’t play cards all night.
c) If there is a chance of rain or her red headband is missing, then Lois will not mow her lawn. Whenever the temperature is over 80°F, there is no chance for rain. Today the temperature is 85°F and Lois is wearing her red headband. Therefore (sometime today) Lois will mow her lawn.
13. a) Given primitive statements $p$, $q$, $r$, show that the implication
    $[(p \lor q) \land (\neg p \lor r)] \to (q \lor r)$
is a tautology.
b) The tautology in part (a) provides the rule of inference known as resolution, where the conclusion $(q \lor r)$ is called the resolvent. This rule was proposed in 1965 by J. A. Robinson and is the basis of many computer programs designed to automate a reasoning system.
In applying resolution each premise (in the hypothesis) and the conclusion are written as clauses. A clause is a primitive statement or its negation, or it is the disjunction of terms each of which is a primitive statement or the negation of such a statement. Hence the given rule has the clauses $(p \lor q)$ and $(\neg p \lor r)$ as premises and the clause $(q \lor r)$ as its conclusion (or, resolvent). Should we have the premise $\neg(p \land q)$, we replace this by the logically equivalent clause $\neg p \lor \neg q$, by the first of DeMorgan’s Laws. The premise $\neg(p \lor q)$ can be replaced by the two clauses $\neg p$, $\neg q$. This is due to the second DeMorgan Law and the Rule of Conjunctive Simplification. For the premise $p \lor (q \land r)$, we apply the Distributive Law of $\lor$ over $\land$ and the Rule of Conjunctive Simplification to arrive at either of the two clauses $p \lor q$, $p \lor r$. Finally, the premise $p \to q$ becomes the clause $\neg p \lor q$.
Establish the validity of the following arguments, using resolution (along with the rules of inference and the laws of logic).
(i) $p \lor (q \land r)$; $p \to s$; $\therefore r \lor s$
(ii) $p$; $p \leftrightarrow q$; $\therefore q$
(iii) $p \lor q$; $p \to r$; $r \to s$; $\therefore q \lor s$
(iv) $\neg p \lor q \lor r$; $\neg q$; $\neg r$; $\therefore \neg p$
(v) $\neg p \lor s$; $\neg t \lor (s \land r)$; $\neg q \lor r$; $p \lor q \lor t$; $\therefore r \lor s$
c) Write the following argument in symbolic form, then use resolution (along with the rules of inference and the laws of logic) to establish its validity.
Jonathan does not have his driver’s license or his new car is out of gas. Jonathan has his driver’s license or he does not like to drive his new car. Jonathan’s new car is not out of gas or he does not like to drive his new car. Therefore, Jonathan does not like to drive his new car.
2.4 The Use of Quantifiers
In Section 2.1, we mentioned how sentences that involve a variable, such as x, need not
be statements. For example, the sentence “The number x + 2 is an even integer” is not
necessarily true or false unless we know what value is substituted for x. If we restrict our
choices to integers, then when x is replaced by -5, -1, or 3, for instance, the resulting
statement is false. In fact, it is false whenever x is replaced by an odd integer. When an
even integer is substituted for x, however, the resulting statement is true.
We refer to the sentence “The number x + 2 is an even integer” as an open statement,
which we formally define as follows.
Definition 2.5 A declarative sentence is an open statement if
1) it contains one or more variables, and
2) it is not a statement, but
3) it becomes a statement when the variables in it are replaced by certain allowable
choices.
When we examine the sentence “The number x + 2 is an even integer” in light of
this definition, we find it is an open statement that contains the single variable x. With
regard to the third element of the definition, in our earlier discussion we restricted the
“certain allowable choices” to integers. These allowable choices constitute what is called
the universe or universe of discourse for the open statement. The universe comprises the
choices we wish to consider or allow for the variable(s) in the open statement. (The universe
is an example of a set, a concept we shall examine in some detail in the next chapter.)
In dealing with open statements, we use the following notation:
The open statement “The number x + 2 is an even integer” is denoted by $p(x)$ [or $q(x)$,
$r(x)$, etc.]. Then $\neg p(x)$ may be read “The number x + 2 is not an even integer.”
We shall use $q(x, y)$ to represent an open statement that contains two variables. For
example, consider
    $q(x, y)$: The numbers $y + 2$, $x - y$, and $x + 2y$ are even integers.
In the case of $q(x, y)$, there is more than one occurrence of each of the variables $x$, $y$. It is
understood that when we replace one of the x’s by a choice from our universe, we replace
the other x by the same choice. Likewise, when a substitution (from the universe) is made
for one occurrence of y, that same substitution is made for all other occurrences of the
variable y.
With $p(x)$ and $q(x, y)$ as above, and the universe still stipulating the integers as our only
allowable choices, we get the following results when we make some replacements for the
variables $x$, $y$.
    $p(5)$: The number 7 (= 5 + 2) is an even integer. (FALSE)
    $\neg p(7)$: The number 9 is not an even integer. (TRUE)
    $q(4, 2)$: The numbers 4, 2, and 8 are even integers. (TRUE)
We also note, for example, that $q(5, 2)$ and $q(4, 7)$ are both false statements, whereas
$\neg q(5, 2)$ and $\neg q(4, 7)$ are true.
Consequently, we see that for both $p(x)$ and $q(x, y)$, as already given, some substitutions
result in true statements and others in false statements. Therefore we can make the following
true statements.
1) For some $x$, $p(x)$.
2) For some $x$, $y$, $q(x, y)$.
Note that in this situation, the statements “For some $x$, $\neg p(x)$” and “For some $x$, $y$,
$\neg q(x, y)$” are also true. [Since the statements “For some $x$, $p(x)$” and “For some $x$, $\neg p(x)$”
are both true, we realize that the second statement is not the negation of the first, even
though the open statement $\neg p(x)$ is the negation of the open statement $p(x)$. And a similar
result is true for the statements involving $q(x, y)$ and $\neg q(x, y)$.]
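An open statement behaves much like a function that returns a truth value once its variables are replaced. The Python sketch below models $p(x)$ and $q(x, y)$ this way; the function names `is_even`, `p`, and `q` are our own choices for illustration, and the arguments are assumed to be integers, matching the universe above.

```python
def is_even(n: int) -> bool:
    """True when the integer n is even."""
    return n % 2 == 0

def p(x: int) -> bool:
    """Open statement p(x): the number x + 2 is an even integer."""
    return is_even(x + 2)

def q(x: int, y: int) -> bool:
    """Open statement q(x, y): y + 2, x - y, and x + 2y are all even integers."""
    return is_even(y + 2) and is_even(x - y) and is_even(x + 2 * y)

# Each substitution turns the open statement into a statement with a fixed truth value.
results = {
    "p(5)": p(5),
    "not p(7)": not p(7),
    "q(4, 2)": q(4, 2),
    "q(5, 2)": q(5, 2),
    "q(4, 7)": q(4, 7),
}
print(results)
```

Until a value is supplied, `p` itself is neither true nor false, just as the open statement $p(x)$ is not a statement.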
The phrases “For some x” and “For some x, y” are said to quantify the open statements
$p(x)$ and $q(x, y)$, respectively. Many postulates, definitions, and theorems in mathematics
involve statements that are quantified open statements. These result from the two types of
quantifiers, which are called the existential and the universal quantifiers.
Statement (1) uses the existential quantifier “For some x,” which can also be expressed
as “For at least one x” or “There exists an x such that.” This quantifier is written in symbolic
form as $\exists x$. Hence the statement “For some x, p(x)” becomes $\exists x\, p(x)$, in symbolic form.
Statement (2) becomes $\exists x\, \exists y\, q(x, y)$ in symbolic form. The notation $\exists x, y$ can be used
to abbreviate $\exists x\, \exists y\, q(x, y)$ to $\exists x, y\, q(x, y)$.
The universal quantifier is denoted by $\forall x$ and is read “For all x,” “For any x,” “For each
x,” or “For every x.” “For all x, y,” “For any x, y,” “For every x, y,” or “For all x and y”
is denoted by $\forall x\, \forall y$, which can be abbreviated to $\forall x, y$.
Taking $p(x)$ as defined earlier and using the universal quantifier, we can change the open
statement $p(x)$ into the (quantified) statement $\forall x\, p(x)$, a false statement.
If we consider the open statement $r(x)$: “2x is an even integer” with the same universe
(of all integers), then the (quantified) statement $\forall x\, r(x)$ is a true statement. When we say
that $\forall x\, r(x)$ is true, we mean that no matter which integer (from our universe) is substituted
for $x$ in $r(x)$, the resulting statement is true. Also note that the statement $\exists x\, r(x)$ is a true
statement, whereas $\forall x\, \neg r(x)$ and $\exists x\, \neg r(x)$ are both false.
The variable $x$ in each of the open statements $p(x)$ and $r(x)$ is called a free variable (of
the open statement). As $x$ varies over the universe for an open statement, the truth value
of the statement (that results upon the replacement of each occurrence of $x$) may vary.
For instance, in the case of $p(x)$, we found $p(5)$ to be false, while $p(6)$ turns out to be
a true statement. The open statement $r(x)$, however, becomes a true statement for every
replacement (for $x$) taken from the universe of all integers. In contrast to the open statement
$p(x)$ the statement $\exists x\, p(x)$ has a fixed truth value, namely, true. And in the symbolic
representation $\exists x\, p(x)$ the variable $x$ is said to be a bound variable: it is bound by the
existential quantifier $\exists$. This is also the case for the statements $\forall x\, r(x)$ and $\forall x\, \neg r(x)$, where
in each case the variable $x$ is bound by the universal quantifier $\forall$.
For the open statement $q(x, y)$ we have two free variables, each of which is bound by
the quantifier $\exists$ in either of the statements $\exists x\, \exists y\, q(x, y)$ or $\exists x, y\, q(x, y)$.
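Over a finite collection of values, the two quantifiers correspond to Python's built-in `any` (existential) and `all` (universal). The sketch below uses a finite `sample` of integers as a stand-in for the universe of all integers; that stand-in is an assumption of the sketch. A result of True from `all` over a sample does not prove a universally quantified statement about every integer, whereas a single failing value does refute it.

```python
def p(x: int) -> bool:
    """Open statement p(x): x + 2 is an even integer."""
    return (x + 2) % 2 == 0

def r(x: int) -> bool:
    """Open statement r(x): 2x is an even integer."""
    return (2 * x) % 2 == 0

sample = range(-50, 51)  # assumption: finite stand-in for the universe of all integers

exists_p = any(p(x) for x in sample)          # "for some x, p(x)"
forall_p = all(p(x) for x in sample)          # "for all x, p(x)"
forall_r = all(r(x) for x in sample)          # "for all x, r(x)", checked on the sample only
exists_not_r = any(not r(x) for x in sample)  # "for some x, not r(x)", on the sample only

print(exists_p, forall_p, forall_r, exists_not_r)
```

Inside the generator expression the variable `x` plays the role of a bound variable: the quantified results `exists_p` and `forall_p` are fixed truth values that no longer depend on any particular choice of `x`.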
The following example shows how these new ideas about quantifiers can be used in
conjunction with the logical connectives.
EXAMPLE 2.36 Here the universe comprises all real numbers. The open statements $p(x)$, $q(x)$, $r(x)$, and
$s(x)$ are given by
    $p(x)$: $x \ge 0$          $r(x)$: $x^2 - 3x - 4 = 0$
    $q(x)$: $x^2 \ge 0$        $s(x)$: $x^2 - 3 > 0$.
Then the following statements are true.
1) $\exists x\, [p(x) \land r(x)]$
This follows because the real number 4, for example, is a member of the universe and is
such that both of the statements p(4) and r(4) are true.
2) $\forall x\, [p(x) \to q(x)]$
If we replace $x$ in $p(x)$ by a negative real number $a$, then $p(a)$ is false, but $p(a) \to q(a)$
is true regardless of the truth value of $q(a)$. Replacing $x$ in $p(x)$ by a nonnegative real
number $b$, we find that $p(b)$ and $q(b)$ are both true, as is $p(b) \to q(b)$. Consequently,
$p(x) \to q(x)$ is true for all replacements $x$ taken from the universe of all real numbers, and
the (quantified) statement $\forall x\, [p(x) \to q(x)]$ is true.
This statement may be translated into any of the following:
a) For every real number $x$, if $x \ge 0$, then $x^2 \ge 0$.
b) Every nonnegative real number has a nonnegative square.
c) The square of any nonnegative real number is a nonnegative real number.
d) All nonnegative real numbers have nonnegative squares.
Also, the statement $\exists x\, [p(x) \to q(x)]$ is true.
The next statements we examine are false.
1′) $\forall x\, [q(x) \to s(x)]$
We want to show that the statement is false, so we need exhibit only one counterexample,
that is, one value of $x$ for which $q(x) \to s(x)$ is false, rather than prove something
for all $x$ as we did for statement (2). Replacing $x$ by 1, we find that $q(1)$ is true and
$s(1)$ is false. Therefore $q(1) \to s(1)$ is false, and consequently the (quantified) statement
$\forall x\, [q(x) \to s(x)]$ is false. [Note that $x = 1$ does not produce the only counterexample:
Every real number $a$ between $-\sqrt{3}$ and $\sqrt{3}$ will make $q(a)$ true and $s(a)$ false.]
2′) $\forall x\, [r(x) \lor s(x)]$
Here there are many values for $x$, such as 1, $\frac{1}{2}$, $-\frac{3}{2}$, and 0, that produce counterexamples.
Upon changing quantifiers, however, we find that the statement $\exists x\, [r(x) \lor s(x)]$ is true.
3′) $\forall x\, [r(x) \to p(x)]$
The real number $-1$ is a solution of the equation $x^2 - 3x - 4 = 0$, so $r(-1)$ is true while
$p(-1)$ is false. Therefore the choice of $-1$ provides the unique counterexample we need
to show that this (quantified) statement is false.
Statement (3′) may be translated into either of the following:
a) For every real number $x$, if $x^2 - 3x - 4 = 0$, then $x \ge 0$.
b) For every real number $x$, if $x$ is a solution of the equation $x^2 - 3x - 4 = 0$, then
$x \ge 0$.
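The witnesses and counterexamples used in Example 2.36 can be checked by direct substitution. In the Python sketch below the four open statements are written as functions; the helper `implies` and the variable names are our own, and only integer substitutions are tested so that all arithmetic is exact.

```python
def implies(a: bool, b: bool) -> bool:
    """Truth value of the implication a -> b."""
    return (not a) or b

def p(x): return x >= 0                   # p(x): x >= 0
def q(x): return x * x >= 0               # q(x): x^2 >= 0
def r(x): return x * x - 3 * x - 4 == 0   # r(x): x^2 - 3x - 4 = 0
def s(x): return x * x - 3 > 0            # s(x): x^2 - 3 > 0

witness_statement_1 = p(4) and r(4)       # 4 witnesses "for some x, p(x) and r(x)"
counter_q_implies_s = implies(q(1), s(1)) # False: x = 1 refutes "for all x, q(x) -> s(x)"
counter_r_or_s = r(0) or s(0)             # False: x = 0 refutes "for all x, r(x) or s(x)"
counter_r_implies_p = implies(r(-1), p(-1))  # False: x = -1 refutes "for all x, r(x) -> p(x)"
other_root_is_fine = implies(r(4), p(4))  # True: the root 4 is not a counterexample

print(witness_statement_1, counter_q_implies_s, counter_r_or_s,
      counter_r_implies_p, other_root_is_fine)
```

One substitution settles each false universal statement, while the true universal statement (2) needs the case argument given above rather than any finite list of checks.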
Now we make the following observations. Let p(x) denote any open statement (in the
variable x) with a prescribed nonempty universe (that is, the universe contains at least one
member). Then if $\forall x\, p(x)$ is true, so is $\exists x\, p(x)$, or
    $\forall x\, p(x) \Rightarrow \exists x\, p(x)$.
When we write $\forall x\, p(x) \Rightarrow \exists x\, p(x)$ we are saying that the implication $\forall x\, p(x) \to$
$\exists x\, p(x)$ is a logical implication; that is, $\exists x\, p(x)$ is true whenever $\forall x\, p(x)$ is true. Also,
we realize that the hypothesis of this implication is the quantified statement $\forall x\, p(x)$, and
the conclusion is $\exists x\, p(x)$, another quantified statement. On the other hand, it does not
follow that if $\exists x\, p(x)$ is true, then $\forall x\, p(x)$ must be true. Hence $\exists x\, p(x)$ does not logically
imply $\forall x\, p(x)$, in general.
Our next example brings out the fact that the quantification of an open statement may
not be as explicit as we might prefer.
EXAMPLE 2.37 a) Let us consider the universe of all real numbers and examine the sentences:
1) If a number is rational, then it is a real number.
2) If x is rational, then x is real.
We should agree that these sentences convey the same information. But we should
also question whether the sentences are statements or open statements. In the case
of sentence (2) we at least have the presence of the variable x. But neither sentence
contains an expression such as “For all,” or “For every,” or “For each.” Our one and
only clue to indicate that we are dealing with universally quantified statements here is
the presence of the indefinite article “a” in the first sentence. In situations like these
the use of the universal quantifier is implicit as opposed to explicit.
If we let $p(x)$, $q(x)$ be the open statements
    $p(x)$: $x$ is a rational number        $q(x)$: $x$ is a real number,
then we must recognize the fact that both of the given sentences are somewhat informal
ways of expressing the quantified statement
    $\forall x\, [p(x) \to q(x)]$.
b) For the universe of all triangles in the plane, the sentence
“An equilateral triangle has three angles of 60°, and conversely.”
provides another instance of implicit quantification. Here the indefinite article “An” is
the only indication that we might be able to express this sentence as a statement with
a universal quantifier. If the open statements
e(t): Triangle t is equilateral.
a(t): Triangle t has three angles of 60°.
are defined for this universe, then the given sentence can be written in the explicit
quantified form
∀t [e(t) ↔ a(t)].
c) In the typical trigonometry textbook one often comes across the trigonometric identity
sin² x + cos² x = 1.
This identity contains no explicit quantification, and the reader must understand or be
told that it is defined for all real numbers x. When the universe of all real numbers is
specified (or at least understood), then the identity can be expressed by the (explicitly)
quantified statement
∀x [sin² x + cos² x = 1].
d) Finally, consider the universe of all positive integers and the sentence
“The integer 41 is equal to the sum of two perfect squares.”
Here we have one more example where the quantification is implicit — but this
time the quantification is existential. We may express the result here in a more formal
(and symbolic) manner as
∃m ∃n [41 = m² + n²].
The next example demonstrates that the truth value of a quantified statement may depend
on the universe prescribed.
EXAMPLE 2.38
Consider the open statement p(x): x² ≥ 1.
1) If the universe consists of all positive integers, then the quantified statement ∀x p(x)
is true.
2) For the universe of all positive real numbers, however, the same quantified statement
∀x p(x) is false. The positive real number 1/2 provides one of many possible
counterexamples.
Yet for either universe, the quantified statement ∃x p(x) is true.
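The dependence on the universe can be illustrated numerically. This is only a sketch: both universes are infinite, so the code below uses finite samples (the integers 1 through 100 and a handful of positive rationals, both chosen here as assumptions). A finite sample can refute a universal statement by exhibiting a counterexample, but it cannot establish one.

```python
from fractions import Fraction


def p(x):
    """The open statement p(x): x² ≥ 1."""
    return x * x >= 1


positive_integers = range(1, 101)  # finite stand-in for all positive integers
positive_reals = [Fraction(1, 2), 1, Fraction(3, 2), 2, 10]  # small sample

# No counterexample appears among the sampled positive integers.
forall_on_integers = all(p(x) for x in positive_integers)

# Among the sampled positive reals, 1/2 is a counterexample to ∀x p(x).
counterexamples = [x for x in positive_reals if not p(x)]

# ∃x p(x) has a witness in either sample.
exists_on_integers = any(p(x) for x in positive_integers)
exists_on_reals = any(p(x) for x in positive_reals)

print(forall_on_integers, counterexamples, exists_on_integers, exists_on_reals)
```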
One use of quantifiers in a computer science setting is illustrated in the following
example.
EXAMPLE 2.39
In the following program segment, n is an integer variable and the variable A is an array
A[1], A[2], ..., A[20] of 20 integer values.
for n := 1 to 20 do
    A[n] := n * n - n
The following statements about the array A can be represented in quantified form, where
the universe consists of all integers from 1 to 20, inclusive.
1) Every entry in the array is nonnegative:
∀n (A[n] ≥ 0).
2) There exist two consecutive entries in A where the larger entry is twice the smaller:
∃n (A[n + 1] = 2A[n]).
3) The entries in the array are sorted in (strictly) ascending order:
∀n [(1 ≤ n ≤ 19) → (A[n] < A[n + 1])].
Our last statement requires the use of two integer variables m, n.
4) The entries in the array are distinct:
∀m ∀n [(m ≠ n) → (A[m] ≠ A[n])], or
∀m, n [(m < n) → (A[m] ≠ A[n])].
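Because the universe here is finite, each of the four quantified statements can be checked mechanically. The sketch below rebuilds the array in Python (index 0 is unused padding so that the indices match the text) and evaluates the statements with all and any. For statement 2 the index n is restricted to 1 through 19 so that A[n + 1] exists; that restriction is an assumption made for the code and is not stated in the example.

```python
# A[1..20] as in the program segment; A[0] is padding so indices match the text.
A = [None] + [n * n - n for n in range(1, 21)]
universe = range(1, 21)

# 1) ∀n (A[n] ≥ 0)
nonnegative = all(A[n] >= 0 for n in universe)

# 2) ∃n (A[n + 1] = 2A[n]); collect every witness with 1 ≤ n ≤ 19
doubling_witnesses = [n for n in range(1, 20) if A[n + 1] == 2 * A[n]]

# 3) ∀n [(1 ≤ n ≤ 19) → (A[n] < A[n + 1])]
ascending = all(A[n] < A[n + 1] for n in range(1, 20))

# 4) ∀m, n [(m < n) → (A[m] ≠ A[n])]
distinct = all(A[m] != A[n] for m in universe for n in universe if m < n)

print(nonnegative, doubling_witnesses, ascending, distinct)
```

The only witness for statement 2 is n = 3, since A[3] = 6 and A[4] = 12.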
Before continuing, we summarize and somewhat extend, in Table 2.21, what we have
learned about quantifiers.
The results in Table 2.21 may appear to involve only one open statement. However, we
should realize that the open statement p(x) in the table may stand for a conjunction of open
statements, such as q(x) ∧ r(x), or an implication of open statements, such as s(x) → t(x).
If, for example, we want to know when the statement ∃x [s(x) → t(x)] is true, then we
look at the table for ∃x p(x) and use the information provided there. The table tells us that
∃x [s(x) → t(x)] is true when s(a) → t(a) is true for some (at least one) a in the prescribed
universe.
We will look further into quantified statements involving more than one open statement.
Before doing so, however, we need to examine the following definition. This definition is
comparable to Definitions 2.2 and 2.4 where we defined the ideas of logically equivalent
statements and logical implication. It settles the same types of questions for open statements.
Table 2.21
Statement | When Is It True? | When Is It False?
∃x p(x) | For some (at least one) a in the universe, p(a) is true. | For every a in the universe, p(a) is false.
∀x p(x) | For every replacement a from the universe, p(a) is true. | There is at least one replacement a from the universe for which p(a) is false.
∃x ¬p(x) | For at least one choice a in the universe, p(a) is false, so its negation ¬p(a) is true. | For every replacement a in the universe, p(a) is true.
∀x ¬p(x) | For every replacement a from the universe, p(a) is false and its negation ¬p(a) is true. | There is at least one replacement a from the universe for which ¬p(a) is false and p(a) is true.
Definition 2.6 Let p(x), q(x) be open statements defined for a given universe.
The open statements p(x) and q(x) are called (logically) equivalent, and we write
∀x [p(x) ⇔ q(x)], when the biconditional p(a) ↔ q(a) is true for each replacement a
from the universe (that is, p(a) ⇔ q(a) for each a in the universe). If the implication
p(a) → q(a) is true for each a in the universe (that is, p(a) ⇒ q(a) for each a in the
universe), then we write ∀x [p(x) ⇒ q(x)] and say that p(x) logically implies q(x).
For the universe of all triangles in the plane, let p(x), q(x) denote the open statements
p(x): x is equiangular    q(x): x is equilateral.
Then for every particular triangle a (a replacement for x) we know that p(a) ↔ q(a) is true
(that is, p(a) ⇔ q(a), for every triangle in the plane). Consequently, ∀x [p(x) ⇔ q(x)].
Observe that here and, in general, ∀x [p(x) ⇔ q(x)] if and only if ∀x [p(x) ⇒ q(x)]
and ∀x [q(x) ⇒ p(x)].
We also realize that a definition similar to Definition 2.6 can be given for two open
statements that involve two or more variables.
Now we take another look at the logical equivalence of statements (not open state-
ments) as we examine the converse, inverse, and contrapositive of a statement of the form
∀x [p(x) → q(x)].
Definition 2.7 For open statements p(x), q(x), defined for a prescribed universe, and the universally
quantified statement ∀x [p(x) → q(x)], we define:
1) The contrapositive of ∀x [p(x) → q(x)] to be ∀x [¬q(x) → ¬p(x)].
2) The converse of ∀x [p(x) → q(x)] to be ∀x [q(x) → p(x)].
3) The inverse of ∀x [p(x) → q(x)] to be ∀x [¬p(x) → ¬q(x)].
The following two examples illustrate Definition 2.7.
EXAMPLE 2.40
For the universe of all quadrilaterals in the plane let s(x) and e(x) denote the open statements
s(x): x is a square    e(x): x is equilateral.
a) The statement
∀x [s(x) → e(x)]
is a true statement and is logically equivalent to its contrapositive
∀x [¬e(x) → ¬s(x)]
because [s(a) → e(a)] ⇔ [¬e(a) → ¬s(a)] for each replacement a. Hence
∀x [s(x) → e(x)] ⇔ ∀x [¬e(x) → ¬s(x)].
b) The statement
∀x [e(x) → s(x)]
is a false statement and is the converse of the true statement
∀x [s(x) → e(x)].
The false statement
∀x [¬s(x) → ¬e(x)]
is the inverse of the given statement ∀x [s(x) → e(x)].
Since [e(a) → s(a)] ⇔ [¬s(a) → ¬e(a)] for each specific quadrilateral a, we
find that the converse and inverse are logically equivalent; that is,
∀x [e(x) → s(x)] ⇔ ∀x [¬s(x) → ¬e(x)].
EXAMPLE 2.41
Here p(x) and q(x) are the open statements
p(x): |x| > 3    q(x): x > 3
and the universe consists of all real numbers.
a) The statement ∀x [p(x) → q(x)] is a false statement. For example, if x = -5, then
p(-5) is true while q(-5) is false. Consequently, p(-5) → q(-5) is false, and so
is ∀x [p(x) → q(x)].
b) We can express the converse of the given statement [in part (a)] as follows:
Every real number greater than 3 has magnitude
(or, absolute value) greater than 3.
In symbolic form this true statement is written ∀x [q(x) → p(x)].
c) The inverse of the given statement is also a true statement. In symbolic form we have
∀x [¬p(x) → ¬q(x)], which can be expressed in words by
If the magnitude of a real number is less than or equal to 3,
then the number itself is less than or equal to 3.
And this is logically equivalent to the (converse) statement given in part (b).
d) Here the contrapositive of the statement in part (a) is given by ∀x [¬q(x) → ¬p(x)].
This false statement is logically equivalent to ∀x [p(x) → q(x)] and can be expressed
as follows:
If a real number is less than or equal to 3, then so is its magnitude.
e) Together with p(x) and q(x) as above, consider the open statement
r(x): x < -3,
which is also defined for the universe of all real numbers. The following four statements
are all true:
Statement: ∀x [p(x) → (r(x) ∨ q(x))]
Contrapositive: ∀x [¬(r(x) ∨ q(x)) → ¬p(x)]
Converse: ∀x [(r(x) ∨ q(x)) → p(x)]
Inverse: ∀x [¬p(x) → ¬(r(x) ∨ q(x))]
In this case (because the statement and its converse are both true) we find that the
statement ∀x [p(x) ↔ (r(x) ∨ q(x))] is true.
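The truth values found in this example can be spot-checked on a finite sample of real numbers. The sketch below is an illustration only: the sample (the multiples of 1/2 from -10 to 10) and the helper implies are assumptions introduced here, and agreement on a sample supports, but does not prove, the statements that are true for all real numbers. The false statements, on the other hand, are genuinely refuted by the counterexample x = -5 in the sample.

```python
def implies(a, b):
    """Truth value of the implication a → b."""
    return (not a) or b


sample = [k / 2 for k in range(-20, 21)]  # -10.0, -9.5, ..., 10.0


def p(x):
    return abs(x) > 3


def q(x):
    return x > 3


def r(x):
    return x < -3


statement = all(implies(p(x), q(x)) for x in sample)                # ∀x [p → q]
converse = all(implies(q(x), p(x)) for x in sample)                 # ∀x [q → p]
inverse = all(implies(not p(x), not q(x)) for x in sample)          # ∀x [¬p → ¬q]
contrapositive = all(implies(not q(x), not p(x)) for x in sample)   # ∀x [¬q → ¬p]

# Part (e): p(x) ↔ (r(x) ∨ q(x)) on every sampled value.
biconditional = all(p(x) == (r(x) or q(x)) for x in sample)

print(statement, converse, inverse, contrapositive, biconditional)
```

On this sample the statement and its contrapositive come out false together, and the converse and inverse come out true together, matching parts (a) through (d).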
Now we use the results of Table 2.21 once again as we examine the next example.
EXAMPLE 2.42
Here the universe consists of all the integers, and the open statements r(x), s(x) are
given by
r(x): 2x + 1 = 5    s(x): x² = 9.
We see that the statement ∃x [r(x) ∧ s(x)] is false because there is no one integer a such
that 2a + 1 = 5 and a² = 9. However, there is an integer b (= 2) such that 2b + 1 = 5,
and there is a second integer c (= 3 or -3) such that c² = 9. Therefore the statement
∃x r(x) ∧ ∃x s(x) is true. Consequently, the existential quantifier ∃x does not distribute
over the logical connective ∧. This one counterexample is enough to show that
∃x [r(x) ∧ s(x)] ⇎ [∃x r(x) ∧ ∃x s(x)],
where ⇎ is read “is not logically equivalent to.” It also demonstrates that
[∃x r(x) ∧ ∃x s(x)] ⇏ ∃x [r(x) ∧ s(x)],
where ⇏ is read “does not logically imply.” So the statement
[∃x r(x) ∧ ∃x s(x)] → ∃x [r(x) ∧ s(x)]
is not a tautology.
What, however, can we say about the converse of a quantified statement of this form?
At this point we present a general argument for any (arbitrary) open statements p(x), g(x)
and any (arbitrary) prescribed universe.
Examining the statement
∃x [p(x) ∧ q(x)] → [∃x p(x) ∧ ∃x q(x)],
we find that when the hypothesis ∃x [p(x) ∧ q(x)] is true, there is at least one element c
in the universe for which the statement p(c) ∧ q(c) is true. By the Rule of Conjunctive
Simplification (see Section 2.3), [p(c) ∧ q(c)] ⇒ p(c). From the truth of p(c) we have the
true statement ∃x p(x). Similarly we obtain ∃x q(x), another true statement. So ∃x p(x) ∧
∃x q(x) is a true statement. Since ∃x p(x) ∧ ∃x q(x) is true whenever ∃x [p(x) ∧ q(x)]
is true, it follows that
∃x [p(x) ∧ q(x)] ⇒ [∃x p(x) ∧ ∃x q(x)].
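The counterexample of Example 2.42 can be reproduced in a few lines. This is a sketch under one assumption: the integers are replaced by the finite window -10, ..., 10, which is harmless here because every integer solution of 2x + 1 = 5 or of x² = 9 lies inside that window.

```python
window = range(-10, 11)  # finite stand-in for the universe of all integers


def r(x):
    return 2 * x + 1 == 5


def s(x):
    return x * x == 9


# ∃x [r(x) ∧ s(x)]: one and the same x must satisfy both conditions.
exists_conjunction = any(r(x) and s(x) for x in window)

# ∃x r(x) ∧ ∃x s(x): the two witnesses may be different integers.
conjunction_of_exists = any(r(x) for x in window) and any(s(x) for x in window)

r_witnesses = [x for x in window if r(x)]
s_witnesses = [x for x in window if s(x)]

print(exists_conjunction, conjunction_of_exists, r_witnesses, s_witnesses)
```

The witness lists have no element in common, which is exactly why the conjunction inside the quantifier fails while the conjunction of the two existential statements holds.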
Arguments similar to the one for Example 2.42 provide the logical equivalences and
logical implications listed in Table 2.22. In addition to those listed in Table 2.22 many other
logical equivalences and logical implications can be derived.
Table 2.22 Logical Equivalences and Logical Implications for Quantified Statements in One
Variable
For a prescribed universe and any open statements p(x), q(x) in the variable x:
∃x [p(x) ∧ q(x)] ⇒ [∃x p(x) ∧ ∃x q(x)]
∃x [p(x) ∨ q(x)] ⇔ [∃x p(x) ∨ ∃x q(x)]
∀x [p(x) ∧ q(x)] ⇔ [∀x p(x) ∧ ∀x q(x)]
[∀x p(x) ∨ ∀x q(x)] ⇒ ∀x [p(x) ∨ q(x)]
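On a small finite universe the four rows of Table 2.22 can be confirmed by exhaustive search. The sketch below assumes a three-element universe and encodes each open statement as a tuple of truth values, one per element; both choices are made here for illustration. An exhaustive check on one universe is evidence, not a proof for arbitrary universes, but the search also turns up pairs p, q showing that the implication in the last row cannot be reversed.

```python
from itertools import product

universe = [0, 1, 2]
# Every open statement on the universe, encoded as a tuple of truth values.
predicates = list(product([False, True], repeat=len(universe)))


def rows_hold(p, q):
    """Check the four rows of the table for one pair of open statements."""
    exists_and = any(a and b for a, b in zip(p, q))
    exists_or = any(a or b for a, b in zip(p, q))
    forall_and = all(a and b for a, b in zip(p, q))
    forall_or = all(a or b for a, b in zip(p, q))
    return (
        ((not exists_and) or (any(p) and any(q)))      # row 1 (implication)
        and exists_or == (any(p) or any(q))            # row 2 (equivalence)
        and forall_and == (all(p) and all(q))          # row 3 (equivalence)
        and ((not (all(p) or all(q))) or forall_or)    # row 4 (implication)
    )


table_holds = all(rows_hold(p, q) for p in predicates for q in predicates)

# Pairs for which ∀x [p(x) ∨ q(x)] is true but ∀x p(x) ∨ ∀x q(x) is false.
converse_of_row4_fails = [
    (p, q)
    for p in predicates
    for q in predicates
    if all(a or b for a, b in zip(p, q)) and not (all(p) or all(q))
]

print(table_holds, len(converse_of_row4_fails) > 0)
```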
Our next example lists several of these and demonstrates how two of them are verified.
EXAMPLE 2.43
Let p(x), q(x), and r(x) denote open statements for a given universe. We find the following
logical equivalences. (Many more are also possible.)
1) ∀x [p(x) ∧ (q(x) ∧ r(x))] ⇔ ∀x [(p(x) ∧ q(x)) ∧ r(x)]
To show that this statement is a logical equivalence we proceed as follows:
For each a in the universe, consider the statements p(a) ∧ (q(a) ∧ r(a)) and
(p(a) ∧ q(a)) ∧ r(a). By the Associative Law for ∧, we have
p(a) ∧ (q(a) ∧ r(a)) ⇔ (p(a) ∧ q(a)) ∧ r(a).
Consequently, for the open statements p(x) ∧ (q(x) ∧ r(x)) and
(p(x) ∧ q(x)) ∧ r(x), it follows that
∀x [p(x) ∧ (q(x) ∧ r(x))] ⇔ ∀x [(p(x) ∧ q(x)) ∧ r(x)].
2) ∃x [p(x) → q(x)] ⇔ ∃x [¬p(x) ∨ q(x)]
For each c in the universe, it follows from Example 2.7 that
[p(c) → q(c)] ⇔ [¬p(c) ∨ q(c)].
Therefore the statement ∃x [p(x) → q(x)] is true (respectively, false) if and only if
the statement ∃x [¬p(x) ∨ q(x)] is true (respectively, false), so
∃x [p(x) → q(x)] ⇔ ∃x [¬p(x) ∨ q(x)].
3) Other logical equivalences that we shall often find useful include the following.
a) ∀x ¬¬p(x) ⇔ ∀x p(x)
b) ∀x ¬[p(x) ∧ q(x)] ⇔ ∀x [¬p(x) ∨ ¬q(x)]
c) ∀x ¬[p(x) ∨ q(x)] ⇔ ∀x [¬p(x) ∧ ¬q(x)]
4) The results for the logical equivalences in 3(a), (b), and (c) remain valid when all of
the universal quantifiers are replaced by existential quantifiers.
The results of Tables 2.21 and 2.22 and Examples 2.42 and 2.43 will now help us with
a very important concept. How do we negate quantified statements that involve a single
variable?
Consider the statement ∀x p(x). Its negation, namely ¬[∀x p(x)], can be stated
as “It is not the case that for all x, p(x) holds.” This is not a very useful remark, so we
consider ¬[∀x p(x)] further. When ¬[∀x p(x)] is true, then ∀x p(x) is false, and so for
some replacement a from the universe ¬p(a) is true and ∃x ¬p(x) is true. Conversely,
whenever the statement ∃x ¬p(x) is true we know that ¬p(b) is true for some member b of
the universe. Hence ∀x p(x) is false and ¬[∀x p(x)] is true. So the statement ¬[∀x p(x)]
is true if and only if the statement ∃x ¬p(x) is true. (Similar considerations also tell us that
¬[∀x p(x)] is false if and only if ∃x ¬p(x) is false.)
These observations lead to the following rule for negating the statement ∀x p(x):
¬[∀x p(x)] ⇔ ∃x ¬p(x).
In a similar way, Table 2.21 shows us that the statement ∃x p(x) is true (false) precisely
when the statement ∀x ¬p(x) is false (true). This observation then motivates a rule for
negating the statement ∃x p(x):
¬[∃x p(x)] ⇔ ∀x ¬p(x).
These two rules for negation, and two others that follow from them, are given in Table 2.23
for convenient reference.
Table 2.23 Rules for Negating Statements with One Quantifier
¬[∀x p(x)] ⇔ ∃x ¬p(x)
¬[∃x p(x)] ⇔ ∀x ¬p(x)
¬[∀x ¬p(x)] ⇔ ∃x ¬¬p(x) ⇔ ∃x p(x)
¬[∃x ¬p(x)] ⇔ ∀x ¬¬p(x) ⇔ ∀x p(x)
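The negation rules in Table 2.23 can be tested exhaustively on small finite universes. In the sketch below an open statement is encoded as a tuple of truth values, one per member of the universe; the encoding, the function name negation_rules_hold, and the choice of universe sizes 1 through 4 are assumptions made for illustration.

```python
from itertools import product


def negation_rules_hold(p):
    """Check the four rows of the table for one truth-value tuple p."""
    not_p = tuple(not v for v in p)
    return (
        (not all(p)) == any(not_p)        # ¬[∀x p(x)] ⇔ ∃x ¬p(x)
        and (not any(p)) == all(not_p)    # ¬[∃x p(x)] ⇔ ∀x ¬p(x)
        and (not all(not_p)) == any(p)    # ¬[∀x ¬p(x)] ⇔ ∃x p(x)
        and (not any(not_p)) == all(p)    # ¬[∃x ¬p(x)] ⇔ ∀x p(x)
    )


# Every open statement on universes with 1, 2, 3, and 4 members.
results = [
    negation_rules_hold(p)
    for size in range(1, 5)
    for p in product([False, True], repeat=size)
]

print(all(results), len(results))
```

The check covers 2 + 4 + 8 + 16 = 30 open statements, and all four rules hold for each of them.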
We use the rules for negating quantified statements in the following example.
EXAMPLE 2.44
Here we find the negation of two statements, where the universe comprises all of the integers.
1) Let p(x) and q(x) be given by
p(x): x is odd    q(x): x² - 1 is even.
The statement “If x is odd, then x² - 1 is even” can be symbolized as
∀x [p(x) → q(x)]. (This is a true statement.)
The negation of this statement is determined as follows:
¬[∀x (p(x) → q(x))] ⇔ ∃x [¬(p(x) → q(x))]
⇔ ∃x [¬(¬p(x) ∨ q(x))] ⇔ ∃x [¬¬p(x) ∧ ¬q(x)]
⇔ ∃x [p(x) ∧ ¬q(x)]
In words, the negation says, “There exists an integer x such that x is odd and
x² - 1 is odd (that is, not even).” (This statement is false.)
2) As in Example 2.42, let r(x) and s(x) be the open statements
r(x): 2x + 1 = 5    s(x): x² = 9.
The quantified statement ∃x [r(x) ∧ s(x)] is false because it asserts the existence
of at least one integer a such that 2a + 1 = 5 (a = 2) and a² = 9 (a = 3 or -3).
Consequently, its negation
¬[∃x (r(x) ∧ s(x))] ⇔ ∀x [¬(r(x) ∧ s(x))] ⇔ ∀x [¬r(x) ∨ ¬s(x)]
is true. This negation may be given in words as “For every integer x, 2x + 1 ≠ 5 or
x² ≠ 9.”
Because a mathematical statement may involve more than one quantifier, we continue
this section by offering some examples and making some observations on these types of
statements.
EXAMPLE 2.45
Here we have two real variables x, y, so the universe consists of all real numbers. The
commutative law for the addition of real numbers may be expressed by
∀x ∀y (x + y = y + x).
This statement may also be given as
∀y ∀x (x + y = y + x).
Likewise, in the case of the multiplication of real numbers, we may write
∀x ∀y (xy = yx) or ∀y ∀x (xy = yx).
These two examples suggest the following general result. If p(x, y) is an open statement
in the two variables x, y (with either a prescribed universe for both x and y or one prescribed
universe for x and a second for y), then the statements ∀x ∀y p(x, y) and ∀y ∀x p(x, y)
are logically equivalent; that is, the statement ∀x ∀y p(x, y) is true (respectively, false)
if and only if the statement ∀y ∀x p(x, y) is true (respectively, false). Hence
∀x ∀y p(x, y) ⇔ ∀y ∀x p(x, y).
EXAMPLE 2.46
When dealing with the associative law for the addition of real numbers, we find that for all
real numbers x, y, and z,
x + (y + z) = (x + y) + z.
Using universal quantifiers (with the universe of all real numbers), we may express this by
∀x ∀y ∀z [x + (y + z) = (x + y) + z] or ∀y ∀x ∀z [x + (y + z) = (x + y) + z].
In fact, there are 3! = 6 ways to order these three universal quantifiers, and all six of these
quantified statements are logically equivalent to one another.
This is actually true for all open statements p(x, y, z), and to shorten the notation, one
may write, for example,
∀x, y, z p(x, y, z) ⇔ ∀y, x, z p(x, y, z) ⇔ ∀x, z, y p(x, y, z),
describing the logical equivalence for three of the six statements.
In Examples 2.45 and 2.46 we encountered quantified statements with two and three
bound variables — each such variable bound by a universal quantifier. Our next example
examines a situation in which there are two bound variables — and this time each of these
variables is bound by an existential quantifier.
EXAMPLE 2.47
For the universe of all integers, consider the true statement “There exist integers x, y such
that x + y = 6.” We may represent this in symbolic form by
∃x ∃y (x + y = 6).
If we let p(x, y) denote the open statement “x + y = 6,” then an equivalent statement can
be given by ∃y ∃x p(x, y).
In general, for any open statement p(x, y) and universe(s) prescribed for the variables
x, y,
∃x ∃y p(x, y) ⇔ ∃y ∃x p(x, y).
Similar results follow for statements involving three or more such quantifiers.
When a statement involves both existential and universal quantifiers, however, we must
be careful about the order in which the quantifiers are written. Example 2.48 illustrates this
case.
EXAMPLE 2.48
We restrict ourselves here to the universe of all integers and let p(x, y) denote the open
statement “x + y = 17.”
1) The statement
∀x ∃y p(x, y)
says that “For every integer x, there exists an integer y such that x + y = 17.” (We
read the quantifiers from left to right.)
This statement is true; once we select any x, the integer y = 17 - x does exist
and x + y = x + (17 - x) = 17. But we realize that each value of x gives rise to a
different value of y.
2) Now consider the statement
∃y ∀x p(x, y).
This statement is read “There exists an integer y so that for all integers x, x + y =
17.” This statement is false. Once an integer y is selected, the only value that x can
have (and still satisfy x + y = 17) is 17 - y.
If the statement ∃y ∀x p(x, y) were true, then every integer (x) would equal
17 - y (for some one fixed y). This says, in effect, that all integers are equal!
Consequently, the statements ∀x ∃y p(x, y) and ∃y ∀x p(x, y) are generally not
logically equivalent.
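The effect of swapping the two quantifiers shows up clearly in nested all and any calls, where the outer call corresponds to the leftmost quantifier. The sketch below is an illustration under stated assumptions: the infinite universe of integers is replaced by two finite windows, xs for x and ys for y, with ys chosen large enough to contain 17 - x for every x in xs.

```python
xs = range(-5, 6)   # finite window of x values (assumption)
ys = range(0, 41)   # finite window of y values; contains 17 - x for each x in xs


def p(x, y):
    """The open statement p(x, y): x + y = 17."""
    return x + y == 17


# ∀x ∃y p(x, y): for each x a suitable y is sought, and y may depend on x.
forall_exists = all(any(p(x, y) for y in ys) for x in xs)

# ∃y ∀x p(x, y): one fixed y would have to work for every x.
exists_forall = any(all(p(x, y) for x in xs) for y in ys)

# The witness for each x in the first statement; it changes with x.
witness_for = {x: 17 - x for x in xs}

print(forall_exists, exists_forall, witness_for[-5], witness_for[5])
```

The first statement holds because the witness y = 17 - x is allowed to vary with x; the second fails because no single y serves all x at once.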
Translating mathematical statements — be they postulates, definitions, or theorems —
into symbolic form can be helpful for two important reasons.
1) Doing so forces us to be very careful and precise about the meanings of statements,
the meanings of phrases such as “For all x” and “There exists an x,” and the order in
which such phrases appear.
2) After we translate a mathematical statement into symbolic form, the rules we have
learned should then apply when we want to determine such related statements as the
negation or, if appropriate, the contrapositive, converse, or inverse.
Our last two examples illustrate this, and in so doing, extend the results in Table 2.23.
EXAMPLE 2.49
Let p(x, y), q(x, y), and r(x, y) represent three open statements, with replacements for
the variables x, y chosen from some prescribed universe(s). What is the negation of the
following statement?
∀x ∃y [(p(x, y) ∧ q(x, y)) → r(x, y)]
We find that
¬[∀x ∃y [(p(x, y) ∧ q(x, y)) → r(x, y)]]
⇔ ∃x [¬∃y [(p(x, y) ∧ q(x, y)) → r(x, y)]]
⇔ ∃x ∀y ¬[(p(x, y) ∧ q(x, y)) → r(x, y)]
⇔ ∃x ∀y ¬[¬[p(x, y) ∧ q(x, y)] ∨ r(x, y)]
⇔ ∃x ∀y [¬¬[p(x, y) ∧ q(x, y)] ∧ ¬r(x, y)]
⇔ ∃x ∀y [(p(x, y) ∧ q(x, y)) ∧ ¬r(x, y)].
Now suppose that we are trying to establish the validity of an argument (or a mathematical
theorem) for which
∀x ∃y [(p(x, y) ∧ q(x, y)) → r(x, y)]
is the conclusion. Should we want to try to prove the result by the method of Proof by
Contradiction, we would assume as an additional premise the negation of this conclusion.
Consequently, our additional premise would be the statement
∃x ∀y [(p(x, y) ∧ q(x, y)) ∧ ¬r(x, y)].
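The first and last lines of the derivation in Example 2.49 can be compared by brute force on a tiny universe. The sketch below assumes the two-element universe {0, 1} for both x and y and represents each open statement as a dictionary from pairs (x, y) to truth values; these are choices made for illustration. All 16 × 16 × 16 = 4096 combinations of p, q, r are tried.

```python
from itertools import product

U = [0, 1]
pairs = [(x, y) for x in U for y in U]
# Every open statement in two variables on U, as a dict {(x, y): truth value}.
tables = [dict(zip(pairs, values))
          for values in product([False, True], repeat=len(pairs))]


def negated_original(p, q, r):
    """¬[∀x ∃y [(p ∧ q) → r]], with a → b written as (not a) or b."""
    return not all(
        any((not (p[x, y] and q[x, y])) or r[x, y] for y in U) for x in U
    )


def simplified(p, q, r):
    """∃x ∀y [(p ∧ q) ∧ ¬r], the last line of the derivation."""
    return any(
        all(p[x, y] and q[x, y] and not r[x, y] for y in U) for x in U
    )


agree = all(
    negated_original(p, q, r) == simplified(p, q, r)
    for p in tables for q in tables for r in tables
)

print(agree, len(tables) ** 3)
```

Agreement on every combination is consistent with the derivation; it does not replace the step-by-step argument, which applies to any universe.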
Finally, we consider how to negate the definition of limit, a fundamental concept in
calculus.
EXAMPLE 2.50
In calculus, one studies the properties of real-valued functions of a real variable. (Functions
will be examined in Chapter 5 of this text.) Among these properties is the existence of limits,
and one finds the following definition: Let I be an open interval† containing the real number
a and suppose the function f is defined throughout I, except possibly at a. We say that f
has the limit L as x approaches a, and write lim_{x→a} f(x) = L, if (and only if) for every
ε > 0 there exists a δ > 0 so that, for all x in I, (0 < |x - a| < δ) → (|f(x) - L| < ε). This
can be expressed in symbolic form as
lim_{x→a} f(x) = L ⇔ ∀ε > 0 ∃δ > 0 ∀x [(0 < |x - a| < δ) → (|f(x) - L| < ε)].
†The concept of an open interval is defined at the end of Section 3.1.
[Here the universe comprises the real numbers in the open interval I, except possibly a.
Also, the quantifiers ∀ε > 0 and ∃δ > 0 now contain some restrictive information.] Then,
to negate this definition, we do the following (in which certain steps have been combined):
lim_{x→a} f(x) ≠ L
⇔ ¬[∀ε > 0 ∃δ > 0 ∀x [(0 < |x - a| < δ) → (|f(x) - L| < ε)]]
⇔ ∃ε > 0 ∀δ > 0 ∃x ¬[(0 < |x - a| < δ) → (|f(x) - L| < ε)]
⇔ ∃ε > 0 ∀δ > 0 ∃x ¬[¬(0 < |x - a| < δ) ∨ (|f(x) - L| < ε)]
⇔ ∃ε > 0 ∀δ > 0 ∃x [¬¬(0 < |x - a| < δ) ∧ ¬(|f(x) - L| < ε)]
⇔ ∃ε > 0 ∀δ > 0 ∃x [(0 < |x - a| < δ) ∧ (|f(x) - L| ≥ ε)]
Translating into words, we find that lim_{x→a} f(x) ≠ L if (and only if) there exists a
positive (real) number ε such that for every positive (real) number δ, there is an x in I such
that 0 < |x - a| < δ (that is, x ≠ a and its distance from a is less than δ) but |f(x) - L| ≥ ε
[that is, the value of f(x) differs from L by at least ε].
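The negated definition can be made concrete with a small numerical sketch. Everything specific below is an assumption chosen for illustration and does not come from the text: the step function f, the point a = 0, the candidate limit L = 0, the value ε = 1/2, and the witness x = a + δ/2. The code checks only a few sample values of δ; the reason the witness works for every δ > 0 is that a + δ/2 always lies strictly between a and a + δ, where f takes the value 1.

```python
def f(x):
    """A step function: 1 for positive x, 0 otherwise (illustrative choice)."""
    return 1.0 if x > 0 else 0.0


a, L = 0.0, 0.0
epsilon = 0.5  # one fixed ε intended to work against every δ


def witness(delta):
    """An x with 0 < |x - a| < δ for which |f(x) - L| ≥ ε."""
    return a + delta / 2


deltas = [1.0, 0.1, 1e-3, 1e-9]  # sample values of δ
checks = [
    0 < abs(witness(d) - a) < d and abs(f(witness(d)) - L) >= epsilon
    for d in deltas
]

print(checks)
```

Each entry of checks is True, matching the pattern ∃ε > 0 ∀δ > 0 ∃x: the same ε is used throughout, while the witness x is allowed to depend on δ.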
EXERCISES 2.4
1. Let p(x), q(x) denote the following open statements.
p(x): x ≤ 3    q(x): x + 1 is odd
If the universe consists of all integers, what are the truth values
of the following statements?
a) q(1)    b) ¬p(3)    c) p(7) ∨ q(7)
d) p(3) ∧ q(4)    e) ¬(p(-4) ∨ q(-3))
f) ¬p(-4) ∧ ¬q(-3)
2. Let p(x), q(x) be defined as in Exercise 1. Let r(x) be the
open statement “x > 0.” Once again the universe comprises all
integers.
a) Determine the truth values of the following statements.
i) p(3) ∨ [q(3) ∨ ¬r(3)]
ii) p(2) → [q(2) → r(2)]
iii) [p(2) ∧ q(2)] → r(2)
iv) p(0) → [¬q(-1) ↔ r(1)]
b) Determine all values of x for which
[p(x) ∧ q(x)] ∧ r(x) results in a true statement.
3. Let p(x) be the open statement “x² = 2x,” where the
universe comprises all integers. Determine whether each of
the following statements is true or false.
a) p(0)    b) p(1)    c) p(2)
d) p(-2)    e) ∃x p(x)    f) ∀x p(x)
4. Consider the universe of all polygons with three or four
sides, and define the following open statements for this universe.
a(x): all interior angles of x are equal
e(x): x is an equilateral triangle
h(x): all sides of x are equal
i(x): x is an isosceles triangle
p(x): x has an interior angle that exceeds 180°
q(x): x is a quadrilateral
r(x): x is a rectangle
s(x): x is a square
t(x): x is a triangle
Translate each of the following statements into an English sentence,
and determine whether the statement is true or false.
a) ∀x [q(x) ⊻ t(x)]    b) ∀x [i(x) → e(x)]
c) ∃x [t(x) ∧ p(x)]    d) ∀x [(a(x) ∧ t(x)) ↔ e(x)]
e) ∃x [q(x) ∧ ¬r(x)]    f) ∃x [r(x) ∧ ¬s(x)]
g) ∀x [h(x) → e(x)]    h) ∀x [t(x) → ¬p(x)]
i) ∀x [s(x) ↔ (a(x) ∧ h(x))]
j) ∀x [t(x) → (a(x) ↔ h(x))]
5. Professor Carlson’s class in mechanics is comprised of 29
students of which exactly
1) three physics majors are juniors;
2) two electrical engineering majors are juniors;
3) four mathematics majors are juniors;
4) twelve physics majors are seniors;
5) four electrical engineering majors are seniors;
6) two electrical engineering majors are graduate students;
7) two mathematics majors are graduate students.
Consider the following open statements.
c(x): Student x is in the class (that is,
Professor Carlson’s mechanics class
as already described).
j(x): Student x is a junior.
s(x): Student x is a senior.
g(x): Student x is a graduate student.
p(x): Student x is a physics major.
e(x): Student x is an electrical engineering major.
m(x): Student x is a mathematics major.
Write each of the following statements in terms of quantifiers
and the open statements c(x), j(x), s(x), g(x), p(x), e(x), and
m(x), and determine whether the given statement is true or false.
Here the universe comprises all of the 12,500 students enrolled
at the university where Professor Carlson teaches. Furthermore,
at this university each student has only one major.
a) There is a mathematics major in the class who is a junior.
b) There is a senior in the class who is not a mathematics
major.
c) Every student in the class is majoring in mathematics or
physics.
d) No graduate student in the class is a physics major.
e) Every senior in the class is majoring in either physics or
electrical engineering.
6. Let p(x, y), q(x, y) denote the following open statements.
p(x, y): x² ≥ y    q(x, y): x + 2 < y
If the universe for each of x, y consists of all real numbers,
determine the truth value for each of the following statements.
a) p(2, 4)    b) q(, z)
e) p(2, 2) → q(1, 1)    f) p(1, 2) ↔ ¬q(1, 2)
7. For the universe of all integers, let p(x), q(x), r(x), s(x),
and t(x) be the following open statements.
p(x): x > 0
q(x): x is even
r(x): x is a perfect square
s(x): x is (exactly) divisible by 4
t(x): x is (exactly) divisible by 5
a) Write the following statements in symbolic form.
i) At least one integer is even.
ii) There exists a positive integer that is even.
iii) If x is even, then x is not divisible by 5.
iv) No even integer is divisible by 5.
v) There exists an even integer divisible by 5.
vi) If x is even and x is a perfect square, then x is
divisible by 4.
b) Determine whether each of the six statements in
part (a) is true or false. For each false statement, provide a
counterexample.
c) Express each of the following symbolic representations
in words.
i) ∀x [r(x) → p(x)]    ii) ∀x [s(x) → q(x)]
iii) ∀x [s(x) → ¬t(x)]    iv) ∃x [s(x) ∧ ¬r(x)]
d) Provide a counterexample for each false statement in
part (c).
8. Let p(x), q(x), and r(x) denote the following open
statements.
p(x): x² - 8x + 15 = 0
q(x): x is odd
r(x): x > 0
For the universe of all integers, determine the truth or falsity of
each of the following statements. If a statement is false, give a
counterexample.
a) ∀x [p(x) → q(x)]    b) ∀x [q(x) → p(x)]
c) ∃x [p(x) → q(x)]    d) ∃x [q(x) → p(x)]
e) ∃x [r(x) → p(x)]    f) ∀x [¬q(x) → ¬p(x)]
g) ∃x [p(x) → (q(x) ∧ r(x))]
h) ∀x [(p(x) ∨ q(x)) → r(x)]
9. Let p(x), q(x), and r(x) be the following open statements.
p(x): x² - 7x + 10 = 0
q(x): x² - 2x - 3 = 0
r(x): x < 0
a) Determine the truth or falsity of the following statements,
where the universe is all integers. If a statement is
false, provide a counterexample or explanation.
i) ∀x [p(x) → ¬r(x)]    ii) ∀x [q(x) → r(x)]
iii) ∃x [q(x) → r(x)]    iv) ∃x [p(x) → r(x)]
b) Find the answers to part (a) when the universe consists
of all positive integers.
c) Find the answers to part (a) when the universe contains
only the integers 2 and 5.
10. For the following program segment, m and n are integer
variables. The variable A is a two-dimensional array A[1, 1],
A[1, 2], ..., A[1, 20], ..., A[10, 1], ..., A[10, 20], with 10
rows (indexed from 1 to 10) and 20 columns (indexed from 1
to 20).
for m := 1 to 10 do
    for n := 1 to 20 do
        A[m, n] := m + 3 * n
Write the following statements in symbolic form. (The universe
for the variable m contains only the integers from 1 to 10 inclusive;
for n the universe consists of the integers from 1 to 20
inclusive.)
a) All entries of A are positive.
b) All entries of A are positive and less than or equal to 70.
c) Some of the entries of A exceed 60.
d) The entries in each row of A are sorted into (strictly)
ascending order.
e) The entries in each column of A are sorted into (strictly)
ascending order.
f) The entries in the first three rows of A are distinct.
11. Identify the bound variables and the free variables in each
of the following expressions (or statements). In both cases the
universe comprises all real numbers.
a) ∀y ∃z [cos(x + y) = sin(z - x)]
b) ∃x ∃y [x² - y² = z]
12. a) Let p(x, y) denote the open statement “x divides y,”
where the universe for each of the variables x, y comprises
all integers. (In this context “divides” means “exactly divides”
or “divides evenly.”) Determine the truth value of
each of the following statements; if a quantified statement
is false, provide an explanation or a counterexample.
i) p(3, 7)    ii) p(3, 27)
iii) ∀y p(1, y)    iv) ∀x p(x, 9)
v) ∀x p(x, x)    vi) ∀y ∃x p(x, y)
vii) ∃y ∀x p(x, y)
viii) ∀x ∀y [(p(x, y) ∧ p(y, x)) → (x = y)]
b) Determine which of the eight statements in part (a) will
change in truth value if the universe for each of the variables
x, y were restricted to just the positive integers.
c) Determine the truth value of each of the following statements.
If the statement is false, provide an explanation or
a counterexample. [The universe for each of x, y is as in
part (b).]
i) ∀x ∃y p(x, y)    ii) ∀y ∃x p(x, y)
iii) ∃x ∀y p(x, y)    iv) ∃y ∀x p(x, y)
13. Suppose that p(x, y) is an open statement where the universe
for each of x, y consists of only three integers: 2, 3, 5.
Then the quantified statement ∃y p(2, y) is logically equivalent
to p(2, 2) ∨ p(2, 3) ∨ p(2, 5). The quantified statement
∃x ∀y p(x, y) is logically equivalent to [p(2, 2) ∧ p(2, 3) ∧
p(2, 5)] ∨ [p(3, 2) ∧ p(3, 3) ∧ p(3, 5)] ∨ [p(5, 2) ∧ p(5, 3)
∧ p(5, 5)]. Use conjunctions and/or disjunctions to express the
following statements without quantifiers.
a) ∀x p(x, 3)    b) ∃x ∃y p(x, y)    c) ∀y ∃x p(x, y)
14. Let p(n), q(n) represent the open statements
p(n): n is odd    q(n): n² is odd
for the universe of all integers. Which of the following statements
are logically equivalent to each other?
a) If the square of an integer is odd, then the integer is odd.
b) ∀n [p(n) is necessary for q(n)]
c) The square of an odd integer is odd.
d) There are some integers whose squares are odd.
e) Given an integer whose square is odd, that integer is
likewise odd.
f) ∀n [¬p(n) → ¬q(n)]
g) ∀n [p(n) is sufficient for q(n)]
15. For each of the following pairs of statements determine
whether the proposed negation is correct. If correct, determine
which is true: the original statement or the proposed negation.
If the proposed negation is wrong, write a correct version of the
negation and then determine whether the original statement or
your corrected version of the negation is true.
a) Statement: For all real numbers x, y, if x² > y², then
x > y.
Proposed negation: There exist real numbers x, y such that
x² > y² but x ≤ y.
b) Statement: There exist real numbers x, y such that x and
y are rational but x + y is irrational.
Proposed negation: For all real numbers x, y, if x + y is
rational, then each of x, y is rational.
c) Statement: For all real numbers x, if x is not 0, then x
has a multiplicative inverse.
Proposed negation: There exists a nonzero real number that
does not have a multiplicative inverse.
d) Statement: There exist odd integers whose product is
odd.
Proposed negation: The product of any two odd integers is
odd.
16. Write the negation of each of the following statements as
an English sentence, without symbolic notation. (Here the
universe consists of all the students at the university where
Professor Lenhart teaches.)
a) Every student in Professor Lenhart’s C++ class is
majoring in computer science or mathematics.
b) At least one student in Professor Lenhart’s C++ class is
a history major.
17. Write the negation of each of the following true statements.
For parts (a) and (b) the universe consists of all integers; for
parts (c) and (d) the universe comprises all real numbers.
a) For all integers n, if n is not (exactly) divisible by 2,
then n is odd.
b) If k, m, n are any integers where k - m and m - n are
odd, then k - n is even.
c) If x is a real number where x² > 16, then x < -4 or
x > 4.
d) For all real numbers x, if |x - 3| < 7, then -4 < x < 10.
18. Negate and simplify each of the following.
a) ∃x [p(x) ∨ q(x)]    b) ∀x [p(x) ∧ ¬q(x)]
c) ∀x [p(x) → q(x)]
d) ∃x [(p(x) ∨ q(x)) → r(x)]
19. For each of the following statements state the converse,
inverse, and contrapositive. Also determine the truth value for
each given statement, as well as the truth values for its converse,
inverse, and contrapositive. (Here “divides” means “exactly divides.”)
a) [The universe comprises all positive integers.]
If m > n, then m² > n².
b) [The universe comprises all integers.]
If a > b, then a² > b².
c) [The universe comprises all integers.]
If m divides n and n divides p, then m divides p.
d) [The universe consists of all real numbers.]
∀x [(x > 3) → (x² > 9)]
e) [The universe consists of all real numbers.]
For all real numbers x, if x² + 4x − 21 > 0, then x > 3 or x < −7.
20. Rewrite each of the following statements in the if-then form. Then write the converse, inverse, and contrapositive of your implication. For each result in parts (a) and (c) give the truth value for the implication and the truth values for its converse, inverse, and contrapositive. [In part (a) “divisibility” requires a remainder of 0.]
a) [The universe comprises all positive integers.]
Divisibility by 21 is a sufficient condition for divisibility by 7.
b) [The universe comprises all snakes presently slithering about the jungles of Asia.]
Being a cobra is a sufficient condition for a snake to be dangerous.
c) [The universe consists of all complex numbers.]
For every complex number z, z being real is necessary for z² to be real.
21. For the following statements the universe comprises all nonzero integers. Determine the truth value of each statement.
a) ∃x ∃y [xy = 1]    b) ∃x ∀y [xy = 1]
c) ∀x ∃y [xy = 1]
d) ∃x ∃y [(2x + y = 5) ∧ (x − 3y = −8)]
e) ∃x ∃y [(3x − y = 7) ∧ (2x + 4y = 3)]
22. Answer Exercise 21 for the universe of all nonzero real numbers.
23. In the arithmetic of real numbers, there is a real number, namely 0, called the identity of addition because a + 0 = 0 + a = a for every real number a. This may be expressed in symbolic form by
∃z ∀a [a + z = z + a = a].
(We agree that the universe comprises all real numbers.)
a) In conjunction with the existence of an additive identity is the existence of additive inverses. Write a quantified statement that expresses “Every real number has an additive inverse.” (Do not use the minus sign anywhere in your statement.)
b) Write a quantified statement dealing with the existence of a multiplicative identity for the arithmetic of real numbers.
c) Write a quantified statement covering the existence of multiplicative inverses for the nonzero real numbers. (Do not use the exponent −1 anywhere in your statement.)
d) Do the results in parts (b) and (c) change in any way when the universe is restricted to the integers?
24. Consider the quantified statement ∀x ∃y [x + y = 17]. Determine whether this statement is true or false for each of the following universes: (a) the integers; (b) the positive integers; (c) the integers for x, the positive integers for y; (d) the positive integers for x, the integers for y.
25. Let the universe for the variables in the following statements consist of all real numbers. In each case negate and simplify the given statement.
a) ∀x ∀y [(x > y) → (x − y > 0)]
b) ∀x ∀y [(x < y) → ∃z (x < z < y)]
c) ∀x ∀y [(|x| = |y|) → (y = ±x)]
26. In calculus the definition of the limit L of a sequence of real numbers r_1, r_2, r_3, ... can be given as
lim_{n→∞} r_n = L
if (and only if) for every ε > 0 there exists a positive integer k so that for all integers n, if n > k then |r_n − L| < ε.
In symbolic form this can be expressed as
lim_{n→∞} r_n = L ⟺ ∀ε > 0 ∃k > 0 ∀n [(n > k) → |r_n − L| < ε].
Express lim_{n→∞} r_n ≠ L in symbolic form.
2.5 Quantifiers, Definitions, and the Proofs of Theorems
In this section we shall combine some of the ideas we have already studied in the prior two
sections. Although Section 2.3 introduced rules and methods for establishing the validity
of an argument, unfortunately the arguments presented there seemed to have little to do
with anything mathematical. [The rare exceptions are in Example 2.23 and the erroneous
argument in part (b) of the material preceding Example 2.26.] Most of the arguments dealt
with certain individuals and predicaments they were either in or about to face.
But now that we have learned some of the properties of quantifiers and quantified state-
ments, we are better equipped to handle arguments that will help us to prove mathematical
theorems. Before dealing with theorems, however, we shall consider how mathematical
definitions are traditionally presented in scientific writing.
Following Example 2.3 in Section 2.1, the discussion concerned how an implication
might be used in place of a biconditional in everyday conversation. But in scientific writing,
it was noted, we should avoid any and all situations where an ambiguous interpretation
might come about — in particular, an implication should not be used when a biconditional
is intended. However, there is one major exception to that rule and it concerns the way that
mathematical definitions are traditionally presented in mathematics textbooks and other
scientific literature. Example 2.51 demonstrates this exception.
EXAMPLE 2.51
a) Let us start with the universe of all quadrilaterals in the plane and try to identify those
that are called rectangles.
One person might say that
“If a quadrilateral is a rectangle then it has four equal angles.”
Another individual might identify these special quadrilaterals by observing that
“If a quadrilateral has four equal angles, then it is a rectangle.”
(Here both people are making implicitly quantified statements, where the quantifier is
universal.)
Given the open statements
p(x): x is a rectangle    q(x): x has four equal angles,
we can express what the first person says as
∀x [p(x) → q(x)],
while for the second person we would write
∀x [q(x) → p(x)].
So which of the preceding (quantified) statements identifies or defines a rectangle?
Perhaps we feel that they both do. But how can that be, since one statement is the
converse of the other and, in general, the converse of an implication is not logically
equivalent to the implication?
Here the reader must consider what is intended — not just what each of the two
people has said, or the symbolic expressions we have written to represent these state-
ments. In this situation each person is using an implication with the meaning of a
biconditional. They are both intending (though not stating)
∀x [p(x) ↔ q(x)],
— that is, each is really telling us that
“A quadrilateral is a rectangle if and only if it has four equal angles.”
b) Within the universe of all integers we can distinguish the even integers by means of a
certain property and so we may define them as follows:
For every integer n we call n even if it is divisible by 2.
(By the expression “divisible by 2” we mean “exactly divisible by 2”; that is, there
is no remainder upon division of the dividend n by the divisor 2.)
If we consider the open statements
p(n): n is an even integer    q(n): n is divisible by 2,
then it appears that the preceding definition may be written symbolically as
∀n [q(n) → p(n)].
After all, the given quantified statement (in the preceding definition) is an implication.
However, the situation here is similar to that given in part (a). What appears to be
stated is not what is intended. The intention is for the reader to interpret the given
definition as
∀n [q(n) ↔ p(n)],
that is,
“For every integer n, we call n even if and only if n is divisible by 2.”
(Note that the open statement “n is divisible by 2” can also be expressed by the open
statement “n = 2k, for some integer k.” Don’t be misled here by the presence of
the quantifier “for some integer k”; the expression ∃k [n = 2k] is still an open
statement because n remains a free variable.)
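The gap between the stated implication and the intended biconditional can be seen in a small experiment. The sketch below is only an illustration under assumptions: the finite window `UNIVERSE` stands in for the integers, and `p_too_generous` is a hypothetical misreading of "even" that is not part of the text. The misreading satisfies the implication ∀n [q(n) → p(n)] yet fails the biconditional, which is why the definition must be read as "if and only if."

```python
def implies(a: bool, b: bool) -> bool:
    """Truth value of the implication a -> b."""
    return (not a) or b

UNIVERSE = range(-10, 11)  # finite stand-in for the integers (assumption)

def q(n: int) -> bool:
    """q(n): n is divisible by 2."""
    return n % 2 == 0

def p_intended(n: int) -> bool:
    """p(n) as the definition intends: exactly the integers divisible by 2."""
    return n % 2 == 0

def p_too_generous(n: int) -> bool:
    """Hypothetical misreading that calls every integer even."""
    return True

def implication_holds(p) -> bool:
    """Does q(n) -> p(n) hold for every n in the window?"""
    return all(implies(q(n), p(n)) for n in UNIVERSE)

def biconditional_holds(p) -> bool:
    """Does q(n) <-> p(n) hold for every n in the window?"""
    return all(q(n) == p(n) for n in UNIVERSE)
```

Calling `implication_holds(p_too_generous)` succeeds while `biconditional_holds(p_too_generous)` does not, so the implication alone does not single out the even integers.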
So now we see how quantifiers may enter into the way we state mathematical defini-
tions — and that the traditional way in which such a definition appears is as an implication.
But beware and remember: It is only in definitions that an implication can be (mis)read and
correctly interpreted as a biconditional.
Note how we defined the limit concept in Example 2.50. There we wrote “if (and only
if)” since we wanted to let the reader know our intention. Now we are free to replace “if
(and only if)” by simply “if.”
Having settled our discussion on the nature of mathematical definitions, we continue
now with an investigation of arguments involving quantified statements.
EXAMPLE 2.52
Suppose that we start with the universe that comprises only the 13 integers 2, 4, 6, 8, ...,
24, 26. Then we can establish the statement:
For all n (meaning n = 2, 4, 6,..., 26),
we can write n as the sum of at most three perfect squares.
The results in Table 2.24 provide a case-by-case verification showing the given (quanti-
fied) statement to be true. (We might call this statement a theorem.)
Table 2.24
2 = 1 + 1         10 = 9 + 1          20 = 16 + 4
4 = 4             12 = 4 + 4 + 4      22 = 9 + 9 + 4
6 = 4 + 1 + 1     14 = 9 + 4 + 1      24 = 16 + 4 + 4
8 = 4 + 4         16 = 16             26 = 25 + 1
                  18 = 16 + 1 + 1
This exhaustive listing is an example of a proof using the technique we call, rather
appropriately, the method of exhaustion. This method is reasonable when we are dealing
with a fairly small universe. If we are confronted with a situation in which the universe
is larger but within the range of a computer that is available to us, then we might write a
program to check all of the individual cases.
(Note that for certain cases in Table 2.24 more than one answer may be possible. For
example, we could have written 18 = 9 + 9 and 26 = 16 +9 + 1. But this is all right. We
were told that each positive even integer less than or equal to 26 could be written as the
sum of one, two, or three perfect squares. We were not told that each such representation
had to be unique, so more than one possibility could occur. What we had to check in each
case was that there was at least one possibility.)
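The remark about letting a computer check the individual cases can be made concrete. The following is a minimal sketch, not a prescribed program: the names `square_decompositions`, `UNIVERSE`, and `results` are illustrative assumptions. It lists, for each of the 13 integers in the universe, every way of writing it as a sum of one, two, or three positive perfect squares, and the statement holds when every list is nonempty.

```python
from itertools import combinations_with_replacement
from math import isqrt

def square_decompositions(n: int, max_terms: int = 3):
    """Every way to write n as a sum of 1..max_terms positive perfect squares."""
    squares = [k * k for k in range(1, isqrt(n) + 1)]
    found = []
    for terms in range(1, max_terms + 1):
        for combo in combinations_with_replacement(squares, terms):
            if sum(combo) == n:
                # Store the summands largest first, as in Table 2.24.
                found.append(tuple(sorted(combo, reverse=True)))
    return found

UNIVERSE = range(2, 27, 2)  # the 13 integers 2, 4, 6, ..., 26
results = {n: square_decompositions(n) for n in UNIVERSE}

# The quantified statement is true when each n has at least one decomposition.
statement_holds = all(len(results[n]) > 0 for n in UNIVERSE)
```

Inspecting `results[18]` or `results[26]` shows more than one decomposition, in line with the observation that the representations need not be unique.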
In the previous example we mentioned the word theorem. We also found this term used in
Chapter 1, for example in results like the binomial theorem and the multinomial theorem
where we were introduced to certain types of enumeration problems. Without getting overly
technical, we shall consider theorems to be statements of mathematical interest, statements
that are known to be true. Sometimes the term theorem is used only to describe major
results that have many and varied consequences. Certain of these consequences that follow
rather immediately from a theorem are termed corollaries (as in the case of Corollary 1.1
in Section 1.3). In this text, however, we shall not be so particular in our use of the word
theorem.
Example 2.52 is a nice starting point to examine the proof of a quantified statement.
Unfortunately, a great number of mathematical statements and theorems often deal with
universes that do not lend themselves to the method of exhaustion. When faced with es-
tablishing or proving a result for all integers, for example, or for all real numbers, then we
cannot use a case-by-case method like the one in Example 2.52. So what can we do?
We start by considering the following rule.
The Rule of Universal Specification: If an open statement becomes true for all
replacements by the members in a given universe, then that open statement is true for
each specific individual member in that universe. (A bit more symbolically: if p(x)
is an open statement for a given universe, and if ∀x p(x) is true, then p(a) is true for
each a in the universe.)
This rule indicates that the truth of an open statement in one particular instance follows
(as a special case) from the more general (for the entire universe) truth of that universally
quantified open statement. The following examples will show us how to apply this idea.
EXAMPLE 2.53
a) For the universe of all people, consider the open statements
m(x): x is a mathematics professor    c(x): x has studied calculus.
Now consider the following argument.
All mathematics professors have studied calculus.
Leona is a mathematics professor.
Therefore Leona has studied calculus.
If we let l represent this particular woman (in our universe) named Leona, then we
can rewrite this argument in symbolic form as
∀x [m(x) → c(x)]
m(l)
----------------
∴ c(l)
Here the two statements above the line are the premises of the argument, and the
statement c(/) below the line is its conclusion. This is comparable to what we saw in
Section 2.3, except now we have a premise that is a universally quantified statement.
As was the case in Section 2.3, the premises are all assumed to be true and we must
try to establish that the conclusion is also true under these circumstances. Now, to
establish the validity of the given argument, we proceed as follows.
Steps                    Reasons
1) ∀x [m(x) → c(x)]      Premise
2) m(l)                  Premise
3) m(l) → c(l)           Step (1) and the Rule of Universal Specification
4) ∴ c(l)                Steps (2) and (3) and the Rule of Detachment
Note that the statements in steps (2) and (3) are not quantified statements. They are
the types of statements we studied earlier in the chapter. In particular, we can apply
the rules of inference we learned in Section 2.3 to these two statements to deduce the
conclusion in step (4).
We see here that the Rule of Universal Specification enables us to take a universally
quantified premise and deduce from it an ordinary statement (that is, one that is not
quantified). This (ordinary) statement, namely m(l) → c(l), is one specific true
instance of the universally quantified true premise ∀x [m(x) → c(x)].
b) For an example of a more mathematical nature let us consider the universe of all
triangles in the plane in conjunction with the open statements
p(t): t has two sides of equal length.
q(t): t is an isosceles triangle.
r(t): t has two angles of equal measure.
Let us also focus our attention on one specific triangle with no two angles of equal
measure. This triangle will be called triangle XYZ and will be designated by c. Then
we find that the argument
In triangle XYZ there is no pair of angles of equal measure.            ¬r(c)
If a triangle has two sides of equal length, then it is isosceles.      ∀t [p(t) → q(t)]
If a triangle is isosceles, then it has two angles of equal measure.    ∀t [q(t) → r(t)]
------------------------------------------------------------------------------------
Therefore triangle XYZ has no two sides of equal length.                ∴ ¬p(c)
is a valid one, as evidenced by the following.
Steps                    Reasons
1) ∀t [p(t) → q(t)]      Premise
2) p(c) → q(c)           Step (1) and the Rule of Universal Specification
3) ∀t [q(t) → r(t)]      Premise
4) q(c) → r(c)           Step (3) and the Rule of Universal Specification
5) p(c) → r(c)           Steps (2) and (4) and the Law of the Syllogism
6) ¬r(c)                 Premise
7) ∴ ¬p(c)               Steps (5) and (6) and Modus Tollens
Once again we see how the Rule of Universal Specification helps us. Here it has
taken the universally quantified statements at steps (1) and (3) and has provided us
with the (ordinary) statements at steps (2) and (4), respectively. Then at this point we
were able to apply the rules of inference we learned in Section 2.3 (namely, the Law
of the Syllogism and Modus Tollens) to derive the conclusion ¬p(c) in step (7).
c) Now for one last argument to drive the point home! Here we’ll consider the universe
to be made up of the entire student body at a particular college. One specific student,
Mary Gusberti, will be designated by m.
For this universe and the open statements
j(x): x is a junior    s(x): x is a senior
p(x): x is enrolled in a physical education class
we consider the argument:
No junior or senior is enrolled in a physical education class.
Mary Gusberti is enrolled in a physical education class.
Therefore Mary Gusberti is not a senior.
In symbolic form this argument becomes
∀x [(j(x) ∨ s(x)) → ¬p(x)]
p(m)
----------------
∴ ¬s(m)
Now the following steps (and reasons) establish the validity of this argument.
Steps                            Reasons
1) ∀x [(j(x) ∨ s(x)) → ¬p(x)]    Premise
2) p(m)                          Premise
3) (j(m) ∨ s(m)) → ¬p(m)         Step (1) and the Rule of Universal Specification
4) p(m) → ¬(j(m) ∨ s(m))         Step (3), (q → t) ⟺ (¬t → ¬q), and the Law of Double Negation
5) p(m) → (¬j(m) ∧ ¬s(m))        Step (4) and DeMorgan’s Law
6) ¬j(m) ∧ ¬s(m)                 Steps (2) and (5) and the Rule of Detachment (or Modus Ponens)
7) ∴ ¬s(m)                       Step (6) and the Rule of Conjunctive Simplification
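Once the Rule of Universal Specification has produced the ordinary statement in step (3), the rest of the argument is purely propositional, so it can be double-checked with a truth table. The sketch below is one way to do that; the function name `mary_argument_is_valid` is an illustrative assumption, and the booleans `j`, `s`, `p` stand for the truth values of j(m), s(m), p(m).

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Truth value of the implication a -> b."""
    return (not a) or b

def mary_argument_is_valid() -> bool:
    """True when no truth assignment makes steps (2), (3) true and step (7) false."""
    for j, s, p in product([False, True], repeat=3):
        step_3 = implies(j or s, not p)  # (j(m) v s(m)) -> not p(m)
        step_2 = p                       # p(m)
        conclusion = not s               # not s(m)
        if step_3 and step_2 and not conclusion:
            return False
    return True
```

The check covers only the specified instance for m; the passage from the quantified premise to that instance is what the Rule of Universal Specification supplies.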
In Example 2.53 we have had our first opportunity to apply the Rule of Universal Speci-
fication. Using the rule in conjunction with Modus Ponens (or the Rule of Detachment) and
Modus Tollens, we are able to state the following corresponding analogs, each of which
involves a universally quantified premise. In either case we consider a fixed universe that
includes a specific member c and make use of the open statements p(x), q(x) defined for
this universe.
(1) ∀x [p(x) → q(x)]        (2) ∀x [p(x) → q(x)]
    p(c)                        ¬q(c)
    ----------------            ----------------
    ∴ q(c)                      ∴ ¬p(c)
These two valid arguments are presented here for the same reason we presented them for the
rules of inference — Modus Ponens and Modus Tollens — in Section 2.3 (in the discussion
between Examples 2.25 and 2.26). We want to examine some possible errors that may arise
when the results in (1) and (2) are not used correctly.
Let us start with the universe of all polygons in the plane. Within this universe we shall
let c denote one specific polygon, the quadrilateral EFGH, where the measure of angle
E is 91°. For the open statements
p(x): x is a square q(x): x has four sides,
the following argument is invalid.
(1’) All squares have four sides.
Quadrilateral EFGH has four sides.
Therefore quadrilateral EFGH is a square.
In symbolic form this argument translates into
(1”) ∀x [p(x) → q(x)]
     q(c)
     ----------------
     ∴ p(c)
Unfortunately, although the premises are true, the conclusion is false. (For a square has no
angle of measure 91°.) We admit that there might be some confusion between this argument
and the valid one in (1) above. For when we apply the Rule of Universal Specification to
the quantified premise in (1”), in this instance we arrive at the invalid argument
p(c) > gc)
q(c)
. pc)
And here, as in Section 2.3, the error in reasoning lies in our attempt to argue by the converse.
A second invalid argument — from the misuse of argument (2) above —can also be
given, as shown in the following.
(2’) All squares have four sides.
Quadrilateral EFGH is not a square.
Therefore quadrilateral EFGH does not have four sides.
Translating (2’) into symbolic form results in
(2”) ∀x [p(x) → q(x)]
     ¬p(c)
     ----------------
     ∴ ¬q(c)
This time the Rule of Universal Specification leads us to
p(c) → q(c)
¬p(c)
----------------
∴ ¬q(c)
where the fallacy arises because we are trying to argue by the inverse.
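An argument form is invalid as soon as one interpretation makes every premise true and the conclusion false. The sketch below searches for such interpretations on a tiny universe. Everything in it is an assumption made for illustration: the two-element `UNIVERSE`, the choice of `C` as the specific member c, and the helper names. An empty result on so small a universe does not by itself prove validity, but a nonempty result does establish invalidity.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Truth value of the implication a -> b."""
    return (not a) or b

UNIVERSE = (0, 1)  # hypothetical two-element universe
C = 0              # the specific member c

def interpretations():
    """Yield every pair of predicates (p, q) on UNIVERSE as dictionaries."""
    size = len(UNIVERSE)
    for bits in product([False, True], repeat=2 * size):
        p = dict(zip(UNIVERSE, bits[:size]))
        q = dict(zip(UNIVERSE, bits[size:]))
        yield p, q

def counterexamples(second_premise, conclusion):
    """Interpretations where both premises hold but the conclusion fails."""
    found = []
    for p, q in interpretations():
        universal = all(implies(p[x], q[x]) for x in UNIVERSE)
        if universal and second_premise(p, q) and not conclusion(p, q):
            found.append((p, q))
    return found

argument_1 = counterexamples(lambda p, q: p[C], lambda p, q: q[C])              # valid form (1)
argument_2 = counterexamples(lambda p, q: not q[C], lambda p, q: not p[C])      # valid form (2)
by_converse = counterexamples(lambda p, q: q[C], lambda p, q: p[C])             # form (1”)
by_inverse = counterexamples(lambda p, q: not p[C], lambda p, q: not q[C])      # form (2”)
```

Here `argument_1` and `argument_2` come back empty, while `by_converse` and `by_inverse` each contain interpretations in which q(c) is true and p(c) is false, the same situation as the quadrilateral EFGH that has four sides but is not a square.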
And now let us look back at the three parts of Example 2.53. Although the arguments
presented there involved premises that were universally quantified statements, there was
never any instance where a universally quantified statement appeared in the conclusion. We
now want to remedy this situation, since many theorems in mathematics have the form of
a universally quantified statement. To do so we need the following considerations.
Start with a given universe and the open statement p(x). To establish the truth of the
statement Vx p(x), we must establish the truth of p(c) for each member c in the given
universe. But if the universe has many members or, for example, contains all the positive
integers, then this exhaustive, if not exhausting, task of validating the truth of each p(c)
becomes difficult, if not impossible. To get around this situation we shall prove that p(c)
is true — but now we do it for the case where c denotes a specific but arbitrarily chosen
member from the prescribed universe.
Should the preceding open statement p(x) have the form q(x) → r(x), for open statements
q(x) and r(x), then we shall assume the truth of q(c) as an additional premise and try
to deduce the truth of r(c) by using definitions, axioms, previously proven theorems, and
the logical principles we have studied. For when q(c) is false, the implication q(c) → r(c)
is true, regardless of the truth value of r(c).
The reason that the element c must be arbitrary (or generic) is to make sure that what
we do and prove about c is applicable for all the other elements in the universe. If we are
dealing with the universe of all integers, for example, we cannot choose c in an arbitrary
manner by selecting c as 4, or by selecting c as an even integer. In general, we cannot
make any assumptions about our choice for c unless these assumptions are valid for all the
other elements of the universe. The word generic is applied to the element c here because it
indicates that our choice (for c) must share all of the common characteristics of the elements
for the given universe.
The principle we have described in the preceding three paragraphs is named and sum-
marized as follows.
The Rule of Universal Generalization: If an open statement p(x) is proved to be
true when x is replaced by any arbitrarily chosen element c from our universe, then the
universally quantified statement ∀x p(x) is true. Furthermore, the rule extends beyond
a single variable. So if, for example, we have an open statement q(x, y) that is proved
to be true when x and y are replaced by arbitrarily chosen elements from the same
universe, or their own respective universes, then the universally quantified statement
∀x ∀y q(x, y) [or, ∀x, y q(x, y)] is true. Similar results hold for the cases of three or
more variables.
Before we demonstrate the use of this rule in any examples, we wish to look back at
part (1) of Example 2.43 in Section 2.4. It turns out that the explanation given there to
establish that
∀x [p(x) ∧ (q(x) ∧ r(x))] ⟺ ∀x [(p(x) ∧ q(x)) ∧ r(x)]
anticipated what we have now described in detail as the Rules of Universal Specification
and Universal Generalization.
Now we’ll turn to an example which is strictly symbolic. This example provides an
opportunity to apply the Rule of Universal Generalization.
EXAMPLE 2.54
Let p(x), q(x), and r(x) be open statements that are defined for a given universe. We show
that the argument
∀x [p(x) → q(x)]
∀x [q(x) → r(x)]
----------------
∴ ∀x [p(x) → r(x)]
is valid by considering the following.
Steps                    Reasons
1) ∀x [p(x) → q(x)]      Premise
2) p(c) → q(c)           Step (1) and the Rule of Universal Specification
3) ∀x [q(x) → r(x)]      Premise
4) q(c) → r(c)           Step (3) and the Rule of Universal Specification
5) p(c) → r(c)           Steps (2) and (4) and the Law of the Syllogism
6) ∴ ∀x [p(x) → r(x)]    Step (5) and the Rule of Universal Generalization
Here the element c introduced in steps (2) and (4) is the same specific but arbitrarily
chosen element from the universe. Since this element c has no special or distinguishing
properties but does share all of the common features of every other element in this universe,
we can use the Rule of Universal Generalization to go from step (5) to step (6).
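For readers who have access to a proof assistant, the same argument can be written as a short machine-checked proof. The following Lean 4 sketch is offered only as an illustration; the type `α` plays the role of the universe, and the hypothesis names `h1`, `h2`, `hp` are assumptions introduced here. Introducing the arbitrary element `c` corresponds to the Rule of Universal Generalization, and applying `h1` and `h2` to `c` corresponds to the Rule of Universal Specification.

```lean
example {α : Type} (p q r : α → Prop)
    (h1 : ∀ x, p x → q x)      -- premise, step (1)
    (h2 : ∀ x, q x → r x) :    -- premise, step (3)
    ∀ x, p x → r x := by
  intro c hp                   -- c is arbitrary; assume p c
  exact h2 c (h1 c hp)         -- steps (2), (4), (5) applied to c
```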
And so at last we have dealt with a valid argument where a universally quantified state-
ment appears as the conclusion, as well as among the premises.
The question that now may be at the back of the reader’s mind is one of practicality.
Namely, when would we ever need to use the argument that we validated in Example 2.54?
We may find that we have already used it (perhaps, unknowingly) in earlier algebra and
geometry courses, as we demonstrate in the following example.
EXAMPLE 2.55
a) For the universe of all real numbers, consider the open statements
p(x): 3x − 7 = 20    q(x): 3x = 27    r(x): x = 9.
The following solution of an algebraic equation parallels the valid argument from
Example 2.54.
1) If 3x − 7 = 20, then 3x = 27.             ∀x [p(x) → q(x)]
2) If 3x = 27, then x = 9.                   ∀x [q(x) → r(x)]
3) Therefore, if 3x − 7 = 20, then x = 9.    ∴ ∀x [p(x) → r(x)]
b) When we dealt with the universe of all quadrilaterals in plane geometry, we may have
found ourselves relating something like this:
“Since every square is a rectangle, and every rectangle
is a parallelogram, it follows that every square is a parallelogram.”
In this case we are using the argument in Example 2.54 for the open statements
p(x): x is a square    q(x): x is a rectangle    r(x): x is a parallelogram.
Now we continue with one more argument to validate.
EXAMPLE 2.56
The steps and reasons needed to establish the validity of the argument
∀x [p(x) ∨ q(x)]
∀x [(¬p(x) ∧ q(x)) → r(x)]
----------------
∴ ∀x [¬r(x) → p(x)]
are given as follows. [Here the element c is in the universe assigned for the argument. Also,
since the conclusion is a universally quantified implication, we can assume ¬r(c) as an
additional premise, as was mentioned earlier when the Rule of Universal Generalization
was first introduced.]
Steps                                 Reasons
1) ∀x [p(x) ∨ q(x)]                   Premise
2) p(c) ∨ q(c)                        Step (1) and the Rule of Universal Specification
3) ∀x [(¬p(x) ∧ q(x)) → r(x)]         Premise
4) [¬p(c) ∧ q(c)] → r(c)              Step (3) and the Rule of Universal Specification
5) ¬r(c) → ¬[¬p(c) ∧ q(c)]            Step (4) and s → t ⟺ ¬t → ¬s
6) ¬r(c) → [p(c) ∨ ¬q(c)]             Step (5), DeMorgan’s Law, and the Law of Double Negation
7) ¬r(c)                              Premise (assumed)
8) p(c) ∨ ¬q(c)                       Steps (7) and (6) and Modus Ponens
9) [p(c) ∨ q(c)] ∧ [p(c) ∨ ¬q(c)]     Steps (2) and (8) and the Rule of Conjunction
10) p(c) ∨ [q(c) ∧ ¬q(c)]             Step (9) and the Distributive Law of ∨ over ∧
11) p(c)                              Step (10), q(c) ∧ ¬q(c) ⟺ F₀, and p(c) ∨ F₀ ⟺ p(c)
12) ∴ ∀x [¬r(x) → p(x)]               Steps (7) and (11) and the Rule of Universal Generalization
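The derived steps in this table can be audited mechanically. In the sketch below, which is illustrative only, the booleans `p`, `q`, `r` stand for the truth values of p(c), q(c), r(c), and the function name `derived_steps_follow` is an assumption. It confirms that in every truth assignment where steps (2), (4), and (7) are true, steps (8) through (11) are true as well.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Truth value of the implication a -> b."""
    return (not a) or b

def derived_steps_follow() -> bool:
    """Check steps (8)-(11) in every row where steps (2), (4), (7) hold."""
    for p, q, r in product([False, True], repeat=3):
        step_2 = p or q
        step_4 = implies((not p) and q, r)
        step_7 = not r
        if not (step_2 and step_4 and step_7):
            continue  # premises not all true; nothing to check in this row
        step_8 = p or (not q)
        step_9 = (p or q) and (p or (not q))
        step_10 = p or (q and (not q))
        step_11 = p
        if not (step_8 and step_9 and step_10 and step_11):
            return False
    return True
```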
Before going on we want to point out a convention that the reader may not like but
will have to get used to. It concerns our coverage of the Rules of Universal Specification
and Universal Generalization. In the first case we started with the statement Vx p(x) and
then dealt with p(c) for some specific element c in our universe. For the Rule of Universal
Generalization we obtained the truth of Vx p(x) from that of p(c), where c was arbitrarily
selected from the universe. Unfortunately, we'll often find ourselves using the letter x
instead of c to denote the element — but as long as we understand what is happening we
shall soon find the convention easy enough to work with.
The results of Example 2.54 and especially Example 2.56 lead us to believe that we can
use universally quantified statements and the rules of inference — including the Rules of
Universal Specification and Universal Generalization — to formalize and prove a variety of
arguments and, hopefully, theorems. When we do so it appears that the validation of some
rather short arguments requires quite a number of steps, because we have been very metic-
ulous and included all the steps and reasons — we left little, if anything, to the imagination.
The reader should rest assured that when we start to prove mathematical theorems, we shall
present the proofs in the more conventional paragraph style. We shall no longer mention
each and every application of the laws of logic and the other tautologies or the rules of
inference. On occasion we may single out a certain rule of inference, but our attention will
be primarily directed to the use of definitions, mathematical axioms and principles (other
than those we have found in our study of logic), and other (earlier) theorems we have been
able to prove. Why then have we been learning all of this material on validating arguments?
Because it will provide us with a framework to fall back on whenever we doubt whether
a given attempt at a proof really does the job. If in doubt, we have our study of logic to
supply us with a somewhat mechanical but strictly objective means to help us decide.
And now we present paragraph-style proofs for some results about the integers. (These
results may be considered rather obvious to us—in fact, we may find we have already
seen and used some of them. But they provide an excellent setting for writing some simple
proofs.) The proofs we shall presently introduce use the following ideas, which we now
formally define. [The first idea was mentioned earlier in part (b) of Example 2.51.]
Definition 2.8 Let n be an integer. We call n even if n is divisible by 2 — that is, if there exists an integer
r so that n = 2r. If n is not even, then we call n odd and find for this case that there exists
an integer s where n = 2s + 1.
THEOREM 2.2 For all integers k and l, if k, l are both odd, then k + l is even.
Proof: In this proof we shall number the steps so that we may refer to them for some later
remarks. After this we shall no longer number the steps.
1) Since k and l are odd, we may write k = 2a + 1 and l = 2b + 1, for some integers
a, b. This is due to Definition 2.8.
2) Then
k + l = (2a + 1) + (2b + 1) = 2(a + b + 1),
by virtue of the Commutative and Associative Laws of Addition and the Distributive
Law of Multiplication over Addition, all of which hold for integers.
3) Since a, b are integers, a + b + 1 = c is an integer; with k + l = 2c, it follows from
Definition 2.8 that k + l is even.
Remarks
1) In step (1) of the preceding proof k and l were chosen in an arbitrary manner, so we
know by the Rule of Universal Generalization that the result obtained is true for all
odd integers.
2) Although we may not realize it, we are using the Rule of Universal Specification
(twice) in step (1). The first argument implicit in this step reads as follows.
i) If n is an odd integer, then n = 2r + 1 for some integer r.
ii) The integer k is a specific (but arbitrarily chosen) odd integer.
iii) Therefore we may write k = 2a + 1 for some (specific) integer a.
3) In step (1) we do not have k = 2a + 1 and l = 2a + 1. Since k, l are arbitrarily
chosen, it may be the case that k = l, and when this happens we have 2a + 1 =
k = l = 2b + 1, from which it follows that a = b. [Since k may not equal l, it follows
that (k − 1)/2 = a may not equal b = (l − 1)/2. Thus we should use the different
variables a and b.]
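The proof of Theorem 2.2 is constructive: from the witnesses a and b it builds the witness c = a + b + 1. The sketch below follows those three steps for concrete inputs. It is a spot check on sample values and not a substitute for the proof; the function names `odd_witness` and `even_sum_witness` are illustrative assumptions.

```python
def odd_witness(n: int) -> int:
    """Return the integer s with n = 2s + 1; raise ValueError if n is even."""
    if n % 2 == 0:
        raise ValueError(f"{n} is not odd")
    return (n - 1) // 2

def even_sum_witness(k: int, l: int) -> int:
    """For odd k and l, return c with k + l = 2c, following steps (1)-(3)."""
    a = odd_witness(k)  # step (1): k = 2a + 1
    b = odd_witness(l)  # step (1): l = 2b + 1, with its own variable b
    c = a + b + 1       # step (2): k + l = 2(a + b + 1)
    if k + l != 2 * c:  # step (3): confirm k + l = 2c
        raise AssertionError("witness check failed")
    return c
```

For example, `even_sum_witness(3, 5)` uses a = 1 and b = 2, which illustrates Remark 3: distinct odd integers generally need distinct witnesses.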
Before we proceed with another theorem — written in the more conventional manner —
let us examine the following.
EXAMPLE 2.57
Consider the following statement for the universe of integers.
If n is an integer, then n² = n; or, ∀n [n² = n].
Now for n = 0 it is true that n² = 0² = 0 = n. And if n = 1, it is also true that n² = 1² =
1 = n. However, we cannot conclude n² = n for every integer n. The Rule of Universal
Generalization does not apply here, for we cannot consider the choice of 0 (or 1) as an
arbitrarily chosen integer. If n = 2, we have n² = 4 ≠ 2 = n, and this one counterexample
is enough to tell us that the given statement is false. However, either replacement (namely,
n = 0 or n = 1) is enough to establish the truth of the statement:
For some integer n, n² = n; or, ∃n [n² = n].
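A short search makes the asymmetry between the two quantifiers visible. In this sketch the finite `WINDOW` of integers and the variable names are assumptions made for illustration. A single witness found in the window settles the existential statement for all integers, and a single counterexample found in the window refutes the universal statement for all integers.

```python
def satisfies(n: int) -> bool:
    """The open statement n^2 = n."""
    return n * n == n

WINDOW = range(-5, 6)  # small sample of the integers (assumption)

witnesses = [n for n in WINDOW if satisfies(n)]
counterexamples = [n for n in WINDOW if not satisfies(n)]

exists_statement_true = len(witnesses) > 0         # one witness is enough
forall_statement_true = len(counterexamples) == 0  # one counterexample refutes it
```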
We close — at last — with three results to demonstrate how we shall write proofs through-
out the remainder of the text.
THEOREM 2.3 For all integers k and l, if k and l are both odd, then their product kl is also odd.
Proof: Since k and l are both odd, we may write k = 2a + 1 and l = 2b + 1, for some
integers a and b, because of Definition 2.8. Then the product kl = (2a + 1)(2b + 1) =
4ab + 2a + 2b + 1 = 2(2ab + a + b) + 1, where 2ab + a + b is an integer. Therefore, by
Definition 2.8 once again, it follows that kl is odd.
The preceding proof is an example of a direct proof. In our next example we shall prove
a result in three ways: first by a direct argument (or proof), then by the contrapositive
method, and finally by the method of proof by contradiction. [For the (method of) proof
by contradiction we put in some extra details, since this is our first opportunity to use this
technique.] The reader should not assume, however, that every theorem can be so readily
proved in a variety of ways.
THEOREM 2.4 If m is an even integer, then m + 7 is odd.
Proof:
1) Since m is even, we have m = 2a for some integer a. Then m + 7 = 2a + 7 =
2a + 6 + 1 = 2(a + 3) + 1. Since a + 3 is an integer, we know that m + 7 is odd.
2) Suppose that m + 7 is not odd, hence even. Then m + 7 = 2b for some integer b
and m = 2b − 7 = 2b − 8 + 1 = 2(b − 4) + 1, where b − 4 is an integer. Hence
m is odd. [The result follows because the statements ∀m [p(m) → q(m)] and
∀m [¬q(m) → ¬p(m)] are logically equivalent.]
3) Now assume that m is even and that m + 7 is also even. (This assumption is the
negation of what we want to prove.) Then m + 7 even implies that m + 7 = 2c for
some integer c. And, consequently, m = 2c − 7 = 2c − 8 + 1 = 2(c − 4) + 1 with
c − 4 an integer, so m is odd. Now we have our contradiction. We started with m even
and deduced m odd, an impossible situation, since no integer can be both even and
odd. How did we arrive at this dilemma? Simple: we made a mistake! This mistake
is the false assumption (namely, m + 7 is even) that we wanted to believe at the
start of the proof. Since the assumption is false, its negation is true, and so we now
have m + 7 odd.
The second and third proofs for Theorem 2.4 appear to be somewhat similar. This is
because the contradiction we derived in the third proof arises from the hypothesis of the
theorem and its negation. We shall see as we progress (as early as the next chapter) that a
contradiction may also be obtained by deriving the negation of a known fact —a fact that
is not the hypothesis of the theorem we are attempting to prove. For now, however, let us
think about this similarity a little more. Suppose we start with the open statements p(m)
and q(m), for a prescribed universe, and consider a theorem of the form ∀m [p(m) →
q(m)]. If we try to prove this result by the contrapositive method, then we shall actually
prove the logically equivalent statement ∀m [¬q(m) → ¬p(m)]. To do so we assume the
truth of ¬q(m) (for any specific but arbitrarily chosen m in the universe) and show that
this leads to the truth of ¬p(m). On the other hand, if we wish to prove the theorem
∀m [p(m) → q(m)] by the method of proof by contradiction, then we assume that the
statement ∀m [p(m) → q(m)] is false. This amounts to the fact that p(m) → q(m) is false
for at least one replacement for m from the universe; that is, there is some element m
in the universe for which p(m) is true and q(m) is false [or ¬q(m) is true]. We then use
the truth of p(m) and ¬q(m) to derive a contradiction. [In the third proof of Theorem 2.4
we obtained p(m) ∧ ¬p(m).] These two methods can be compared symbolically in the
following, where m is specific but arbitrarily chosen for the method of contraposition.

                 Assumption          Result Derived
Contraposition   ¬q(m)               ¬p(m)
Contradiction    p(m) and ¬q(m)      F₀
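Both rows of this comparison rest on propositional facts that can be confirmed by a truth table. The Python sketch below (the helper `implies` is our own name) checks, for all four truth assignments, that an implication agrees with its contrapositive and that the negation of p → q is p ∧ ¬q, which is exactly the assumption made in a proof by contradiction.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    """Truth value of the implication p -> q."""
    return (not p) or q

# All four assignments of truth values to (p, q).
rows = list(product([False, True], repeat=2))

# p -> q and (not q) -> (not p) have the same truth value in every row.
contrapositive_ok = all(implies(p, q) == implies(not q, not p) for p, q in rows)

# not (p -> q) has the same truth value as p and (not q) in every row.
negation_ok = all((not implies(p, q)) == (p and not q) for p, q in rows)
```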
In general, when we are able to establish a theorem by either a direct proof or an indirect
proof, the direct approach is less cumbersome than an indirect approach. (This certainly
appears to be the case for the three proofs presented for Theorem 2.4.) When we do not
have any prescribed directions given for attempting the proof of a certain theorem, we might
start with a direct approach. If we succeed, then all is well. If not, then we might consider
trying to find a counterexample to what we thought was a theorem. Should our search for
a counterexample fail, then we might consider an indirect approach. We might prove the
contrapositive of the theorem, or obtain a contradiction, as we did in the third proof of
Theorem 2.4, by assuming the truth of the hypothesis and the truth of the negation of the
conclusion (for some element m in the universe) in the given theorem.
We close this section with one more indirect proof by the method of contraposition.
THEOREM 2.5 For all positive real numbers x and y, if the product xy exceeds 25, then x > 5 or y > 5.
Proof: Consider the negation of the conclusion; that is, suppose that 0 < x ≤ 5 and 0 <
y ≤ 5. Under these circumstances we find that 0 = 0 · 0 < x · y ≤ 5 · 5 = 25, so the product
xy does not exceed 25. (This indirect method of proof now establishes the given statement,
since we know that an implication is logically equivalent to its contrapositive.)
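As a sanity check on the statement (not a substitute for the proof), the following Python sketch tests the implication on randomly chosen positive reals. The function name, the sampling interval, and the seed are all assumptions made for illustration.

```python
import random

def theorem_2_5_holds(x: float, y: float) -> bool:
    """For positive x, y: if x*y > 25, then x > 5 or y > 5."""
    return not (x * y > 25) or (x > 5 or y > 5)

random.seed(0)  # fixed seed so the sample is reproducible
samples = [(random.uniform(0.001, 10), random.uniform(0.001, 10))
           for _ in range(10_000)]

# True when no sampled pair violates the implication.
all_hold = all(theorem_2_5_holds(x, y) for x, y in samples)
```

The boundary pair x = y = 5 gives xy = 25, which does not exceed 25, so the hypothesis is false there and the implication holds trivially.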
EXERCISES 2.5
1. In Example 2.52 why did we stop at 26 and not at 28?
2. In Example 2.52 why didn’t we include the odd integers between 2 and 26?
3. Use the method of exhaustion to show that every even integer between 30 and 58 (including 30 and 58) can be written as a sum of at most three perfect squares.
4. Let n be a positive integer greater than 1. We call n prime if the only positive integers that (exactly) divide n are 1 and n itself. For example, the first seven primes are 2, 3, 5, 7, 11, 13, and 17. (We shall learn more about primes in Chapter 4.) Use the method of exhaustion to show that every integer in the universe 4, 6, 8, ..., 36, 38 can be written as the sum of two primes.
5. For each of the following (universes and) pairs of statements, use the Rule of Universal Specification, in conjunction with Modus Ponens and Modus Tollens, in order to fill in the blank line so that a valid argument results.
a) [The universe comprises all real numbers.]
All integers are rational numbers.
The real number π is not a rational number.
∴ ________
b) [The universe comprises the present population of the United States.]
All librarians know the Library of Congress Classification System.
________
∴ Margaret knows the Library of Congress Classification System.
c) [The same universe as in part (b).]
________
Sondra is an administrative director.
∴ Sondra knows how to delegate authority.
d) [The universe consists of all quadrilaterals in the plane.]
All rectangles are equiangular.
________
∴ Quadrilateral MNPQ is not a rectangle.
6. Determine which of the following arguments are valid and which are invalid. Provide an explanation for each answer. (Let the universe consist of all people presently residing in the United States.)
a) All mail carriers carry a can of mace.
Mrs. Bacon is a mail carrier.
Therefore Mrs. Bacon carries a can of mace.
b) All law-abiding citizens pay their taxes.
Mr. Pelosi pays his taxes.
Therefore Mr. Pelosi is a law-abiding citizen.
c) All people who are concerned about the environment recycle their plastic containers.
Margarita is not concerned about the environment.
Therefore Margarita does not recycle her plastic containers.
7. For a prescribed universe and any open statements p(x), q(x) in the variable x, prove that
a) ∃x [p(x) ∨ q(x)] ⇔ ∃x p(x) ∨ ∃x q(x)
b) ∀x [p(x) ∧ q(x)] ⇔ ∀x p(x) ∧ ∀x q(x)
8. a) Let p(x), q(x) be open statements in the variable x, with a given universe. Prove that
∀x p(x) ∨ ∀x q(x) ⇒ ∀x [p(x) ∨ q(x)].
[That is, prove that when the statement ∀x p(x) ∨ ∀x q(x) is true, then the statement ∀x [p(x) ∨ q(x)] is true.]
b) Find a counterexample for the converse in part (a). That is, find open statements p(x), q(x) and a universe such that ∀x [p(x) ∨ q(x)] is true, while ∀x p(x) ∨ ∀x q(x) is false.
9. Provide the reasons for the steps verifying the following argument. (Here a denotes a specific but arbitrarily chosen element from the given universe.)
∀x [p(x) → (q(x) ∧ r(x))]
∀x [p(x) ∧ s(x)]
∴ ∀x [r(x) ∧ s(x)]

Steps                              Reasons
1) ∀x [p(x) → (q(x) ∧ r(x))]
2) ∀x [p(x) ∧ s(x)]
3) p(a) → (q(a) ∧ r(a))
4) p(a) ∧ s(a)
5) p(a)
6) q(a) ∧ r(a)
7) r(a)
8) s(a)
9) r(a) ∧ s(a)
10) ∴ ∀x [r(x) ∧ s(x)]
10. Provide the missing reasons for the steps verifying the following argument:
∀x [p(x) ∨ q(x)]
∃x ¬p(x)
∀x [¬q(x) ∨ r(x)]
∀x [s(x) → ¬r(x)]
∴ ∃x ¬s(x)

Steps                              Reasons
1) ∀x [p(x) ∨ q(x)]               Premise
2) ∃x ¬p(x)                        Premise
3) ¬p(a)                           Step (2) and the definition of the truth for ∃x ¬p(x). [Here a is an element (replacement) from the universe for which ¬p(x) is true.] The reason for this step is also referred to as the Rule of Existential Specification.
4) p(a) ∨ q(a)
5) q(a)
6) ∀x [¬q(x) ∨ r(x)]
7) ¬q(a) ∨ r(a)
8) q(a) → r(a)
9) r(a)
10) ∀x [s(x) → ¬r(x)]
11) s(a) → ¬r(a)
12) r(a) → ¬s(a)
13) ¬s(a)
14) ∴ ∃x ¬s(x)                     Step (13) and the definition of the truth for ∃x ¬s(x). The reason for this step is also referred to as the Rule of Existential Generalization.
11. Write the following argument in symbolic form. Then either verify the validity of the argument or explain why it is invalid. [Assume here that the universe comprises all adults (18 or over) who are presently residing in the city of Las Cruces (in New Mexico). Two of these individuals are Roxe and Imogene.]
All credit union employees must know COBOL. All credit union employees who write loan applications must know Excel.† Roxe works for the credit union, but she doesn’t know Excel. Imogene knows Excel but doesn’t know COBOL. Therefore Roxe doesn’t write loan applications and Imogene doesn’t work for the credit union.
†The Excel spreadsheet is a product of Microsoft, Inc.
12. Give a direct proof (as in Theorem 2.3) for each of the following.
a) For all integers k and l, if k, l are both even, then k + l is even.
b) For all integers k and l, if k, l are both even, then kl is even.
13. For each of the following statements provide an indirect proof [as in part (2) of Theorem 2.4] by stating and proving the contrapositive of the given statement.
a) For all integers k and l, if kl is odd, then k, l are both odd.
b) For all integers k and l, if k + l is even, then k and l are both even or both odd.
14. Prove that for every integer n, if n is odd, then n² is odd.
15. Provide a proof by contradiction for the following: For every integer n, if n² is odd, then n is odd.
16. Prove that for every integer n, n² is even if and only if n is even.
17. Prove the following result in three ways (as in Theorem 2.4): If n is an odd integer, then n + 11 is even.
18. Let m, n be two positive integers. Prove that if m, n are perfect squares, then the product mn is also a perfect square.
19. Prove or disprove: If m, n are positive integers and m, n are perfect squares, then m + n is a perfect square.
20. Prove or disprove: There exist positive integers m, n, where m, n, and m + n are all perfect squares.
21. Prove that for all real numbers x and y, if x + y > 100, then x > 50 or y > 50.
22. Prove that for every integer n, 4n + 7 is odd.
23. Let n be an integer. Prove that n is odd if and only if 7n + 8 is odd.
24. Let n be an integer. Prove that n is even if and only if 31n + 12 is even.
2.6
Summary and Historical Review
This second chapter has introduced some of the fundamentals of logic — in particular, some
of the rules of inference and methods of proof necessary for establishing mathematical
theorems.
The first systematic study of logical reasoning is found in the work of the Greek philosopher
Aristotle (384-322 B.C.). In his treatises on logic Aristotle presented a collection of
principles for deductive reasoning. These principles were designed to provide a foundation
for the study of all branches of knowledge. In a modified form, this type of logic was taught
up to and throughout the Middle Ages.
Aristotle (384-322 B.C.)
The German mathematician Gottfried Wilhelm Leibniz (1646-1716) is often considered
the first scholar who seriously pursued the development of symbolic logic as a universal
scientific language. This he professed in his essay De Arte Combinatoria, published in 1666.
His research in the area of symbolic logic, carried out from 1679 to 1690, gave considerable
impetus to the creation of this mathematical discipline.
Following the work by Leibniz, little change took place until the nineteenth century, when
the English mathematician George Boole (1815-1864) created a system of mathematical
logic that he introduced in 1847 in the pamphlet The Mathematical Analysis of Logic,
Being an Essay Towards a Calculus of Deductive Reasoning. In the same year, Boole’s
countryman Augustus DeMorgan (1806-1871) published Formal Logic; or, the Calculus
of Inference, Necessary and Probable. In some ways this treatise extended Boole’s work
considerably. Then, in 1854, Boole detailed his ideas and further research in the notable
work An Investigation of the Laws of Thought, on Which Are Founded the Mathematical
Theories of Logic and Probability. The American logician Charles Sanders Peirce (1839-
1914), who was also an engineer and philosopher, introduced the formal concept of the
quantifier into the study of symbolic logic.
The concepts formulated by Boole were thoroughly examined in the work of another
German scholar, Ernst Schröder (1841-1902). These results are known collectively as
Vorlesungen über die Algebra der Logik; they were published in the period from 1890 to
1895.
Further developments in the area saw an even more modern approach evolve in the work
of the German logician Gottlob Frege (1848-1925) between 1879 and 1903. This work
significantly influenced the monumental Principia Mathematica (1910-1913) by England’s
Alfred North Whitehead (1861-1947) and Bertrand Russell (1872-1970). Here what was
begun by Boole was finally brought to fruition. Thanks to this remarkable effort and the work
of other twentieth-century mathematicians and logicians, in particular the comprehensive
Grundlagen der Mathematik (1934-1939) of David Hilbert (1862-1943) and Paul Bernays
(1888-1977), the more polished techniques of contemporary mathematical logic are now
available.
Several sections of this chapter stressed the importance of proof. In mathematics a proof
bestows authority on what might otherwise be dismissed as mere opinion. Proof embodies
the power and majesty of pure reason. But even more than that, it suggests new mathematical
ideas. Our concept of proof goes hand in hand with the notion of a theorem — a mathematical
statement the truth of which has been confirmed by means of a logical argument, namely, a
proof. For those who feel they can ignore the importance of logic and the rules of inference,
we submit the following words of wisdom spoken by Achilles in Lewis Carroll’s What the
Tortoise Said to Achilles: “Then Logic would take you by the throat, and force you to do
it!”
Comparable coverage of the material presented in this chapter can be found in Chapters
2 and 11 of the text by K. A. Ross and C. R. B. Wright [11]. The first two chapters of the
text by S. S. Epp [3] provide many examples and some computer science applications for
those who wish to see more on logic and proof at a very readable introductory level. The
text by H. Delong [2] provides an historical survey of mathematical logic, together with an
examination of the nature of its results and the philosophical consequences of these results.
This is also the case with the texts by H. Eves and C. V. Newsom [4], R. R. Stoll [13], and
R. L. Wilder [14], wherein the relationships among logic, proof, and set theory (the topic
of our next chapter) are examined in their roles in the foundations of mathematics.
For more on resolution (introduced in Exercise 13 of Section 2.3) and automated rea-
soning, the reader should examine the texts by J. H. Gallier [6] and M. R. Genesereth and
N. J. Nilsson [7].
The text by E. Mendelson [9] provides an interesting intermediate introduction for those
readers who wish to pursue additional topics in mathematical logic. A somewhat more
advanced treatment is given in the work of S. C. Kleene [8]. Accounts of other work in
mathematical logic are presented in the compendium edited by J. Barwise [1].
The objective of the works by D. Fendel and D. Resek [5] and R. P. Morash [10] is to
prepare the student with a calculus background for the more theoretical mathematics found
in abstract algebra and real analysis. Each of these texts provides an excellent introduction
to the basic methods of proof. The unique text by D. Solow [12] is devoted entirely to
introducing the reader who has a background in high school mathematics to the primary
techniques used in writing mathematical proofs.
REFERENCES
1. Barwise, Jon (editor). Handbook of Mathematical Logic. Amsterdam: North Holland, 1977.
2. Delong, Howard. A Profile of Mathematical Logic. Reading, Mass.: Addison-Wesley, 1970.
3. Epp, Susanna S. Discrete Mathematics with Applications, 2nd ed. Boston, Mass.: PWS
Publishing Co., 1995.
4. Eves, Howard, and Newsom, Carroll V. An Introduction to the Foundations and Fundamental
Concepts of Mathematics, rev. ed. New York: Holt, 1965.
5. Fendel, Daniel, and Resek, Diane. Foundations of Higher Mathematics. Reading, Mass.:
Addison-Wesley, 1990.
6. Gallier, Jean H. Logic for Computer Science. New York: Harper & Row, 1986.
7. Genesereth, Michael R., and Nilsson, Nils J. Logical Foundations of Artificial Intelligence.
Los Altos, Calif.: Morgan Kaufmann, 1987.
8. Kleene, Stephen C. Mathematical Logic. New York: Wiley, 1967.
9. Mendelson, Elliott. Introduction to Mathematical Logic, 3rd ed. Monterey, Calif.: Wadsworth
and Brooks/Cole, 1987.
10. Morash, Ronald P. Bridge to Abstract Mathematics: Mathematical Proof and Structures. New
York: Random House/Birkhäuser, 1987.
11. Ross, Kenneth A., and Wright, Charles R. B. Discrete Mathematics, 4th ed. Upper Saddle
River, N.J.: Prentice-Hall, 1999.
12. Solow, Daniel. How to Read and Do Proofs, 3rd ed. New York: Wiley, 2001.
13. Stoll, Robert R. Set Theory and Logic. San Francisco: Freeman, 1963.
14. Wilder, Raymond L. Introduction to the Foundations of Mathematics, 2nd ed. New York:
Wiley, 1965.
SUPPLEMENTARY EXERCISES
1. Construct the truth table for
p ↔ [(q ∧ r) → ¬(s ∨ r)].
2. a) Construct the truth table for
(p → q) ∧ (¬p → r).
b) Translate the statement in part (a) into words such that the word “not” does not appear in the translation.
3. Let p, q, and r denote primitive statements. Prove or disprove (provide a counterexample for) each of the following.
a) [p ↔ (q ↔ r)] ⇔ [(p ↔ q) ↔ r]
b) [p → (q → r)] ⇔ [(p → q) → r]
4. Express the negation of the statement p ↔ q in terms of the connectives ∧ and ∨.
5. Write the following statement as an implication in two ways, each in the if-then form: Either Kaylyn practices her piano lessons or she will not go to the movies.
6. Let p, q, r denote primitive statements. Write the converse, inverse, and contrapositive of
a) p → (q ∧ r)
b) (p ∨ q) → r
7. a) For primitive statements p, q, find the dual of the statement (¬p ∧ ¬q) ∨ (T₀ ∧ p) ∨ p.
b) Use the laws of logic to show that your result from part (a) is logically equivalent to p ∧ ¬q.
8. Let p, q, r, and s be primitive statements. Write the dual of each of the following compound statements.
a) (p ∨ ¬q) ∧ (¬r ∨ s)
b) p → (q ∧ ¬r ∧ s)
c) [(p ∨ T₀) ∧ (q ∨ F₀)] ∨ [r ∧ s ∧ T₀]
9. For each of the following, fill in the blank with the word converse, inverse, or contrapositive so that the result is a true statement.
a) The converse of the inverse of p → q is the ________ of p → q.
b) The converse of the inverse of p → q is the ________ of q → p.
c) The inverse of the converse of p → q is the ________ of p → q.
d) The inverse of the converse of p → q is the ________ of q → p.
e) The inverse of the contrapositive of p → q is the ________ of p → q.
10. Establish the validity of the argument
[(p → q) ∧ [(q ∧ r) → s] ∧ r] → (p → s).
11. Prove or disprove each of the following, where p, q, and r are any statements.
a) [(p ⊻ q) ⊻ r] ⇔ [p ⊻ (q ⊻ r)]
b) [p ⊻ (q → r)] ⇔ [(p ⊻ q) → (p ⊻ r)]
12. Write the following argument in symbolic form. Then either establish the validity of the argument or provide a counterexample to show that it is invalid.
If it is cool this Friday, then Craig will wear his suede jacket if the pockets are mended. The forecast for Friday calls for cool weather, but the pockets have not been mended. Therefore Craig won't be wearing his suede jacket this Friday.
13. Consider the open statement
p(x, y): y − x = y + x²
where the universe for each of the variables x, y comprises all integers. Determine the truth value for each of the following statements.
a) p(0, 0)    b) p(1, 1)
c) p(0, 1)    d) ∀y p(0, y)
e) ∃y p(1, y)    f) ∀x ∃y p(x, y)
g) ∃y ∀x p(x, y)    h) ∀y ∃x p(x, y)
14. Determine whether each of the following statements is true or false. If false, provide a counterexample. The universe comprises all integers.
a) ∀x ∃y ∃z (x = 7y + 5z)
b) ∀x ∃y ∃z (x = 4y + 6z)
15. Suppose two opposite corner squares are removed from an 8 × 8 chessboard, as in part (a) of Fig. 2.4. Can the remaining 62 squares be covered by 31 dominos (rectangles consisting of two adjacent squares, one white and the other blue, as shown in the figure)? (When a domino is placed on the chessboard, a square of a given color need not be placed on a square of the same color.)
Figure 2.4 [parts (a) and (b); chessboard diagrams not reproduced]
16. In part (b) of Fig. 2.4 we have an 8 × 8 chessboard where two squares (one blue and one white) have been removed from each of two opposite corners. Can the remaining 60 squares be covered by 15 T-shaped figures (of three white squares and one blue one, or three blue squares and one white one, as shown in the figure)? [The reader may wish to verify that a 4 × 4 chessboard (of all 16 squares) can be covered by four of the T-shaped figures. Then it follows that an 8 × 8 chessboard (of all 64 squares) can be covered by 16 of the T-shaped figures.]
Set Theory
Underlying the mathematics we study in algebra, geometry, combinatorics, probability,
and almost every other area of contemporary mathematics is the notion of a set.
Very often this concept provides an underlying structure for a concise formulation of the
mathematical topic being investigated. Consequently, many books on mathematics have
an introductory chapter on set theory or mention in an appendix those parts of the theory
that are needed in the text. Here it may appear that, in opening the book with a chapter
on fundamentals of counting, we have neglected set theory. Actually we have relied on
intuition; each time the word collection appeared in Chapter 1, we were dealing with a set.
Also, in Sections 2.4 and 2.5, the notion of a set (if not the term itself) was invoked when
we dealt with the universe (of discourse) for an open statement.
Trying to define a set is rather difficult and often results in the circular use of such
synonyms as “class,” “collection,” and “aggregate.” When we first began the study of
geometry, we used our intuition to grasp the ideas of point, line, and incidence. Then we
started to define new terms and prove theorems, relying on these intuitive notions along
with certain axioms and postulates. In our study of set theory, intuition is invoked once
again, this time for the comparable ideas of element, set, and membership.
We shall find that the ideas we developed in Chapter 2 on logic are closely tied to set
theory. Furthermore, many of the proofs we shall study in this chapter draw on the ideas
developed in Chapter 2.
3.1
Sets and Subsets
We have a “gut feeling” that a set should be a well-defined collection of objects. These
objects are called elements and are said to be members of the set.
The adjective well-defined implies that for any element we care to consider, we are able
to determine whether it is in the set under scrutiny. Consequently, we avoid dealing with
sets that depend on opinion, such as the set of outstanding major league pitchers for the
1990s.
We use capital letters, such as A, B, C,..., to represent sets and lowercase letters to
represent elements. For a set A we write x ∈ A if x is an element of A; y ∉ A indicates that
y is not a member of A.
EXAMPLE 3.1
A set can be designated by listing its elements within set braces. For example, if A is the set
consisting of the first five positive integers, then we write A = {1, 2, 3, 4, 5}. Here 2 ∈ A
but 6 ∉ A.
Another standard notation for this set provides us with A = {x | x is an integer and 1 ≤
x ≤ 5}. Here the vertical line | within the set braces is read “such that.” The symbols {x | ...}
are read “the set of all x such that ....” The properties following | help us determine the
elements of the set that is being described.
Beware! The notation {x | 1 ≤ x ≤ 5} is not an adequate description of the set A unless
we have agreed in advance that the elements we are considering are integers. When such an
agreement is adopted, we say that we are specifying a universe, or universe of discourse,
which is usually denoted by U. We then select only elements from U to form our sets. In this
particular problem, if U denotes the set of all integers or the set of all positive integers, then
{x | 1 ≤ x ≤ 5} adequately describes A. If U is the set of all real numbers, then {x | 1 ≤ x ≤ 5}
would contain all of the real numbers between 1 and 5 inclusive; if U consists of only even
integers, then the only members of {x | 1 ≤ x ≤ 5} would be 2 and 4.
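The dependence on the universe shows up directly when the set-builder notation is written as a Python set comprehension. In the sketch below the two ranges are finite stand-ins for the integers and for the even integers (an assumption for illustration), and the function name is ours.

```python
def between_1_and_5(universe):
    """Build {x | 1 <= x <= 5}, with x drawn from the given universe."""
    return {x for x in universe if 1 <= x <= 5}

integers_sample = range(-20, 21)          # finite stand-in for the integers
even_integers_sample = range(-20, 21, 2)  # finite stand-in for the even integers

A = between_1_and_5(integers_sample)
A_even = between_1_and_5(even_integers_sample)
```

The same defining property yields {1, 2, 3, 4, 5} for the first universe and {2, 4} for the second.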
EXAMPLE 3.2
For U = {1, 2, 3, ...}, the set of positive integers, we consider the following sets. At the
same time we introduce various notations one may use to describe such sets.
a) A = {1, 4, 9, ..., 64, 81} = {x² | x ∈ U, x² < 100} = {x² | x ∈ U ∧ x² < 100}
b) B = {1, 4, 9, 16} = {y² | y ∈ U, y² < 20} = {y² | y ∈ U, y² < 23}
= {y² | y ∈ U ∧ y² ≤ 16}.
c) C = {2, 4, 6, 8, ...} = {2k | k ∈ U}.
Sets A and B are examples of finite sets, whereas C is an infinite set. When dealing with
sets like A or C, we can either describe the sets in terms of properties the elements must
satisfy or list enough elements to indicate what is, we hope, an obvious pattern. For any
finite set A, |A| denotes the number of elements in A and is referred to as the cardinality,
or size, of A. In this example we find that |A| = 9 and |B| = 4.
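The sets of Example 3.2 can be built in Python to confirm these cardinalities. This is a sketch under one assumption: the positive integers are replaced by a finite range that is large enough to contain every x with x² < 100. Since C is infinite, only a prefix of its elements is generated.

```python
from itertools import count, islice

# Finite stand-in for the positive integers; large enough for A and B.
U = range(1, 1000)

A = {x * x for x in U if x * x < 100}
B = {y * y for y in U if y * y < 20}
B_alt = {y * y for y in U if y * y <= 16}  # an equivalent description of B

# C = {2k | k in U} is infinite, so generate it lazily and keep a prefix.
C_prefix = list(islice((2 * k for k in count(1)), 4))
```

Here `len(A)` and `len(B)` play the role of |A| and |B|.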
Here the sets B and A are such that every element of B is also an element of A. This
important relationship occurs throughout set theory and its applications, and it leads to the
following definition.
Definition 3.1 If C, D are sets from a universe U, we say that C is a subset of D and write C ⊆ D, or
D ⊇ C, if every element of C is an element of D. If, in addition, D contains an element that
is not in C, then C is called a proper subset of D, and this is denoted by C ⊂ D or D ⊃ C.
Note that for all sets C, D from a universe U, if C ⊆ D, then
∀x [x ∈ C ⇒ x ∈ D],
and if ∀x [x ∈ C ⇒ x ∈ D], then C ⊆ D.
Here the universal quantifier ∀x indicates that we should have to consider every element
x in the prescribed universe U. However, for each replacement c (from U) where the
statement c ∈ C is false, we know that the implication c ∈ C → c ∈ D is true, regardless of
the truth value of the statement c ∈ D. Consequently, we actually need to consider only those
replacements c′ (from U) where the statement c′ ∈ C is true. If for each such c′ we find that
the statement c′ ∈ D is also true, then we know that ∀x [x ∈ C ⇒ x ∈ D] or, equivalently,
C ⊆ D.
Also, we find that for all subsets C, D of U,
C ⊂ D ⇒ C ⊆ D,
and when C, D are finite,
C ⊆ D ⇒ |C| ≤ |D|, and C ⊂ D ⇒ |C| < |D|.
However, for U = {1, 2, 3, 4, 5}, C = {1, 2}, and D = {1, 2}, we see that C is a subset of
D (that is, C ⊆ D), but it is not a proper subset of D (or, C ⊄ D). So, in general, we do not
find that C ⊆ D ⇒ C ⊂ D.
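Python's built-in sets mirror this distinction: `<=` tests for a subset and `<` for a proper subset. The sketch below repeats the example with C = D = {1, 2}, adds a third set E of our own choosing, and writes out the subset definition element by element (the name `is_subset` is ours).

```python
C = {1, 2}
D = {1, 2}
E = {1, 2, 3}  # an extra set, introduced here only for contrast

subset_CD = C <= D   # C is a subset of D
proper_CD = C < D    # C is a proper subset of D
proper_CE = C < E    # C is a proper subset of E

def is_subset(C, D):
    """Definition 3.1 written out: every element of C is an element of D."""
    return all(x in D for x in C)
```

With these sets C is a subset of D but not a proper subset of D, while C is a proper subset of E and |C| < |E|.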
EXAMPLE 3.3
In an early version of ANSI (American National Standards Institute) FORTRAN, no distinction
was made between uppercase and lowercase letters, and a variable name consisted of a
single letter followed by at most five characters (letters or digits). If U denotes the set of all
such variable names, then by the rules of sum and product, |U| = 26 + 26(36) + 26(36)² +
··· + 26(36)⁵ = 26 ∑_{i=0}^{5} 36^i = 1,617,038,306. Thus, U is large, but still finite. An integer
variable in this programming language had to start with one of the letters I, J, K, L, M, N.
So if A denotes the subset of all integer variables in this early version of ANSI FORTRAN,
then |A| = 6 + 6(36) + 6(36)² + ··· + 6(36)⁵ = 6 ∑_{i=0}^{5} 36^i = 373,162,686.
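Both counts follow the same pattern: a rule-of-product term for each possible name length, combined by the rule of sum. A short Python sketch reproduces the arithmetic; the function and parameter names are ours.

```python
def count_names(first_choices: int, max_len: int = 6, other_choices: int = 36) -> int:
    """Count names of length 1..max_len: first_choices options for the first
    character, other_choices options for each later character."""
    total = 0
    for length in range(1, max_len + 1):  # rule of sum over the lengths
        # Rule of product: one first character, then (length - 1) others.
        total += first_choices * other_choices ** (length - 1)
    return total

all_names = count_names(26)     # |U|: any of the 26 letters may come first
integer_names = count_names(6)  # |A|: the name must start with I, J, K, L, M, or N
```

The first value also equals 26(36⁶ − 1)/35, the closed form of the geometric sum.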
The subset concept may now be used to develop the idea of set equality. First we consider
the following example.
EXAMPLE 3.4
For the universe U = {1, 2, 3, 4, 5}, consider the set A = {1, 2}. If B = {x | x² ∈ U}, then
the members of B are 1, 2. Here A and B contain the same elements (and no other
elements), leading us to feel that the sets A and B are equal.
However, it is also true here that A ⊆ B and B ⊆ A, and we prefer to formally define
the idea of set equality by using these subset relations.
Definition 3.2 For a given universe U, the sets C and D (taken from U) are said to be equal, and we write
C = D, when C ⊆ D and D ⊆ C.
From these ideas on set equality, we find that neither order nor repetition is relevant for a
general set. Consequently, we find, for example, that {1, 2, 3} = {3, 1, 2} = {2, 2, 1, 3} =
{1, 2, 1, 3, 1}.
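Definition 3.2 translates directly into code. In the following Python sketch (the name `sets_equal` is ours) equality is tested through the two subset relations, first for the sets of Example 3.4 and then for listings that differ only in order and repetition.

```python
def sets_equal(C, D):
    """Definition 3.2: C = D when C is a subset of D and D is a subset of C."""
    return C <= D and D <= C

U = {1, 2, 3, 4, 5}
A = {1, 2}
B = {x for x in U if x * x in U}  # the set B of Example 3.4

same = sets_equal(A, B)

# Order and repetition in a listing do not affect the resulting set.
listings = [[1, 2, 3], [3, 1, 2], [2, 2, 1, 3], [1, 2, 1, 3, 1]]
all_equal = all(sets_equal(set(listings[0]), set(items)) for items in listings)
```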
Now that we have defined the concepts of subset and set equality, we shall use the
quantifiers of Section 2.4 to examine the negations of these ideas.
For a given universe U, let A, B be sets taken from U. Then we may write
A ⊆ B ⇔ ∀x [x ∈ A ⇒ x ∈ B].
From the (quantified) definition of A ⊆ B, we find that
A ⊈ B (that is, A is not a subset of B)
⇔ ¬∀x [x ∈ A ⇒ x ∈ B]
⇔ ∃x ¬[x ∈ A ⇒ x ∈ B]
⇔ ∃x ¬[¬(x ∈ A) ∨ x ∈ B]
⇔ ∃x [x ∈ A ∧ ¬(x ∈ B)]
⇔ ∃x [x ∈ A ∧ x ∉ B].
Hence A ⊈ B if there is at least one element x in the universe where x is a member of A
but x is not a member of B.
In a similar way, because A = B ⇔ (A ⊆ B ∧ B ⊆ A), then
A ≠ B ⇔ ¬(A ⊆ B ∧ B ⊆ A)
⇔ ¬(A ⊆ B) ∨ ¬(B ⊆ A) ⇔ A ⊈ B ∨ B ⊈ A.
Therefore two sets A and B are not equal if and only if (1) there exists at least one element
x in U where x ∈ A but x ∉ B or (2) there exists at least one element y in U where y ∈ B
and y ∉ A, or perhaps both (1) and (2) occur.
We also note that for any sets C, D ⊆ U (that is, C ⊆ U and D ⊆ U),
C ⊂ D ⇔ C ⊆ D ∧ C ≠ D.
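The existential form of these negations suggests a constructive test: to show that A is not a subset of B, exhibit a witness. The Python sketch below uses two hypothetical helper names of our own; the sample sets in the usage note are also ours.

```python
def not_subset_witnesses(A, B):
    """Elements x with x in A and x not in B.
    The result is nonempty exactly when A is not a subset of B."""
    return {x for x in A if x not in B}

def sets_differ(A, B):
    """A != B when A is not a subset of B or B is not a subset of A."""
    return bool(not_subset_witnesses(A, B)) or bool(not_subset_witnesses(B, A))
```

For instance, with A = {1, 2, 3} and B = {3, 4}, the elements 1 and 2 witness that A is not a subset of B, and 4 witnesses that B is not a subset of A.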
Now that we have introduced the four ideas of set membership, set equality, subset, and
proper subset, we shall consider one more example to see what these concepts tell us, as
well as what they do not tell us. Following this example, the proof of our first theorem for
this chapter will be fairly straightforward — because it readily follows from some of these
ideas.
EXAMPLE 3.5
Let U = {1, 2, 3, 4, 5, 6, x, y, {1, 2}, {1, 2, 3}, {1, 2, 3, 4}} (where x, y are the 24th, 25th
lowercase letters of the alphabet and do not represent anything else, such as 3, 5, or {1, 2}).
Then |U| = 11.
a) If A = {1, 2, 3, 4}, then |A| = 4 and here we have
i) A ⊆ U; ii) A ⊂ U; iii) A ∈ U;
iv) {A} ⊆ U; v) {A} ⊂ U; but vi) {A} ∉ U.
b) Now let B = {5, 6, x, y, A} = {5, 6, x, y, {1, 2, 3, 4}}. Then |B| = 5, not 8. And now
we find that
i) A ∈ B; ii) {A} ⊆ B; and iii) {A} ⊂ B.
But
iv) {A} ∉ B;
v) A ⊈ B (that is, A is not a subset of B); and
vi) A ⊄ B (that is, A is not a proper subset of B).
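The difference between membership and containment in Example 3.5 can be reproduced in Python. Because an ordinary Python set cannot be an element of another set, the sketch below models the inner sets as `frozenset` objects and the letters x, y as the strings "x", "y"; both choices are our modeling assumptions.

```python
A = frozenset({1, 2, 3, 4})
U = {1, 2, 3, 4, 5, 6, "x", "y", frozenset({1, 2}), frozenset({1, 2, 3}), A}
B = {5, 6, "x", "y", A}

part_a = {
    "A subset of U": A <= U,
    "A proper subset of U": A < U,
    "A element of U": A in U,
    "{A} subset of U": {A} <= U,
    "{A} proper subset of U": {A} < U,
    "{A} element of U": frozenset({A}) in U,
}

part_b = {
    "A element of B": A in B,
    "{A} subset of B": {A} <= B,
    "{A} proper subset of B": {A} < B,
    "{A} element of B": frozenset({A}) in B,
    "A subset of B": A <= B,
    "A proper subset of B": A < B,
}
```

In `part_a` only the last entry is false, and in `part_b` the first three entries are true and the last three false, matching items (i) through (vi) of each part.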
THEOREM 3.1 Let A, B, C ⊆ U.
a) If A ⊆ B and B ⊆ C, then A ⊆ C. b) If A ⊂ B and B ⊆ C, then A ⊂ C.
c) If A ⊆ B and B ⊂ C, then A ⊂ C. d) If A ⊂ B and B ⊂ C, then A ⊂ C.
Before we prove this theorem we want to recall a comment we made back in Section 2.5. It
concerns our coverage of the Rules of Universal Specification and Universal Generalization
and appears after Example 2.56. For now it is appropriate in this new area on set theory.
When we want to prove, for example, that x ∈ A ⇒ x ∈ C, we shall start by considering any
fixed but arbitrarily chosen element x in U, but we shall want this element x to be such
that “x ∈ A” is a true statement (not an open statement). Then we must show that this same
fixed but arbitrarily chosen element x is also in C. The proofs we present are consequently
referred to as element arguments. Always remember that in these proofs x represents a fixed
but arbitrarily chosen element of A, and though x is generic (since it is not a specifically
named element in A), it does remain the same throughout each proof.
Proof: We shall prove parts (a) and (b) and leave the remaining parts for the exercises.
a) To prove that A ⊆ C, we need to verify that for all x ∈ 𝒰, if x ∈ A then x ∈ C. We start
with an element x from A. Since A ⊆ B, x ∈ A implies x ∈ B. Then with B ⊆ C, x ∈ B
implies x ∈ C. So x ∈ A implies x ∈ C (by the Law of the Syllogism, Rule 2 in Table
2.19, since x ∈ A, x ∈ B, and x ∈ C are statements), and A ⊆ C.
b) Since A ⊂ B, if x ∈ A then x ∈ B. With B ⊆ C, it then follows that x ∈ C, so A ⊆ C.
However, A ⊂ B ⇒ there exists an element b ∈ B such that b ∉ A. Because B ⊆ C,
b ∈ B ⇒ b ∈ C. Thus A ⊆ C and there exists an element b ∈ C with b ∉ A, so
A ⊂ C.
Our next example involves several subset relations.
EXAMPLE 3.6 Let 𝒰 = {1, 2, 3, 4, 5} with A = {1, 2, 3}, B = {3, 4}, and C = {1, 2, 3, 4}. Then the
following subset relations hold:
a) A ⊆ C b) A ⊂ C
c) B ⊂ C d) A ⊆ A
e) B ⊈ A
f) A ⊄ A (that is, A is not a proper subset of A)
The sets A, B are just two of the subsets of C. We are interested in determining how
many subsets C has in total. Before answering, however, we need to introduce the set with
no members.
Definition 3.3 The null set, or empty set, is the (unique) set containing no elements. It is denoted by ∅ or { }.
We note that |∅| = 0 but {0} ≠ ∅. Also, ∅ ≠ {∅} because {∅} is a set with one element,
namely, the null set.
The empty set satisfies the following property given in Theorem 3.2. To establish this
property we use the method of proof by contradiction (or reductio ad absurdum). Following
the proof of Theorem 2.4 (in Section 2.5), we said that in establishing a theorem by this
method, we assumed the negation of the result and arrived at a contradiction. In our prior
work (as found in Example 2.32 and the third proof of Theorem 2.4), we arrived at a
contradiction of the form r ∧ ¬r or p(m) ∧ ¬p(m), respectively, where ¬r was a premise
in Example 2.32 and p(m) a specific instance of the hypothesis in Theorem 2.4. In proving
Theorem 3.2 things are now a little different. This time we shall find ourselves denying (or
contradicting) an earlier result we have accepted as true, namely, the definition of the null
set.
THEOREM 3.2 For any universe 𝒰, let A ⊆ 𝒰. Then ∅ ⊆ A, and if A ≠ ∅, then ∅ ⊂ A.
Proof: If the first result is not true, then ∅ ⊈ A, so there is an element x from the universe
with x ∈ ∅ but x ∉ A. But x ∈ ∅ is impossible. So we reject the assumption ∅ ⊈ A and find
that ∅ ⊆ A. In addition, if A ≠ ∅, then there is an element a ∈ A (and a ∉ ∅), so ∅ ⊂ A.
EXAMPLE 3.7 Returning now to Example 3.6 we determine the number of subsets of the set C = {1, 2,
3, 4}. In constructing a subset of C, we have, for each member x of C, two distinct choices:
Either include it in the subset or exclude it. Consequently, there are 2 × 2 × 2 × 2 choices,
resulting in $2^4 = 16$ subsets of C. These include the empty set ∅ and the set C itself. Should
we need the number of subsets of two elements from C, the result is the number of ways
two objects can be selected from a set of four objects, namely, C(4, 2) or $\binom{4}{2}$. As a result,
the total number of subsets of C, $2^4$, is also the sum
$\binom{4}{0} + \binom{4}{1} + \binom{4}{2} + \binom{4}{3} + \binom{4}{4}$, where the
first summand is for the empty set, the second summand for the four singleton subsets, the
third summand for the six subsets of size 2, and so on. So $2^4 = \sum_{k=0}^{4} \binom{4}{k}$.
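The count in Example 3.7 can be reproduced by listing the subsets of C grouped by size. This is a small Python sketch under the assumption that subsets are represented as tuples produced by `itertools.combinations`; the names `C`, `subsets_by_size`, and `total` are illustrative.

```python
from itertools import combinations
from math import comb

C = [1, 2, 3, 4]

# Group the subsets of C by their size k = 0, 1, ..., |C|.
subsets_by_size = {k: list(combinations(C, k)) for k in range(len(C) + 1)}

for k, subs in subsets_by_size.items():
    # Number of subsets of size k found by listing, next to the binomial coefficient.
    print(k, len(subs), comb(len(C), k))

# Total number of subsets of C, obtained by adding the group sizes.
total = sum(len(subs) for subs in subsets_by_size.values())
print(total, 2 ** len(C))
```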
Definition 3.4 If A is a set from universe 𝒰, the power set of A, denoted 𝒫(A),† is the collection (or set)
of all subsets of A.
EXAMPLE 3.8 For the set C of Example 3.7, 𝒫(C) = {∅, {1}, {2}, {3}, {4}, {1, 2}, {1, 3}, {1, 4}, {2, 3},
{2, 4}, {3, 4}, {1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {2, 3, 4}, C}.
For any finite set A with |A| = n ≥ 0, we find that A has $2^n$ subsets and that |𝒫(A)| = $2^n$.
For any 0 ≤ k ≤ n, there are $\binom{n}{k}$ subsets of size k. Counting the subsets of A according
to the number, k, of elements in a subset, we have the combinatorial identity
$\binom{n}{0} + \binom{n}{1} + \binom{n}{2} + \cdots + \binom{n}{n} = \sum_{k=0}^{n} \binom{n}{k} = 2^n$, for n ≥ 0.
This identity was established earlier in Corollary 1.1 (a). The presentation here is another
example of a combinatorial proof because the identity is established by counting the same
collection of objects (subsets of A) in two different ways.
A systematic way to represent the subsets of a given nonempty set can be accomplished
by using a coding scheme known as a Gray code. This is demonstrated in our next example.
EXAMPLE 3.9 Consider the binary strings (of 0’s and 1’s) in Fig. 3.1. In particular, examine the first column
of the strings in part (b). How did this column come about? First we see 0, then 1, as in
part (a) of the figure. Then we see 1 followed by 0, the reverse order (from bottom to top)
of the two binary strings in part (a). Once we obtain the first column for the binary strings
in part (b), we then list two 0’s followed by two 1’s.
Continuing with the strings in part (c) of the figure, now we concentrate on the first two
columns. The first four entries (binary strings of length 2) are precisely the four strings
in part (b). The last four entries (again, binary strings of length 2) are likewise the binary
strings in part (b), now in reverse order (from bottom to top). For these eight strings of
length 2, we append 0 to the right of the first four and 1 to the right of the last four.
For each Gray code in parts (a), (b), (c) of the figure, as we go from one binary string (in
a column) to the next binary string (in that column), there is exactly one bit that changes.
For instance, in part (b), in going from 10 to 11, we find one change (from 0 to 1) in the
second position. Furthermore, for the third and fourth strings in part (c), as we go from
110 to 010, there is exactly one change, from 1 to 0 in the first position. The fourth and
fifth strings have the one change from 0 to 1, this time in the third position. Also notice
how the first and last strings for each code differ in the last position. Part (d) of the figure
demonstrates this for the strings of length 3.

(a)        (b)            (c)                 (d)    (e)    (f)
0  ∅       0|0  ∅         00|0  ∅             000    000    000
1  {x}     1|0  {x}       10|0  {x}           100    010    001
           1|1  {x, y}    11|0  {x, y}        110    011    101
           0|1  {y}       01|0  {y}           010    001    100
                          01|1  {y, z}        011    101    110
                          11|1  {x, y, z}     111    111    010
                          10|1  {x, z}        101    110    011
                          00|1  {z}           001    100    111
Figure 3.1

†In some computer science textbooks the reader may find the notation $2^A$ used for 𝒫(A).
This technique, for constructing a Gray code for the strings of length 2 from those of
length 1 and the strings of length 3 from those of length 2, is an example of a recursive
construction. (This idea will be examined in more detail in Section 4.2.)
When we examine each Gray code in parts (a), (b), (c) of Fig. 3.1, we see a listing of
subsets to the right of each of these codes. For example, in part (b), if we start with the set
A = {x, y} and keep the order of the elements fixed,‡ then we can list the subsets of A in
terms of binary strings of length 2. We write 0 for an element when it is not in the subset and
1 when it is. Hence the subset {x} is encoded as 10 because the “first” element x (of ordered
set A) is in the subset, while the “second” element y (of ordered set A) is not present, as
the 0, in 10, indicates. For part (c), the (ordered) set B = {x, y, z} has its eight subsets listed
next to the elements of the Gray code. As we go from one subset to the next (in a given
column), we see that there is exactly one change in the makeup of the subset. For instance,
in going from {x, y} (110) to {y} (010), exactly one element is deleted, as indicated by
the change from 1 to 0 in the first positions of 110 and 010. Likewise, as we go from {z}
(001) to ∅ (000), exactly one element is deleted; the change from 1 to 0, in the third bits
of 001 and 000, indicates this. Examining the change from {y, z} (011) to {x, y, z} (111),
we see that one new element is added (here it is x). The change from 0 to 1 as we go from
011 to 111 takes this into account.
Note that the first four subsets in part (c) are the four subsets in part (b). Further, the last
four subsets in part (c) come about from the same four subsets in part (b) —this time in
reverse order and with the element z included in each subset.
The recursive construction given here shows how we can continue to develop Gray codes
for binary strings of longer length. When this coding scheme was introduced— just prior
to the start of this example — we spoke of it as a Gray code, not as the Gray code. Other
Gray codes are possible. The code in part (e) of Fig. 3.1 provides a second Gray code for
the eight binary strings of length 3. Furthermore, if we no longer require the first and last
entries in a code to differ in only one position, then the code in part (f) of Fig. 3.1 would
also serve as a Gray code for the eight binary strings of length 3.
‡Originally we considered the elements of a set as unordered, so we are making an exception here. In textbooks
dealing with data structures, such ordered sets are often referred to as lists and one finds, for instance, the ordered
set {x, y, z} denoted by [x, y, z] or (x, y, z).
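The recursive construction of Example 3.9 can be sketched in a few lines of Python. This is an illustration rather than code from the text: the function names `gray_code` and `decode` are hypothetical, and binary strings are represented as Python strings. Each step keeps the shorter code with 0 appended on the right, then follows it with the shorter code in reverse order with 1 appended.

```python
def gray_code(n):
    """Return the Gray code for binary strings of length n, built as in Example 3.9."""
    if n == 1:
        return ["0", "1"]
    shorter = gray_code(n - 1)
    # First half: the shorter code with 0 appended on the right.
    # Second half: the shorter code in reverse order with 1 appended on the right.
    return [s + "0" for s in shorter] + [s + "1" for s in reversed(shorter)]


def decode(bits, ordered_set):
    """Translate a binary string into the subset of ordered_set that it encodes."""
    return {element for element, bit in zip(ordered_set, bits) if bit == "1"}


for bits in gray_code(3):
    print(bits, sorted(decode(bits, ["x", "y", "z"])))
```

Calling `gray_code(4)` continues the same pattern for the 16 strings of length 4.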
The ability to count certain, or all, subsets of a given set provides a second approach for
the solution of two of our earlier examples.
EXAMPLE 3.10 In Example 1.14, we counted the number of (staircase) paths in the xy-plane from (2, 1) to
(7, 4) where each such path is made up of individual steps going one unit to the right (R)
or one unit upward (U). Figure 3.2 is the same as Fig. 1.1, where two of the possible paths
are indicated.
Figure 3.2 [Two staircase paths from (2, 1) to (7, 4).] (a) R, U, R, R, U, R, R, U (b) U, R, R, R, U, U, R, R
The path in Fig. 3.2(a) has its three upward (U) moves located in positions 2, 5, and 8
of the list at the bottom of the figure. Consequently, this path determines the three-element
subset {2, 5, 8} of the set {1, 2, 3,..., 8}. In Fig. 3.2(b) the path determines the three-
element subset {1, 5, 6}. Conversely, if we start, for example, with the subset {1, 3, 7} of
{1, 2, 3, ..., 8}, then the path that determines this subset is given by U, R, U, R, R, R,
U,R.
Consequently, the number of paths sought here equals the number of subsets A of
{1, 2, 3, ..., 8}, where |A| = 3. There are $\binom{8}{3} = \frac{8!}{3!\,5!} = 56$ such paths (and subsets),
as we found in Example 1.14.
If we had considered the moves R to the right, instead of the upward moves U, we would
have found the answer to be the number of subsets B of {1, 2, 3, ..., 8}, where |B| = 5.
There are $\binom{8}{5} = \frac{8!}{5!\,3!} = 56$ such subsets. (The idea presented here was examined earlier
for the result developed in Table 1.4.)
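The correspondence between paths and three-element subsets in Example 3.10 can be made concrete as follows. This is a hedged Python sketch; the helper names `path_from_subset` and `subset_from_path` are assumptions made for illustration, and a path is represented as a list of the letters "R" and "U".

```python
from itertools import combinations


def path_from_subset(up_positions, length=8):
    """Build the list of moves whose U moves occupy the given 1-based positions."""
    return ["U" if i in up_positions else "R" for i in range(1, length + 1)]


def subset_from_path(path):
    """Recover the set of 1-based positions that hold a U move."""
    return {i for i, move in enumerate(path, start=1) if move == "U"}


print(subset_from_path("R U R R U R R U".split()))  # the path of Fig. 3.2(a)
print(path_from_subset({1, 3, 7}))                  # the path determined by {1, 3, 7}

# One path for each three-element subset of {1, 2, ..., 8}.
paths = [path_from_subset(set(s)) for s in combinations(range(1, 9), 3)]
print(len(paths))
```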
EXAMPLE 3.11 In part (b) of Example 1.37 of Section 1.4 we learned that there are $2^6$ compositions for the
integer 7; that is, there are $2^6$ ways to write 7 as a sum of one or more positive integers,
where the order of the summands is relevant. The result we obtained there used the binomial
theorem in conjunction with the answers for seven cases that were summarized in Table 1.9.
Now we shall obtain this result in a somewhat different and easier way.
First consider the following composition of 7:
    1 + 1 + 1 + 1 + 1 + 1 + 1
(with the plus signs numbered, from left to right, as the 1st plus sign, 2nd plus sign, ..., 5th plus sign, 6th plus sign).
Here we have seven summands, each of which is 1, and six plus signs.
For the set {1, 2, 3, 4, 5, 6} there are $2^6$ subsets. But what does this have to do with the
compositions of 7?
Consider a subset of {1, 2, 3, 4, 5, 6}, say {1, 4, 6}. Now form the following composition
of 7:
    (1 + 1) + 1 + (1 + 1) + (1 + 1)
(the 1st, 4th, and 6th plus signs are the ones inside parentheses).
Here the subset {1, 4, 6} indicates that we should place parentheses around the 1’s on either
side of the first, fourth, and sixth plus signs. This results in the composition
2 + 1 + 2 + 2.
In the same way we find that the subset {1, 2, 5, 6} indicates the use of the first, second,
fifth, and sixth plus signs, giving us
    (1 + 1 + 1) + 1 + (1 + 1 + 1)
or the composition 3 + 1 + 3.
Going in reverse we see that the composition 1 + 1 + 5 comes from
    1 + 1 + (1 + 1 + 1 + 1 + 1)
and is determined by the subset {3, 4, 5, 6} of {1, 2, 3, 4, 5, 6}. In Table 3.1 we have listed
six compositions of 7 along with the corresponding subset of {1, 2, 3, 4, 5, 6} that
determines each of them.
Table 3.1
Composition of 7                      Determining Subset of {1, 2, 3, 4, 5, 6}
(i)   1 + 1 + 1 + 1 + 1 + 1 + 1       (i)   ∅
(ii)  1 + 2 + 1 + 1 + 1 + 1           (ii)  {2}
(iii) 1 + 1 + 3 + 1 + 1               (iii) {3, 4}
(iv)  2 + 3 + 2                       (iv)  {1, 3, 4, 6}
(v)   4 + 3                           (v)   {1, 2, 3, 5, 6}
(vi)  7                               (vi)  {1, 2, 3, 4, 5, 6}
The examples we have obtained here indicate a correspondence between the compositions
of 7 and the subsets of {1, 2, 3, 4, 5, 6}. Consequently, once again we find that there
are $2^6$ compositions of 7. In fact, for each positive integer m, there are $2^{m-1}$ compositions
of m.
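The correspondence of Example 3.11 can be sketched in Python as follows. The function name `composition` and its parameters are hypothetical; the plus signs are numbered 1, ..., m − 1 from left to right, and a plus sign belonging to the chosen subset is one that sits inside parentheses, so the two 1’s around it are merged.

```python
from itertools import combinations


def composition(merged, m=7):
    """Composition of m obtained by merging the 1's around each plus sign in `merged`."""
    parts = [1]
    for sign in range(1, m):
        if sign in merged:
            parts[-1] += 1   # this plus sign is inside parentheses
        else:
            parts.append(1)  # this plus sign separates two summands
    return parts


print(composition({1, 4, 6}))
print(composition({1, 2, 5, 6}))

# Every subset of the six plus signs gives a composition of 7; collect the distinct ones.
all_compositions = {
    tuple(composition(set(s)))
    for k in range(7)
    for s in combinations(range(1, 7), k)
}
print(len(all_compositions))
```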
Our next example yields another important combinatorial identity.
EXAMPLE 3.12 For integers n, r with n ≥ r ≥ 1,
$\binom{n+1}{r} = \binom{n}{r} + \binom{n}{r-1}$.
Although this result can be established algebraically from the definition of $\binom{n}{r}$ as
n!/(r!(n − r)!), we use a combinatorial approach. Let A = {x, a_1, a_2, ..., a_n} and consider
all subsets of A that contain r elements. There are $\binom{n+1}{r}$ such subsets. Each of these
falls into exactly one of the following two cases: those subsets that contain the element x
and those that do not. To obtain a subset C of A, where x ∈ C and |C| = r, place x in C and
then select r − 1 of the elements a_1, a_2, ..., a_n. This can be done in $\binom{n}{r-1}$ ways. For the
other case we want a subset B of A with |B| = r and x ∉ B. So we select r elements from
among a_1, a_2, ..., a_n, which we can do in $\binom{n}{r}$ ways. It then follows by the rule of sum that
$\binom{n+1}{r} = \binom{n}{r} + \binom{n}{r-1}$.
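The two cases in the combinatorial argument of Example 3.12 can be counted directly for small values of n and r. The following Python sketch is illustrative only; the function name `split_count` and the element labels "x", "a1", ..., "an" are assumptions made for the example.

```python
from itertools import combinations
from math import comb


def split_count(n, r):
    """Count the r-element subsets of {x, a1, ..., an} by whether they contain x."""
    A = ["x"] + [f"a{i}" for i in range(1, n + 1)]
    with_x = without_x = 0
    for subset in combinations(A, r):
        if "x" in subset:
            with_x += 1
        else:
            without_x += 1
    return with_x, without_x


n, r = 6, 3
with_x, without_x = split_count(n, r)
print(with_x, comb(n, r - 1))              # subsets that contain x
print(without_x, comb(n, r))               # subsets that do not contain x
print(with_x + without_x, comb(n + 1, r))  # all r-element subsets of A
```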
Before we proceed any further let us reconsider the result of Example 3.12, but this time
we shall do it in light of what we learned in Example 3.10.
Once again we let n, r be positive integers where n ≥ r ≥ 1. Then $\binom{n+1}{r}$ counts the
number of (staircase) paths in the xy-plane from (0, 0) to (n + 1 − r, r), where, as in
Example 3.10, each such path has
(n + 1) − r horizontal moves of the form (x, y) → (x + 1, y), and
r vertical moves of the form (x, y) → (x, y + 1).
The last edge in each of these (staircase) paths terminates at the point (n + 1 − r, r) and
starts at either (i) the point (n − r, r) or (ii) the point (n + 1 − r, r − 1).
In case (i) we have the last edge horizontal, namely, (n − r, r) → (n + 1 − r, r); the
number of (staircase) paths from (0, 0) to (n − r, r) is $\binom{(n-r)+r}{r} = \binom{n}{r}$. For case (ii) the
last edge is vertical, namely, (n + 1 − r, r − 1) → (n + 1 − r, r); the number of (staircase)
paths from (0, 0) to (n + 1 − r, r − 1) is $\binom{(n+1-r)+(r-1)}{r-1} = \binom{n}{r-1}$. Since these two cases
exhaust all possibilities and have nothing in common, it follows that
$\binom{n+1}{r} = \binom{n}{r} + \binom{n}{r-1}$.
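EXAMPLE 3.13 We now investigate how the identity of Example 3.12 can help us solve Example 1.35,
where we sought the number of nonnegative integer solutions of the inequality x_1 + x_2 +
··· + x_6 < 10.
For each integer k, 0 ≤ k ≤ 9, the number of solutions to x_1 + x_2 + ··· + x_6 = k is
$\binom{6+k-1}{k} = \binom{5+k}{k}$. So the number of nonnegative integer solutions to x_1 + x_2 + ··· + x_6 <
10 is
$\binom{5}{0} + \binom{6}{1} + \binom{7}{2} + \binom{8}{3} + \cdots + \binom{14}{9}$
$= \left[\binom{6}{0} + \binom{6}{1}\right] + \binom{7}{2} + \binom{8}{3} + \cdots + \binom{14}{9}$ (since $\binom{5}{0} = 1 = \binom{6}{0}$)
$= \left[\binom{7}{1} + \binom{7}{2}\right] + \binom{8}{3} + \cdots + \binom{14}{9}$
$= \left[\binom{8}{2} + \binom{8}{3}\right] + \binom{9}{4} + \cdots + \binom{14}{9}$
$= \cdots = \binom{14}{8} + \binom{14}{9} = \binom{15}{9} = 5005$.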
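The total in Example 3.13 can be cross-checked numerically. The sketch below is an illustration with assumed names (`by_sum`, `count_solutions`): it adds the ten binomial coefficients and, independently, counts the solutions with a small table of partial sums, so that the two results can be compared with $\binom{15}{9}$.

```python
from math import comb

# Sum over k = 0, 1, ..., 9 of the number of solutions of x1 + ... + x6 = k.
by_sum = sum(comb(5 + k, k) for k in range(10))


def count_solutions(num_vars, bound):
    """Count nonnegative integer solutions of x1 + ... + x_num_vars < bound directly."""
    # ways[t] = number of ways the variables handled so far can add up to t
    ways = [1] + [0] * (bound - 1)
    for _ in range(num_vars):
        # A new variable may take any value from 0 to t, so accumulate prefix sums.
        ways = [sum(ways[: t + 1]) for t in range(bound)]
    return sum(ways)


print(by_sum, count_solutions(6, 10), comb(15, 9))
```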
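EXAMPLE 3.14 In Fig. 3.3 we find a part of the useful and interesting array of numbers called Pascal’s
triangle.
(n = 0) $\binom{0}{0}$
(n = 1) $\binom{1}{0}$ $\binom{1}{1}$
(n = 2) $\binom{2}{0}$ $\binom{2}{1}$ $\binom{2}{2}$
(n = 3) $\binom{3}{0}$ $\binom{3}{1}$ $\binom{3}{2}$ $\binom{3}{3}$
(n = 4) $\binom{4}{0}$ $\binom{4}{1}$ $\binom{4}{2}$ $\binom{4}{3}$ $\binom{4}{4}$
(n = 5) $\binom{5}{0}$ $\binom{5}{1}$ $\binom{5}{2}$ $\binom{5}{3}$ $\binom{5}{4}$ $\binom{5}{5}$
Figure 3.3
Note that in this partial listing the two triangles shown satisfy the condition that the
binomial coefficient at the bottom of the inverted triangle is the sum of the other two terms
in the triangle. This result follows from the identity in Example 3.12.
When we replace each of the binomial coefficients by its numerical value, the Pascal
triangle appears as shown in Fig. 3.4.
(n = 0)                1
(n = 1)              1   1
(n = 2)            1   2   1
(n = 3)          1   3   3   1
(n = 4)        1   4   6   4   1
(n = 5)      1   5  10  10   5   1
Figure 3.4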
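The rows of Fig. 3.4 can be generated from the identity of Example 3.12, each interior entry being the sum of the two entries above it. This is a minimal Python sketch; the function name `pascal_rows` is an assumption made for illustration.

```python
def pascal_rows(last_n):
    """Rows n = 0, ..., last_n of Pascal's triangle, each built from the row above."""
    rows = [[1]]
    for _ in range(last_n):
        prev = rows[-1]
        # Interior entries are sums of two adjacent entries of the previous row;
        # the two end entries are always 1.
        rows.append([1] + [prev[r - 1] + prev[r] for r in range(1, len(prev))] + [1])
    return rows


for n, row in enumerate(pascal_rows(5)):
    print(f"(n = {n})", *row)
```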
There are certain sets of numbers that appear frequently throughout the text. Conse-
quently, we close this section by assigning them the following designations.
a) Z = the set of integers = {0, 1, −1, 2, −2, 3, −3, ...}
b) N = the set of nonnegative integers or natural numbers = {0, 1, 2, 3, ...}
c) Z⁺ = the set of positive integers = {1, 2, 3, ...} = {x ∈ Z | x > 0}
d) Q = the set of rational numbers = {a/b | a, b ∈ Z, b ≠ 0}
e) Q⁺ = the set of positive rational numbers = {r ∈ Q | r > 0}
f) Q* = the set of nonzero rational numbers
g) R = the set of real numbers
h) R⁺ = the set of positive real numbers
i) R* = the set of nonzero real numbers
j) C = the set of complex numbers = {x + yi | x, y ∈ R, i² = −1}
k) C* = the set of nonzero complex numbers
l) For each n ∈ Z⁺, Z_n = {0, 1, 2, ..., n − 1}
m) For real numbers a, b with a < b, [a, b] = {x ∈ R | a ≤ x ≤ b},
(a, b) = {x ∈ R | a < x < b}, [a, b) = {x ∈ R | a ≤ x < b},
(a, b] = {x ∈ R | a < x ≤ b}. The first set is called a closed interval, the second
set an open interval, and the other two sets half-open intervals.
EXERCISES 3.1
1. Which of the following sets are equal?
a) {1, 2, 3} b) {3, 2, 1, 3}
c) {3, 1, 2, 3} d) {1, 2, 2, 3}
2. Let A = {1, {1}, {2}}. Which of the following statements are true?
a) 1 ∈ A b) {1} ∈ A
c) {1} ⊆ A d) {{1}} ⊆ A
e) {2} ∈ A f) {2} ⊆ A
g) {{2}} ⊆ A h) {{2}} ⊂ A
3. For A = {1, 2, {2}}, which of the eight statements in Exercise 2 are true?
4. Which of the following statements are true?
a) ∅ ∈ ∅ b) ∅ ⊂ ∅ c) ∅ ⊆ ∅
d) ∅ ∈ {∅} e) ∅ ⊂ {∅} f) ∅ ⊆ {∅}
5. Determine all of the elements in each of the following sets.
a) {1 + (−1)^n | n ∈ N}
b) {n + (1/n) | n ∈ {1, 2, 3, 5, 7}}
c) {n³ + n² | n ∈ {0, 1, 2, 3, 4}}
6. Consider the following six subsets of Z.
A = {2m + 1 | m ∈ Z} B = {2n + 3 | n ∈ Z}
C = {2p − 3 | p ∈ Z} D = {3r + 1 | r ∈ Z}
E = {3s + 2 | s ∈ Z} F = {3t − 2 | t ∈ Z}
Which of the following statements are true and which are false?
a) A = B b) A = C c) B = C
d) D = E e) D = F f) E = F
7. Let A, B be sets from a universe 𝒰. (a) Write a quantified statement to express the
proper subset relation A ⊂ B. (b) Negate the result in part (a) to determine when A ⊄ B.
8. For A = {1, 2, 3, 4, 5, 6, 7}, determine the number of
a) subsets of A
b) nonempty subsets of A
c) proper subsets of A
d) nonempty proper subsets of A
e) subsets of A containing three elements
f) subsets of A containing 1, 2
g) subsets of A containing five elements, including 1, 2
h) subsets of A with an even number of elements
i) subsets of A with an odd number of elements
9. a) If a set A has 63 proper subsets, what is |A|?
b) If a set B has 64 subsets of odd cardinality, what is |B|?
c) Generalize the result of part (b).
10. Which of the following sets are nonempty?
a) {x | x ∈ N, 2x + 7 = 3}
b) {x ∈ Z | 3x + 5 = 9}
c) {x | x ∈ Q, x² + 4 = 6}
d) {x ∈ R | x² + 4 = 6}
e) {x ∈ R | x² + 3x + 3 = 0}
f) {x | x ∈ C, x² + 3x + 3 = 0}
11. When she is about to leave a restaurant counter, Mrs. Albanese sees that she has one
penny, one nickel, one dime, one quarter, and one half-dollar. In how many ways can she leave
some (at least one) of her coins for a tip if (a) there are no restrictions? (b) she wants to
have some change left? (c) she wants to leave at least 10 cents?
12. Let A = {1, 2, 3, 4, 5, 7, 8, 10, 11, 14, 17, 18}.
a) How many subsets of A contain six elements?
b) How many six-element subsets of A contain four even integers and two odd integers?
c) How many subsets of A contain only odd integers?
13. Let S = {1, 2, 3, ..., 29, 30}. How many subsets A of S satisfy (a) |A| = 5? (b) |A| = 5
and the smallest element in A is 5? (c) |A| = 5 and the smallest element in A is less than 5?
14. a) How many subsets of {1, 2, 3, ..., 11} contain at least one even integer?
b) How many subsets of {1, 2, 3, ..., 12} contain at least one even integer?
c) Generalize the results of parts (a) and (b).
15. Give an example of three sets W, X, Y such that W ∈ X and X ∈ Y but W ∉ Y.
16. Write the next three rows for the Pascal triangle shown in Fig. 3.4.
17. Complete the proof of Theorem 3.1.
18. For sets A, B, C ⊆ 𝒰, prove or disprove (with a counterexample), the following:
If A ⊆ B, B ⊈ C, then A ⊈ C.
19. In part (i) of Fig. 3.5 we have the first six rows of Pascal’s triangle, where a hexagon
centered at 4 appears in the last three rows. If we consider the six numbers (around 4) at the
vertices of this hexagon, we find that the two alternating triples (namely, 3, 1, 10 and
1, 5, 6) satisfy 3 · 1 · 10 = 30 = 1 · 5 · 6. Part (ii) of the figure contains rows 4 through 7
of Pascal’s triangle. Here we find a hexagon centered at 10, and the alternating triples
at the vertices (in this case, 4, 10, 15 and 6, 20, 5) satisfy 4 · 10 · 15 = 600 = 6 · 20 · 5.
a) Conjecture the general result suggested by these two examples.
b) Verify the conjecture in part (a).
Figure 3.5 [(i) The first six rows of Pascal’s triangle, with a hexagon centered at 4 in the last
three rows. (ii) Rows 4 through 7 of Pascal’s triangle, with a hexagon centered at 10.]
20. a) Among the strictly increasing sequences of integers that start with 1 and end with 7 are:
i) 1, 7 ii) 1, 3, 4, 7 iii) 1, 2, 4, 5, 6, 7
How many such strictly increasing sequences of integers start with 1 and end with 7?
b) How many strictly increasing sequences of integers start with 3 and end with 9?
c) How many strictly increasing sequences of integers start with 1 and end with 37? How many
start with 62 and end with 98?
d) Generalize the results in parts (a) through (c).
21. One quarter of the five-element subsets of {1, 2, 3, ..., n} contain the element 7.
Determine n (≥ 5).
22. For a given universe 𝒰, let A ⊆ 𝒰 where A is finite with |𝒫(A)| = n. If B ⊆ 𝒰, how many
subsets does B have, if (a) B = A ∪ {x}, where x ∈ 𝒰 − A? (b) B = A ∪ {x, y},
where x, y ∈ 𝒰 − A? (c) B = A ∪ {x_1, x_2, ..., x_k}, where x_1, x_2, ..., x_k ∈ 𝒰 − A?
23. Determine which row of Pascal’s triangle contains three consecutive entries that are in
the ratio 1 : 2 : 3.
24. Use the recursive technique of Example 3.9 to develop a Gray code for the 16 binary
strings of length 4. Then list each of the 16 subsets of the ordered set {w, x, y, z} next to
its corresponding binary string.
25. Suppose that A contains the elements v, w, x, y, z and no others. If a given Gray code for
the 32 subsets of A encodes the ordered set {v, w} as 01100 and the ordered set {x, y} as
10001, write A as the corresponding ordered set.
26. For positive integers n, r show that
$\binom{n+r+1}{r} = \binom{n+r}{r} + \binom{n+r-1}{r-1} + \cdots + \binom{n+2}{2} + \binom{n+1}{1} + \binom{n}{0}$
$= \binom{n+r}{n} + \binom{n+r-1}{n} + \cdots + \binom{n+2}{n} + \binom{n+1}{n} + \binom{n}{n}$.
27. In the original abstract set theory formulated by Georg Cantor (1845-1918), a set was
defined as “any collection into a whole of definite and separate objects of our intuition or
our thought.” Unfortunately, in 1901, this definition led Bertrand Russell (1872-1970) to the
discovery of a contradiction, a result now known as Russell’s paradox, and this struck at the
very heart of the theory of sets. (But since then several ways have been found to define the
basic ideas of set theory so that this contradiction no longer comes about.)
Russell’s paradox arises when we concern ourselves with whether a set can be an element of
itself. For example, the set of all positive integers is not a positive integer, or Z⁺ ∉ Z⁺.
But the set of all abstractions is an abstraction.
Now in order to develop the paradox let S be the set of all sets A that are not members of
themselves; that is, S = {A | A is a set ∧ A ∉ A}.
a) Show that if S ∈ S, then S ∉ S.
b) Show that if S ∉ S, then S ∈ S.
The results in parts (a) and (b) show us that we must avoid trying to define sets like S. To do
so we must restrict the types of elements that can be members of a set. (More about this is
mentioned in the Summary and Historical Review in Section 3.8.)
28. Let A = {1, 2, 3, ..., 39, 40}.
a) Write a computer program (or develop an algorithm) to generate a random six-element
subset of A.
b) For B = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37}, write a computer program (or develop
an algorithm) to generate a random six-element subset of A and then determine whether it is a
subset of B.
29. Let A = {1, 2, 3, ..., 7}. Write a computer program (or develop an algorithm) that lists
all the subsets B of A, where |B| = 4.
30. Write a computer program (or develop an algorithm) that lists all the subsets of
{1, 2, 3, ..., n}, where 1 ≤ n ≤ 10. (The value of n should be supplied during program execution.)
3.2
Set Operations and the Laws of Set Theory
After learning how to count, a student usually faces methods for combining counting numbers.
First this is accomplished through addition. Usually the student’s world of arithmetic
revolves about the set Z⁺ (or a subset of Z⁺ that can be spoken and written about, as well
as punched out on a hand-held calculator) wherein the addition of two elements from Z⁺
results in a third element of Z⁺, called the sum. Hence the student can concentrate on addition
without having to enlarge his or her arithmetic world beyond Z⁺. This is also true for
the operation of multiplication.
The addition and multiplication of positive integers are said to be closed binary operations
on Z⁺. For example, when we compute a + b, for a, b ∈ Z⁺, there are two
operands, namely, a and b. Hence the operation is called binary. And since a + b ∈ Z⁺
when a, b ∈ Z⁺, we say that the binary operation of addition (on Z⁺) is closed. The binary
operation of (nonzero) division, however, is not closed for Z⁺; we find, for example, that
1/2 (= 1 ÷ 2) ∉ Z⁺, even though 1, 2 ∈ Z⁺. Yet this operation is closed when we consider
the set Q⁺ instead of the set Z⁺.
We now introduce the following binary operations for sets.
Definition 3.5 For A, B ⊆ 𝒰 we define the following:
a) A ∪ B (the union of A and B) = {x | x ∈ A ∨ x ∈ B}.
b) A ∩ B (the intersection of A and B) = {x | x ∈ A ∧ x ∈ B}.
c) A △ B (the symmetric difference of A and B) = {x | (x ∈ A ∨ x ∈ B) ∧ x ∉ A ∩ B} =
{x | x ∈ A ∪ B ∧ x ∉ A ∩ B}.
Note that if A, B ⊆ 𝒰, then A ∪ B, A ∩ B, A △ B ⊆ 𝒰. Consequently, ∪, ∩, and △ are
closed binary operations on 𝒫(𝒰), and we may also say that 𝒫(𝒰) is closed under these
(binary) operations.
EXAMPLE 3.15 With 𝒰 = {1, 2, 3, ..., 9, 10}, A = {1, 2, 3, 4, 5}, B = {3, 4, 5, 6, 7}, and C = {7, 8, 9},
we have:
a) A ∩ B = {3, 4, 5} b) A ∪ B = {1, 2, 3, 4, 5, 6, 7}
c) B ∩ C = {7} d) A ∩ C = ∅
e) A △ B = {1, 2, 6, 7} f) A ∪ C = {1, 2, 3, 4, 5, 7, 8, 9}
g) A △ C = {1, 2, 3, 4, 5, 7, 8, 9}
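The operations of Definition 3.5 correspond to built-in operators on Python sets, so the computations of Example 3.15 can be reproduced as a quick check. This sketch uses the operators `&` (intersection), `|` (union), and `^` (symmetric difference); the variable names simply mirror the example.

```python
U = set(range(1, 11))
A = {1, 2, 3, 4, 5}
B = {3, 4, 5, 6, 7}
C = {7, 8, 9}

print(A & B)  # intersection of A and B
print(A | B)  # union of A and B
print(B & C)  # intersection of B and C
print(A & C)  # intersection of A and C (no common elements)
print(A ^ B)  # symmetric difference of A and B
print(A | C)  # union of A and C
print(A ^ C)  # symmetric difference of A and C
```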
In Example 3.15 we see that A ∩ B ⊆ A ⊆ A ∪ B. This result is not special for just this
example but is true in general. The result follows because
x ∈ A ∩ B ⇒ (x ∈ A ∧ x ∈ B) ⇒ x ∈ A
(by the Rule of Conjunctive Simplification, Rule 7 of Table 2.19), and
x ∈ A ⇒ (x ∈ A ∨ x ∈ B) ⇒ x ∈ A ∪ B
(where the first logical implication is a result of the Rule of Disjunctive Amplification,
Rule 8 of Table 2.19).
Motivated by parts (d), (f), and (g) of Example 3.15, we introduce the following general
ideas.
Definition 3.6 Let S, T ⊆ 𝒰. The sets S and T are called disjoint, or mutually disjoint, when S ∩ T = ∅.
THEOREM 3.3 If S, T ⊆ 𝒰, then S and T are disjoint if and only if S ∪ T = S △ T.
Proof: We start with S, T disjoint. (To prove that S ∪ T = S △ T we use Definition 3.2.
In particular, we shall provide two element arguments, one for each inclusion.) Consider
each x in 𝒰. If x ∈ S ∪ T, then x ∈ S or x ∈ T (or perhaps both). But with S and T
disjoint, x ∉ S ∩ T so x ∈ S △ T. Consequently, because x ∈ S ∪ T implies x ∈ S △ T,
we have S ∪ T ⊆ S △ T. For the opposite inclusion, if y ∈ S △ T, then y ∈ S or y ∈ T.
(But y ∉ S ∩ T; we don’t actually use this here.) So y ∈ S ∪ T. Therefore S △ T ⊆ S ∪ T.
And now that we have S ∪ T ⊆ S △ T and S △ T ⊆ S ∪ T, it follows from Definition 3.2
that S △ T = S ∪ T.
We prove the converse by the method of proof by contradiction. To do so we consider
any S, T ⊆ 𝒰 and keep the hypothesis (that is, that S ∪ T = S △ T) as is, but we assume
the negation of the conclusion (that is, we assume that S and T are not disjoint). So if
S ∩ T ≠ ∅, let x ∈ S ∩ T. Then x ∈ S and x ∈ T, so x ∈ S ∪ T and
x ∈ S △ T (= S ∪ T).
But when x ∈ S ∪ T and x ∈ S ∩ T, then
x ∉ S △ T.
From this contradiction (namely, x ∈ S △ T ∧ x ∉ S △ T) we realize that our original
assumption was incorrect. Consequently, we have S and T disjoint.
In proving the first part of Theorem 3.3 we showed that if S, T are any sets, then
S △ T ⊆ S ∪ T. The disjointness of S and T was needed only for the opposite inclusion.
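Theorem 3.3 can be checked exhaustively on a small universe. The following Python sketch is only a finite sanity check, not a substitute for the proof; it assumes the universe {1, 2, 3, 4}, and the helper name `power_set` is hypothetical.

```python
from itertools import combinations


def power_set(universe):
    """Return all subsets of `universe` as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for k in range(len(items) + 1) for c in combinations(items, k)]


subsets = power_set({1, 2, 3, 4})

# Theorem 3.3: S and T are disjoint exactly when their union equals their symmetric difference.
theorem_holds = all(
    S.isdisjoint(T) == ((S | T) == (S ^ T)) for S in subsets for T in subsets
)
# Remark after the proof: the symmetric difference is always contained in the union.
inclusion_holds = all((S ^ T) <= (S | T) for S in subsets for T in subsets)
print(theorem_holds, inclusion_holds)
```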
After mastering the skill of addition, one usually comes next to subtraction. Here the set
N causes some difficulty. For example, N contains 2 and 5 but 2 − 5 = −3, and −3 ∉ N.
Therefore the binary operation of subtraction is not closed for N, although it is closed for
the superset Z of N. So for Z we can introduce the unary, or monary, operation of negation
where we take the “minus” or “negative” of a number such as 3, getting −3.
We now introduce a comparable unary operation for sets.
Definition 3.7 For a set A ⊆ 𝒰, the complement of A, denoted 𝒰 − A, or $\overline{A}$, is given by {x | x ∈ 𝒰 ∧ x ∉ A}.
EXAMPLE 3.16 For the sets of Example 3.15, $\overline{A}$ = {6, 7, 8, 9, 10}, $\overline{B}$ = {1, 2, 8, 9, 10}, and
$\overline{C}$ = {1, 2, 3, 4, 5, 6, 10}.
For every universe 𝒰 and every set A ⊆ 𝒰, we find that $\overline{A}$ ⊆ 𝒰. Therefore 𝒫(𝒰) is
closed under the unary operation defined by the complement.
The following concept is related to the concept of the complement.
Definition 3.8 For A, B ⊆ 𝒰, the (relative) complement of A in B, denoted B − A, is given by
{x | x ∈ B ∧ x ∉ A}.
EXAMPLE 3.17 For the sets of Example 3.15 we have:
a) B − A = {6, 7} b) A − B = {1, 2} c) A − C = A
d) C − A = C e) A − A = ∅ f) 𝒰 − A = $\overline{A}$
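Complements and relative complements, as in Examples 3.16 and 3.17, correspond to set difference in Python. In this sketch the helper `complement` is a hypothetical name, and the universe must be supplied explicitly because a Python set has no built-in notion of a universe.

```python
U = set(range(1, 11))
A = {1, 2, 3, 4, 5}
B = {3, 4, 5, 6, 7}
C = {7, 8, 9}


def complement(S, universe=U):
    """Complement of S relative to the given universe."""
    return universe - S


print(complement(A), complement(B), complement(C))
print(B - A, A - B)                            # relative complements
print(A - C == A, C - A == C, A - A == set())  # parts (c), (d), (e) of Example 3.17
```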
In order to motivate our next theorem, we first consider the following.
EXAMPLE 3.18 For 𝒰 = R, let A = [1, 2] and B = [1, 3). Then we find that
a) A = {x | 1 ≤ x ≤ 2} ⊆ {x | 1 ≤ x < 3} = B
b) A ∪ B = {x | 1 ≤ x < 3} = B
c) A ∩ B = {x | 1 ≤ x ≤ 2} = A
d) $\overline{B}$ = (−∞, 1) ∪ [3, +∞) ⊆ (−∞, 1) ∪ (2, +∞) = $\overline{A}$
This next theorem now shows us that the four results in Example 3.18 are related in
general. In order to prove this theorem we again make use of Definition 3.2, as we discover
the interplay between the notions of subset, union, intersection, and complement.
THEOREM 3.4 For any universe 𝒰 and any sets A, B ⊆ 𝒰, the following statements are equivalent:
a) A ⊆ B b) A ∪ B = B
c) A ∩ B = A d) $\overline{B}$ ⊆ $\overline{A}$
Proof: In order to prove the theorem, we prove that (a) ⇒ (b), (b) ⇒ (c), (c) ⇒ (d), and
(d) ⇒ (a). (The reason this suffices to prove this theorem is based on the idea presented in
Exercise 13 at the end of Section 2.2.)
i) (a) ⇒ (b) If A, B are any sets, then B ⊆ A ∪ B (as mentioned after Example 3.15).
For the opposite inclusion, if x ∈ A ∪ B, then x ∈ A or x ∈ B, but since A ⊆ B, in
either case we have x ∈ B. So A ∪ B ⊆ B and, since we now have both inclusions,
it follows (once again from Definition 3.2) that A ∪ B = B.
ii) (b) ⇒ (c) Given sets A, B, we always have A ⊇ A ∩ B (as mentioned after
Example 3.15). For the opposite inclusion, let y ∈ A. With A ∪ B = B, y ∈ A ⇒ y ∈
A ∪ B ⇒ y ∈ B (since A ∪ B = B) ⇒ y ∈ A ∩ B, so A ⊆ A ∩ B and we conclude
that A = A ∩ B.
iii) (c) ⇒ (d) We know that z ∈ $\overline{B}$ ⇒ z ∉ B. Now if z ∈ A ∩ B, then z ∈ B, since
A ∩ B ⊆ B. The contradiction (namely, z ∉ B ∧ z ∈ B) tells us that z ∉ A ∩ B.
Therefore, z ∉ A because A ∩ B = A. But z ∉ A ⇒ z ∈ $\overline{A}$, so $\overline{B}$ ⊆ $\overline{A}$.
iv) (d) ⇒ (a) Last, w ∈ A ⇒ w ∉ $\overline{A}$. If w ∉ B, then w ∈ $\overline{B}$. With $\overline{B}$ ⊆ $\overline{A}$ it then follows
that w ∈ $\overline{A}$. This time we get the contradiction w ∉ $\overline{A}$ ∧ w ∈ $\overline{A}$, and this tells us that
w ∈ B. Hence A ⊆ B.
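The equivalence asserted in Theorem 3.4 can be illustrated by evaluating the four statements for every pair of subsets of a small universe. This Python sketch assumes the universe {1, 2, 3, 4}; the function name `four_statements` is made up for the illustration, and the check complements the proof rather than replacing it.

```python
from itertools import combinations

U = frozenset({1, 2, 3, 4})
subsets = [frozenset(c) for k in range(len(U) + 1) for c in combinations(sorted(U), k)]


def four_statements(A, B):
    """Truth values of statements (a), (b), (c), (d) of Theorem 3.4 for A, B within U."""
    return (A <= B, (A | B) == B, (A & B) == A, (U - B) <= (U - A))


# The statements are equivalent when all four truth values agree for every pair (A, B).
equivalent = all(len(set(four_statements(A, B))) == 1 for A in subsets for B in subsets)
print(equivalent)
```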
With a bit of theorem proving under our belts, we now introduce some of the major laws
that govern set theory. These bear a marked resemblance to the laws of logic given in Section
2.2. In many instances these set theoretic laws are similar to the arithmetic properties of the
real numbers, where “∪” plays the role of “+” and “∩” the role of “×.” However, there are
several differences.
The Laws of Set Theory
For any sets A, B, and C taken from a universe 𝒰
1) $\overline{\overline{A}}$ = A                                   Law of Double Complement
2) $\overline{A \cup B}$ = $\overline{A}$ ∩ $\overline{B}$         DeMorgan’s Laws
   $\overline{A \cap B}$ = $\overline{A}$ ∪ $\overline{B}$
3) A ∪ B = B ∪ A                                                   Commutative Laws
   A ∩ B = B ∩ A
4) A ∪ (B ∪ C) = (A ∪ B) ∪ C                                       Associative Laws
   A ∩ (B ∩ C) = (A ∩ B) ∩ C
5) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)                                 Distributive Laws
   A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
6) A ∪ A = A                                                       Idempotent Laws
   A ∩ A = A
7) A ∪ ∅ = A                                                       Identity Laws
   A ∩ 𝒰 = A
8) A ∪ $\overline{A}$ = 𝒰                                          Inverse Laws
   A ∩ $\overline{A}$ = ∅
9) A ∪ 𝒰 = 𝒰                                                       Domination Laws
   A ∩ ∅ = ∅
10) A ∪ (A ∩ B) = A                                                Absorption Laws
    A ∩ (A ∪ B) = A
All these laws can be established by element arguments, as in the first part of the proof
of Theorem 3.3. We demonstrate this by establishing the first of DeMorgan’s Laws and the
second Distributive Law, that of intersection over union.
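Proof: Let $x \in \mathcal{U}$. Then
$x \in \overline{A \cup B} \Rightarrow x \notin A \cup B$
$\Rightarrow x \notin A$ and $x \notin B$
$\Rightarrow x \in \overline{A}$ and $x \in \overline{B}$
$\Rightarrow x \in \overline{A} \cap \overline{B}$,
so $\overline{A \cup B} \subseteq \overline{A} \cap \overline{B}$. To establish the opposite inclusion, we check to see that the converse of
each logical implication is also a logical implication (that is, that each logical implication
is, in fact, a logical equivalence). As a result we find that
$x \in \overline{A} \cap \overline{B} \Rightarrow x \in \overline{A}$ and $x \in \overline{B}$
$\Rightarrow x \notin A$ and $x \notin B$
$\Rightarrow x \notin A \cup B$
$\Rightarrow x \in \overline{A \cup B}$.
Therefore $\overline{A} \cap \overline{B} \subseteq \overline{A \cup B}$. Consequently, with $\overline{A \cup B} \subseteq \overline{A} \cap \overline{B}$ and $\overline{A} \cap \overline{B} \subseteq \overline{A \cup B}$, it
follows from Definition 3.2 that $\overline{A \cup B} = \overline{A} \cap \overline{B}$.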
In our second proof, we shall establish both subset relations simultaneously by using the
logical equivalence ($\Leftrightarrow$) as opposed to the logical implications ($\Rightarrow$ and $\Leftarrow$).
Proof: For each $x \in \mathcal{U}$,
$x \in A \cap (B \cup C)$
$\Leftrightarrow (x \in A)$ and $(x \in B \cup C)$
$\Leftrightarrow (x \in A)$ and $(x \in B$ or $x \in C)$
$\Leftrightarrow (x \in A$ and $x \in B)$ or $(x \in A$ and $x \in C)$
$\Leftrightarrow (x \in A \cap B)$ or $(x \in A \cap C)$
$\Leftrightarrow x \in (A \cap B) \cup (A \cap C)$.
As we have equivalent statements throughout, we have established both subset relations
simultaneously, so $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$. (The equivalence of the third and
fourth statements follows from the comparable principle in the laws of logic: namely, the
Distributive Law of conjunction over disjunction.)
The reader undoubtedly expects the pairing of the laws in items 2 through 10 to have
some importance. As with the laws of logic, these pairs of statements are called duals. One
statement can be obtained from the other by replacing all occurrences of $\cup$ by $\cap$ and vice
versa, and all occurrences of $\mathcal{U}$ by $\emptyset$ and vice versa.
This leads us to the following formal idea.
Definition 3.9 Let $s$ be a (general) statement dealing with the equality of two set expressions. Each such
expression may involve one or more occurrences of sets (such as $A$, $\overline{A}$, $B$, $\overline{B}$, etc.), one or
more occurrences of $\emptyset$ and $\mathcal{U}$, and only the set operation symbols $\cap$ and $\cup$. The dual of $s$,
denoted $s^d$, is obtained from $s$ by replacing (1) each occurrence of $\emptyset$ and $\mathcal{U}$ (in $s$) by $\mathcal{U}$ and
$\emptyset$, respectively; and (2) each occurrence of $\cap$ and $\cup$ (in $s$) by $\cup$ and $\cap$, respectively.
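The replacement rule in Definition 3.9 is purely symbolic, so it can be sketched as a character-by-character substitution on a written statement. In the sketch below the function name `dual` is hypothetical, the plain letter "U" stands for the universe, and we assume no set in the statement is itself named U.

```python
# Swap ∪ with ∩ and ∅ with the universe symbol, as in Definition 3.9.
# Assumption: the plain letter "U" denotes the universe and is not used as a set name.
SWAP = str.maketrans({"∪": "∩", "∩": "∪", "∅": "U", "U": "∅"})


def dual(statement):
    """Return the dual of a set-equality statement written with ∪, ∩, ∅, U."""
    return statement.translate(SWAP)


print(dual("A ∪ ∅ = A"))        # first Identity Law
print(dual("A ∪ (A ∩ B) = A"))  # first Absorption Law
```

Applying `dual` twice returns the original statement, which mirrors the pairing of the laws in items 2 through 10.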
As in Section 2.2, we shall state and use the following theorem. We shall prove a more
general result in Chapter 15.
THEOREM 3.5 The Principle of Duality. Let $s$ denote a theorem dealing with the equality of two set
expressions (involving only the set operations $\cap$ and $\cup$ as described in Definition 3.9). Then
$s^d$, the dual of $s$, is also a theorem.
Using this principle cuts our work down considerably. For each pair of laws in items
2 through 10, one need prove only one of the statements and then invoke this principle to
obtain the other statement in the pair.
We must be careful about applying Theorem 3.5. This result cannot be applied to particular
situations but only to results (theorems) about sets in general. For example, let
us consider the particular situation where $\mathcal{U} = \{1, 2, 3, 4, 5\}$ and $A = \{1, 2, 3, 4\}$, $B =
\{1, 2, 3, 5\}$, $C = \{1, 2\}$, and $D = \{1, 3\}$. Under these circumstances
$A \cap B = \{1, 2, 3\} = C \cup D$.
However, we cannot infer that $s$: $A \cap B = C \cup D \Rightarrow s^d$: $A \cup B = C \cap D$. For here
$A \cup B = \{1, 2, 3, 4, 5\}$, whereas $C \cap D = \{1\}$. The reason why Theorem 3.5 is not applicable
here is that although $A \cap B = C \cup D$ in this particular example, it is not true in
general (that is, for any sets $A$, $B$, $C$, and $D$ taken from a universe $\mathcal{U}$).
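The particular situation above is easy to check directly. The following sketch uses the sets given in the text; the variable names `statement` and `dual_statement` are ours.

```python
U = {1, 2, 3, 4, 5}
A, B, C, D = {1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2}, {1, 3}

statement = (A & B) == (C | D)       # s:   A ∩ B = C ∪ D
dual_statement = (A | B) == (C & D)  # s^d: A ∪ B = C ∩ D

print(statement, dual_statement)
```

Here `statement` holds while `dual_statement` fails, which is consistent with the warning that duality applies to general theorems and not to coincidences among particular sets.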
EXAMPLE 3.19 Inasmuch as Definition 3.9 and Theorem 3.5 do not mention anything about subsets, can
we find a dual for the statement $A \subseteq B$ (where $A, B \subseteq \mathcal{U}$)?
Here we get an opportunity to use some of the results in Theorem 3.4. We can deal with
the statement $A \subseteq B$ by using the equivalent statement $A \cup B = B$.
The dual of $A \cup B = B$ gives us $A \cap B = B$. But $A \cap B = B \Leftrightarrow B \subseteq A$. Consequently,
the dual of the statement $A \subseteq B$ is the statement $B \subseteq A$. (We could also have obtained this
result by using $A \subseteq B \Leftrightarrow A \cap B = A$.)
When we consider the relations that may exist among the sets that are involved in a
set-equality or subset statement, we can investigate the situation graphically.
Named in honor of the English logician John Venn (1834-1923), a Venn diagram is
constructed as follows: $\mathcal{U}$ is depicted as the interior of a rectangle, while subsets of $\mathcal{U}$
are represented by the interiors of circles and other closed curves. Figure 3.6 shows four
Venn diagrams. The (blue) shaded region in Fig. 3.6(a) represents the set $A$, whereas $\overline{A}$ is
represented by the unshaded area. The shaded region in Fig. 3.6(b) comprises $A \cup B$; the
set $A \cap B$ is represented by the shaded region in Fig. 3.6(c). The Venn diagram for $A - B$
is given in part (d) of this figure.
In Fig. 3.7 Venn diagrams are used to establish the second of DeMorgan’s Laws. Figure
3.7(a) has everything except $A \cap B$ shaded, so the shaded portion represents $\overline{A \cap B}$. We
now develop a Venn diagram to depict $\overline{A} \cup \overline{B}$. In Fig. 3.7(b), $\overline{A}$ is the shaded region (outside
the circle representing set $A$). Likewise, $\overline{B}$ is the shaded region shown in Fig. 3.7(c). When
the results from Fig. 3.7(b) and Fig. 3.7(c) are put together, we get the Venn diagram for
their union in Fig. 3.7(d). Since the shaded region in part (d) is the same as that in part (a),
it follows that $\overline{A \cap B} = \overline{A} \cup \overline{B}$.
Figure 3.6 [Venn diagrams, parts (a) through (d)]
Figure 3.7 [Venn diagrams, parts (a) through (d)]
We further illustrate the use of these diagrams by showing that for any sets $A, B, C \subseteq \mathcal{U}$,
$\overline{(A \cup B) \cap C} = (\overline{A} \cap \overline{B}) \cup \overline{C}$.
Instead of shading regions, another approach that also uses Venn diagrams numbers the
regions as shown in Fig. 3.8 where, for example, region 3 is $\overline{A} \cap B \cap \overline{C}$ and region 7 is
$A \cap \overline{B} \cap C$. Each region is a set of the form $S_1 \cap S_2 \cap S_3$, where $S_1$ is replaced by $A$ or
$\overline{A}$, $S_2$ by $B$ or $\overline{B}$, and $S_3$ by $C$ or $\overline{C}$. Consequently, by the rule of product, there are eight
possible regions.
Figure 3.8 [Venn diagram for $A$, $B$, $C$ with its eight regions numbered 1 through 8]
Consulting Fig. 3.8, we see that $A \cup B$ comprises regions 2, 3, 5, 6, 7, 8 and that regions
4, 6, 7, 8 make up set $C$. Therefore $(A \cup B) \cap C$ comprises the regions common to $A \cup B$
and $C$: namely, regions 6, 7, 8. Consequently, $\overline{(A \cup B) \cap C}$ is made up of regions 1, 2, 3, 4,
5. The set $\overline{A}$ consists of regions 1, 3, 4, 6, while regions 1, 2, 4, 7 make up $\overline{B}$. Consequently,
$\overline{A} \cap \overline{B}$ comprises regions 1 and 4. Since regions 4, 6, 7, 8 comprise $C$, the set $\overline{C}$ is made
up of regions 1, 2, 3, 5. Taking the union of $\overline{A} \cap \overline{B}$ with $\overline{C}$, we then finish with regions 1,
2, 3, 4, 5, as we did for $\overline{(A \cup B) \cap C}$.
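The region bookkeeping above can be reproduced mechanically. In the sketch below each region number is mapped to a triple recording membership in $A$, $B$, $C$; the mapping is inferred from the region lists quoted in the text (for instance, $C$ is regions 4, 6, 7, 8), and the names `REGIONS`, `lhs`, and `rhs` are our own.

```python
# Region number -> (in A, in B, in C), inferred from the region lists in the text.
REGIONS = {
    1: (0, 0, 0), 2: (1, 0, 0), 3: (0, 1, 0), 4: (0, 0, 1),
    5: (1, 1, 0), 6: (0, 1, 1), 7: (1, 0, 1), 8: (1, 1, 1),
}
ALL = set(REGIONS)  # plays the role of the universe

A = {r for r, (a, b, c) in REGIONS.items() if a}
B = {r for r, (a, b, c) in REGIONS.items() if b}
C = {r for r, (a, b, c) in REGIONS.items() if c}

lhs = ALL - ((A | B) & C)                  # complement of (A ∪ B) ∩ C
rhs = ((ALL - A) & (ALL - B)) | (ALL - C)  # (A' ∩ B') ∪ C'

print(sorted(lhs), sorted(rhs))
```

Both sides come out as regions 1 through 5, in agreement with the count made from Fig. 3.8.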
One more technique for establishing set equalities is the membership table. (This method
is akin to using the truth table introduced in Section 2.1.)
We observe that for sets $A, B \subseteq \mathcal{U}$, an element $x \in \mathcal{U}$ satisfies exactly one of the
following four situations:
a) $x \notin A$, $x \notin B$    b) $x \notin A$, $x \in B$
c) $x \in A$, $x \notin B$    d) $x \in A$, $x \in B$.
When $x$ is an element of a given set, we write a 1 in the column representing that set in
the membership table; when $x$ is not in the set, we enter a 0. Table 3.2 gives the membership
tables for $A \cap B$, $A \cup B$, $\overline{A}$ in this notation. Here, for example, the third row in part (a) of
the table tells us that when an element $x \in \mathcal{U}$ is in set $A$ but not in $B$, then it is not in $A \cap B$
but it is in $A \cup B$.
Table 3.2
(a)
A | B | A ∩ B | A ∪ B
0 | 0 |   0   |   0
0 | 1 |   0   |   1
1 | 0 |   0   |   1
1 | 1 |   1   |   1
(b)
A | complement of A
0 | 1
1 | 0
These binary operations on 0 and 1 are the same as in ordinary arithmetic (relative to $\cdot$
and $+$) except that $1 \cup 1 = 1$.
Using membership tables, we can establish the equality of two sets by comparing their
respective columns in the table. Table 3.3 demonstrates this for the Distributive Law of
union over intersection. We see here how each of the eight rows corresponds with exactly
one of the eight regions in the Venn diagram of Fig. 3.8. For example, row 1 corresponds
with region 1: $\overline{A} \cap \overline{B} \cap \overline{C}$; and row 6 corresponds with region 7: $A \cap \overline{B} \cap C$.
Table 3.3
A | B | C | B ∩ C | A ∪ (B ∩ C) | A ∪ B | A ∪ C | (A ∪ B) ∩ (A ∪ C)
0 | 0 | 0 |   0   |      0      |   0   |   0   |         0
0 | 0 | 1 |   0   |      0      |   0   |   1   |         0
0 | 1 | 0 |   0   |      0      |   1   |   0   |         0
0 | 1 | 1 |   1   |      1      |   1   |   1   |         1
1 | 0 | 0 |   0   |      1      |   1   |   1   |         1
1 | 0 | 1 |   0   |      1      |   1   |   1   |         1
1 | 1 | 0 |   0   |      1      |   1   |   1   |         1
1 | 1 | 1 |   1   |      1      |   1   |   1   |         1
Since the columns for $A \cup (B \cap C)$ and $(A \cup B) \cap (A \cup C)$ are identical, we conclude
that $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$.
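A membership table such as Table 3.3 can be generated rather than filled in by hand. The sketch below treats 0 and 1 as membership flags and uses Python's bitwise `|` and `&` for union and intersection; the function name `membership_rows` is our own.

```python
from itertools import product


def membership_rows():
    """Rows of Table 3.3: A, B, C, B∩C, A∪(B∩C), A∪B, A∪C, (A∪B)∩(A∪C)."""
    rows = []
    for a, b, c in product((0, 1), repeat=3):
        left = a | (b & c)         # A ∪ (B ∩ C)
        right = (a | b) & (a | c)  # (A ∪ B) ∩ (A ∪ C)
        rows.append((a, b, c, b & c, left, a | b, a | c, right))
    return rows


for row in membership_rows():
    print(*row)
```

Comparing the fifth and eighth entries of every row gives the same conclusion as the table: the two columns agree in all eight rows.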
Before we continue let us make two points. (1) A Venn diagram is simply a graphical
representation of a membership table. (2) The use of Venn diagrams and/or membership
tables may be appealing, especially to the reader who presently does not appreciate writing
proofs. However, neither one of these techniques specifies the logic and reasoning displayed
in the element arguments we presented, for instance, to prove that for any $A, B, C \subseteq \mathcal{U}$,
$\overline{A \cup B} = \overline{A} \cap \overline{B}$, and $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$.
We feel that Venn diagrams may help us to understand certain mathematical situations
— but when the number of sets involved exceeds three, the diagram could be difficult to
draw.
In summary, let us agree that the element argument (especially with its detailed explana-
tions) is more rigorous than these two techniques and is the preferred method for proving
results in set theory.
Now that we have the laws of set theory, what can we do with them? The following
examples will demonstrate how the laws are used to simplify a complicated set expression
or to derive new set equalities. (When more than one law is used in a given step, we list the
principal law as the reason.)
EXAMPLE 3.20 Simplify the expression $\overline{\overline{(A \cup B) \cap C} \cup \overline{B}}$.
$\overline{\overline{(A \cup B) \cap C} \cup \overline{B}}$    Reasons
$= \overline{\overline{(A \cup B) \cap C}} \cap \overline{\overline{B}}$    DeMorgan’s Law
$= ((A \cup B) \cap C) \cap B$    Law of Double Complement
$= (A \cup B) \cap (C \cap B)$    Associative Law of Intersection
$= (A \cup B) \cap (B \cap C)$    Commutative Law of Intersection
$= [(A \cup B) \cap B] \cap C$    Associative Law of Intersection
$= B \cap C$    Absorption Law
The reader should note the similarity between the steps and reasons in this example and
those for simplifying the statement
$\neg[\neg[(p \lor q) \land r] \lor \neg q]$
to the statement
$q \land r$
in Example 2.17.
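EXAMPLE 3.21 Express $\overline{A - B}$ in terms of $\cup$ and complement.
From the definition of relative complement, $A - B = \{x \mid x \in A \land x \notin B\} = A \cap \overline{B}$.
Therefore,
$\overline{A - B} = \overline{A \cap \overline{B}}$    Reasons
$= \overline{A} \cup \overline{\overline{B}}$    DeMorgan’s Law
$= \overline{A} \cup B$    Law of Double Complement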
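EXAMPLE 3.22 From the observation made in Example 3.21, we have $A \triangle B =
\{x \mid x \in A \cup B \land x \notin A \cap B\} = (A \cup B) - (A \cap B) = (A \cup B) \cap \overline{(A \cap B)}$, so
$\overline{A \triangle B} = \overline{(A \cup B) \cap \overline{(A \cap B)}}$    Reasons
$= \overline{(A \cup B)} \cup \overline{\overline{(A \cap B)}}$    DeMorgan’s Law
$= \overline{(A \cup B)} \cup (A \cap B)$    Law of Double Complement
$= (A \cap B) \cup \overline{(A \cup B)}$    Commutative Law of $\cup$
$= (A \cap B) \cup (\overline{A} \cap \overline{B})$    DeMorgan’s Law
$= [(A \cap B) \cup \overline{A}] \cap [(A \cap B) \cup \overline{B}]$    Distributive Law of $\cup$ over $\cap$
$= [(A \cup \overline{A}) \cap (B \cup \overline{A})] \cap [(A \cup \overline{B}) \cap (B \cup \overline{B})]$    Distributive Law of $\cup$ over $\cap$
$= [\mathcal{U} \cap (B \cup \overline{A})] \cap [(A \cup \overline{B}) \cap \mathcal{U}]$    Inverse Law
$= (B \cup \overline{A}) \cap (A \cup \overline{B})$    Identity Law
$= (\overline{A} \cup B) \cap (A \cup \overline{B})$    Commutative Law of $\cup$
$= (\overline{A} \cup B) \cap \overline{(\overline{A} \cap B)}$    DeMorgan’s Law
$= \overline{A} \triangle B$
$= (A \cup \overline{B}) \cap (\overline{A} \cup B)$    Commutative Law of $\cap$
$= (A \cup \overline{B}) \cap \overline{(A \cap \overline{B})}$    DeMorgan’s Law
$= A \triangle \overline{B}$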
In closing this section we extend the set operations of $\cup$ and $\cap$ beyond three sets.
Definition 3.10 Let $I$ be a nonempty set and $\mathcal{U}$ a universe. For each $i \in I$ let $A_i \subseteq \mathcal{U}$. Then $I$ is called an
index set (or set of indices), and each $i \in I$ is called an index.
Under these conditions,
$\bigcup_{i \in I} A_i = \{x \mid x \in A_i \text{ for at least one } i \in I\}$, and
$\bigcap_{i \in I} A_i = \{x \mid x \in A_i \text{ for every } i \in I\}$.
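We can rephrase Definition 3.10 by using quantifiers:
$x \in \bigcup_{i \in I} A_i \Leftrightarrow \exists i \in I\,(x \in A_i)$    $x \in \bigcap_{i \in I} A_i \Leftrightarrow \forall i \in I\,(x \in A_i)$
Then $x \notin \bigcup_{i \in I} A_i \Leftrightarrow \neg[\exists i \in I\,(x \in A_i)] \Leftrightarrow \forall i \in I\,(x \notin A_i)$; that is, $x \notin \bigcup_{i \in I} A_i$ if and
only if $x \notin A_i$ for every index $i \in I$. Similarly, $x \notin \bigcap_{i \in I} A_i \Leftrightarrow \neg[\forall i \in I\,(x \in A_i)] \Leftrightarrow
\exists i \in I\,(x \notin A_i)$; that is, $x \notin \bigcap_{i \in I} A_i$ if and only if $x \notin A_i$ for at least one index $i \in I$.
If the index set $I$ is the set $\mathbf{Z}^+$, we can write
$\bigcup_{i \in \mathbf{Z}^+} A_i = A_1 \cup A_2 \cup \cdots = \bigcup_{i=1}^{\infty} A_i$,    $\bigcap_{i \in \mathbf{Z}^+} A_i = A_1 \cap A_2 \cap \cdots = \bigcap_{i=1}^{\infty} A_i$.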
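EXAMPLE 3.23 Let $I = \{3, 4, 5, 6, 7\}$, and for each $i \in I$ let $A_i = \{1, 2, 3, \ldots, i\} \subseteq \mathcal{U} = \mathbf{Z}^+$.
Then $\bigcup_{i \in I} A_i = \bigcup_{i=3}^{7} A_i = \{1, 2, 3, \ldots, 7\} = A_7$, whereas $\bigcap_{i \in I} A_i = \{1, 2, 3\} = A_3$.
EXAMPLE 3.24 Let $\mathcal{U} = \mathbf{R}$ and $I = \mathbf{R}^+$. If for each $r \in \mathbf{R}^+$, $A_r = [-r, r]$, then $\bigcup_{r \in I} A_r = \mathbf{R}$ and
$\bigcap_{r \in I} A_r = \{0\}$.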
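For a finite index set, the generalized union and intersection of Definition 3.10 can be computed directly. The sketch below reproduces Example 3.23; the dictionary `A` keyed by index is simply one convenient representation, not anything prescribed by the text.

```python
I = range(3, 8)                            # index set {3, 4, 5, 6, 7}
A = {i: set(range(1, i + 1)) for i in I}   # A_i = {1, 2, ..., i}

union = set().union(*A.values())              # x is in A_i for at least one i
intersection = set.intersection(*A.values())  # x is in A_i for every i

print(sorted(union), sorted(intersection))
```

The infinite index set of Example 3.24 cannot be enumerated this way, which is one reason the element argument remains the tool of choice there.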
When dealing with generalized unions and intersections, membership tables and Venn
diagrams are unfortunately next to useless, but the rigorous element approach, as demon-
strated in the first part of the proof of Theorem 3.3, is still available.
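THEOREM 3.6 Generalized DeMorgan’s Laws. Let $I$ be an index set where for each $i \in I$, $A_i \subseteq \mathcal{U}$. Then
a) $\overline{\bigcup_{i \in I} A_i} = \bigcap_{i \in I} \overline{A_i}$    b) $\overline{\bigcap_{i \in I} A_i} = \bigcup_{i \in I} \overline{A_i}$
Proof: We shall prove Theorem 3.6(a) and leave the proof of part (b) for the reader. For each
$x \in \mathcal{U}$, $x \in \overline{\bigcup_{i \in I} A_i} \Leftrightarrow x \notin \bigcup_{i \in I} A_i \Leftrightarrow x \notin A_i$, for all $i \in I \Leftrightarrow x \in \overline{A_i}$, for all $i \in I \Leftrightarrow
x \in \bigcap_{i \in I} \overline{A_i}$.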
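EXERCISES 3.2
1. For $\mathcal{U} = \{1, 2, 3, \ldots, 9, 10\}$ let $A = \{1, 2, 3, 4, 5\}$,
$B = \{1, 2, 4, 8\}$, $C = \{1, 2, 3, 5, 7\}$, and $D = \{2, 4, 6, 8\}$.
Determine each of the following:
a) $(A \cup B) \cap C$    b) $A \cup (B \cap C)$
c) $\overline{C} \cup \overline{D}$    d) $\overline{C \cap D}$
e) $(A \cup B) - C$    f) $A \cup (B - C)$
g) $(B - C) - D$    h) $B - (C - D)$
i) $(A \cup B) - (C \cap D)$
2. If $A = [0, 3]$, $B = [2, 7)$, with $\mathcal{U} = \mathbf{R}$, determine each of
the following:
a) $A \cap B$    b) $A \cup B$
c) $\overline{A}$    d) $A \triangle B$
e) $A - B$    f) $B - A$
3. a) Determine the sets $A$, $B$ where $A - B = \{1, 3, 7, 11\}$,
$B - A = \{2, 6, 8\}$, and $A \cap B = \{4, 9\}$.
b) Determine the sets $C$, $D$ where $C - D = \{1, 2, 4\}$,
$D - C = \{7, 8\}$, and $C \cup D = \{1, 2, 4, 5, 7, 8, 9\}$.
4. Let $A, B, C, D, E \subseteq \mathbf{Z}$ be defined as follows:
$A = \{2n \mid n \in \mathbf{Z}\}$, that is, $A$ is the set of all (integer)
multiples of 2;
$B = \{3n \mid n \in \mathbf{Z}\}$; $C = \{4n \mid n \in \mathbf{Z}\}$;
$D = \{6n \mid n \in \mathbf{Z}\}$; and $E = \{8n \mid n \in \mathbf{Z}\}$.
a) Which of the following statements are true and which
are false?
i) $E \subseteq C \subseteq A$    ii) $A \subseteq C \subseteq E$
iii) $B \subseteq D$    iv) $D \subseteq B$
v) $D \subseteq A$    vi) $\overline{D} \subseteq \overline{A}$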
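b) Determine each of the following sets.
i) $C \cap E$    ii) $B \cup D$    iii) $A \cap B$
iv) $B \cap D$    v) $\overline{A}$    vi) $A \cap E$
5. Determine which of the following statements are true and
which are false.
a) $\mathbf{Z}^+ \subseteq \mathbf{Q}^+$    b) $\mathbf{Z}^+ \subseteq \mathbf{Q}$
c) $\mathbf{Q}^+ \subseteq \mathbf{R}$    d) $\mathbf{R}^+ \subseteq \mathbf{Q}$
e) $\mathbf{Q}^+ \cap \mathbf{R}^+ = \mathbf{Q}^+$    f) $\mathbf{Z}^+ \cup \mathbf{R}^+ = \mathbf{R}^+$
g) $\mathbf{R}^+ \cap \mathbf{C} = \mathbf{R}^+$    h) $\mathbf{C} \cup \mathbf{R} = \mathbf{R}$
i) $\mathbf{Q} \cap \mathbf{Z} = \mathbf{Z}$
6. Prove each of the following results without using Venn
diagrams or membership tables. (Assume a universe $\mathcal{U}$.)
a) If $A \subseteq B$ and $C \subseteq D$, then $A \cap C \subseteq B \cap D$ and
$A \cup C \subseteq B \cup D$.
b) $A \subseteq B$ if and only if $A \cap \overline{B} = \emptyset$.
c) $A \subseteq B$ if and only if $\overline{A} \cup B = \mathcal{U}$.
7. Prove or disprove each of the following:
a) For sets $A, B, C \subseteq \mathcal{U}$, $A \cap C = B \cap C \Rightarrow A = B$.
b) For sets $A, B, C \subseteq \mathcal{U}$, $A \cup C = B \cup C \Rightarrow A = B$.
c) For sets $A, B, C \subseteq \mathcal{U}$,
$[(A \cap C = B \cap C) \land (A \cup C = B \cup C)] \Rightarrow A = B$.
d) For sets $A, B, C \subseteq \mathcal{U}$, $A \triangle C = B \triangle C \Rightarrow A = B$.
8. Using Venn diagrams, investigate the truth or falsity of each
of the following, for sets $A, B, C \subseteq \mathcal{U}$.
a) $A \triangle (B \cap C) = (A \triangle B) \cap (A \triangle C)$
b) $A - (B \cup C) = (A - B) \cap (A - C)$
c) $A \triangle (B \triangle C) = (A \triangle B) \triangle C$
9. If $A = \{a, b, d\}$, $B = \{d, x, y\}$, and $C = \{x, z\}$, how many
proper subsets are there for the set $(A \cap B) \cup C$? How many
for the set $A \cap (B \cup C)$?
10. For a given universal set $\mathcal{U}$, each subset $A$ of $\mathcal{U}$ satisfies
the idempotent laws of union and intersection. (a) Are there any
real numbers that satisfy an idempotent property for addition?
(That is, can we find any real number(s) $x$ such that $x + x = x$?)
(b) Answer part (a) upon replacing addition by multiplication.
11. Write the dual statement for each of the following
set-theoretic results.
a) $\mathcal{U} = (A \cap B) \cup (A \cap \overline{B}) \cup (\overline{A} \cap B) \cup (\overline{A} \cap \overline{B})$
b) $A = A \cap (A \cup B)$
c) $A \cup B = (A \cap B) \cup (A \cap \overline{B}) \cup (\overline{A} \cap B)$
d) $A = (A \cup B) \cap (A \cup \emptyset)$
12. Let $A, B \subseteq \mathcal{U}$. Use the equivalence $A \subseteq B \Leftrightarrow A \cap B = A$
to show that the dual statement of $A \subseteq B$ is the statement $B \subseteq A$.
13. Prove or disprove each of the following for sets $A, B \subseteq \mathcal{U}$.
a) $\mathcal{P}(A \cup B) = \mathcal{P}(A) \cup \mathcal{P}(B)$
b) $\mathcal{P}(A \cap B) = \mathcal{P}(A) \cap \mathcal{P}(B)$
14. Use membership tables to establish each of the following:
a) $\overline{A \cap B} = \overline{A} \cup \overline{B}$    b) $A \cup A = A$
c) $A \cup (A \cap B) = A$
d) $\overline{(A \cap B) \cup (\overline{A} \cap C)} = (A \cap \overline{B}) \cup (\overline{A} \cap \overline{C})$
15. a) How many rows are needed to construct the membership
table for $A \cap (B \cup C) \cap (D \cup E \cup F)$?
b) How many rows are needed to construct the membership
table for a set made up from the sets $A_1, A_2, \ldots, A_n$,
using $\cap$, $\cup$, and complement?
c) Given the membership tables for two sets $A$, $B$, how
can the relation $A \subseteq B$ be recognized?
d) Use membership tables to determine whether or not
$(A \cap B) \cup \overline{(B \cap C)} \supseteq A \cup \overline{B}$.
16. Provide the justifications (selected from the laws of set
theory) for the steps that are needed to simplify the set
$(A \cap B) \cup [B \cap ((C \cap D) \cup (C \cap \overline{D}))]$,
where $A, B, C, D \subseteq \mathcal{U}$.
Steps    Reasons
$(A \cap B) \cup [B \cap ((C \cap D) \cup (C \cap \overline{D}))]$
$= (A \cap B) \cup [B \cap (C \cap (D \cup \overline{D}))]$
$= (A \cap B) \cup [B \cap (C \cap \mathcal{U})]$
$= (A \cap B) \cup (B \cap C)$
$= (B \cap A) \cup (B \cap C)$
$= B \cap (A \cup C)$
17. Using the laws of set theory, simplify each of the following:
a) $A \cap (B - A)$
b) $(A \cap B) \cup (A \cap B \cap \overline{C} \cap D) \cup (\overline{A} \cap B)$
c) $(A - B) \cup (A \cap B)$
d) $\overline{A} \cup \overline{B} \cup (A \cap B \cap \overline{C})$
18. For each $n \in \mathbf{Z}^+$ let $A_n = \{1, 2, 3, \ldots, n - 1, n\}$. (Here
$\mathcal{U} = \mathbf{Z}^+$ and the index set $I = \mathbf{Z}^+$.) Determine
$\bigcup_{n=1}^{7} A_n$, $\bigcap_{n=1}^{11} A_n$, $\bigcup_{n=1}^{m} A_n$, and $\bigcap_{n=1}^{m} A_n$,
where $m$ is a fixed positive integer.
19. Let $\mathcal{U} = \mathbf{R}$ and let $I = \mathbf{Z}^+$. For each $n \in \mathbf{Z}^+$ let $A_n =
[-2n, 3n]$. Determine each of the following:
a) $A_3$    b) $A_4$
c) $A_3 - A_4$    d) $A_3 \triangle A_4$
e) $\bigcup_{n=1}^{7} A_n$    f) $\bigcap_{n=1}^{7} A_n$
g) $\bigcup_{n \in \mathbf{Z}^+} A_n$    h) $\bigcap_{n=1}^{\infty} A_n$
20. Provide the details for the proof of Theorem 3.6(b).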
3.3
Counting and Venn Diagrams
With all of the theoretical work and theorem proving we did in the last section, now is a
good time to examine some additional counting problems.
For sets $A$, $B$ from a finite universe $\mathcal{U}$, the following Venn diagrams will help us obtain
counting formulas for $|\overline{A}|$ and $|A \cup B|$ in terms of $|\mathcal{U}|$, $|A|$, $|B|$, and $|A \cap B|$.
As Fig. 3.9 demonstrates, $A \cup \overline{A} = \mathcal{U}$ and $A \cap \overline{A} = \emptyset$, so by the rule of sum, $|A| + |\overline{A}| =
|\mathcal{U}|$ or $|\overline{A}| = |\mathcal{U}| - |A|$. The sets $A$, $B$, in Fig. 3.10, have empty intersection, so here the
rule of sum leads us to $|A \cup B| = |A| + |B|$ and necessitates that $A$, $B$ be finite but does
not require any condition on the cardinality of $\mathcal{U}$.
Figure 3.9    Figure 3.10
Turning to the case where A, B are not disjoint, we motivate the formula for |A U B|
with the following example.
EXAMPLE 3.25 In a class of 50 college freshmen, 30 are studying C++, 25 are studying Java, and 10 are
studying both languages. How many freshmen are studying either computer language?
We let $\mathcal{U}$ be the class of 50 freshmen, $A$ the subset of those students studying C++, and
$B$ the subset of those studying Java. To answer the question, we need $|A \cup B|$. In Fig. 3.11
the numbers in the regions are obtained from the given information: $|A| = 30$, $|B| = 25$,
$|A \cap B| = 10$. Consequently, $|A \cup B| = 45 \ne 55 = 30 + 25 = |A| + |B|$, because $|A| +
|B|$ counts the students in $A \cap B$ twice. To remedy this overcount, we subtract $|A \cap B|$
from $|A| + |B|$ to obtain the correct formula: $|A \cup B| = |A| + |B| - |A \cap B|$.
Figure 3.11
If $A$ and $B$ are finite sets, then $|A \cup B| = |A| + |B| - |A \cap B|$. Consequently, finite
sets $A$ and $B$ are (mutually) disjoint if and only if $|A \cup B| = |A| + |B|$.
In addition, when $\mathcal{U}$ is finite, from DeMorgan’s Law we have $|\overline{A} \cap \overline{B}| = |\overline{A \cup B}| =
|\mathcal{U}| - |A \cup B| = |\mathcal{U}| - |A| - |B| + |A \cap B|$.
This situation extends to three sets, as the following example illustrates.
EXAMPLE 3.26 An AND gate in an ASIC (Application Specific Integrated Circuit) has two inputs: $I_1$, $I_2$,
and one output: $O$. (See Fig. 3.12.) Such an AND gate can have any or all of the following
defects:
$D_1$: The input $I_1$ is stuck at 0.
$D_2$: The input $I_2$ is stuck at 0.
$D_3$: The output $O$ is stuck at 1.
Figure 3.12    Figure 3.13
For a sample of 100 such gates we let $A$, $B$, and $C$ be the subsets (of these 100 gates) having
defects $D_1$, $D_2$, and $D_3$, respectively. With $|A| = 23$, $|B| = 26$, $|C| = 30$, $|A \cap B| =
7$, $|A \cap C| = 8$, $|B \cap C| = 10$, and $|A \cap B \cap C| = 3$, how many gates in the sample have
at least one of the defects $D_1$, $D_2$, $D_3$?
Working backward from $|A \cap B \cap C| = 3$ to $|A| = 23$, we label the regions as shown in
Fig. 3.13 and find that $|A \cup B \cup C| = |A| + |B| + |C| - |A \cap B| - |A \cap C| - |B \cap C| +
|A \cap B \cap C| = 23 + 26 + 30 - 7 - 8 - 10 + 3 = 57$. Thus the sample contains 57 AND
gates with at least one of the defects and $100 - 57 = 43$ AND gates with no defect.
If $A$, $B$, $C$ are finite sets, then $|A \cup B \cup C| = |A| + |B| + |C| - |A \cap B| - |A \cap C| -
|B \cap C| + |A \cap B \cap C|$.
From the formula for $|A \cup B \cup C|$ and DeMorgan’s Law, we find that if the universe
$\mathcal{U}$ is finite, then $|\overline{A} \cap \overline{B} \cap \overline{C}| = |\overline{A \cup B \cup C}|
= |\mathcal{U}| - |A \cup B \cup C| = |\mathcal{U}| - |A| -
|B| - |C| + |A \cap B| + |A \cap C| + |B \cap C| - |A \cap B \cap C|$.
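The three-set formula is simple enough to write as a function. The sketch below applies it to the counts of Example 3.26; the function name `union_size` and its parameter names are hypothetical.

```python
def union_size(a, b, c, ab, ac, bc, abc):
    """|A ∪ B ∪ C| from the sizes of A, B, C and of their intersections."""
    return a + b + c - ab - ac - bc + abc


# Counts from Example 3.26 (a sample of 100 AND gates).
with_defect = union_size(23, 26, 30, 7, 8, 10, 3)
without_defect = 100 - with_defect

print(with_defect, without_defect)
```

The second value is the count of the complement, obtained exactly as in the last formula above by subtracting from the size of the universe.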
We close this section with a problem that uses this last result.
EXAMPLE 3.27 A student visits an arcade each day after school and plays one game of either Laser Man,
Millipede, or Space Conquerors. In how many ways can he play one game each day so that
he plays each of the three types at least once during a given school week?
Here there is a slight twist. The set $\mathcal{U}$ consists of all arrangements of size 5 taken from
the set of three games, with repetitions allowed. The set $A$ represents the subset of all
sequences of five games played during the week without playing Laser Man. The sets $B$
and $C$ are defined similarly, leaving out Millipede and Space Conquerors, respectively.
The enumeration techniques of Chapter 1 give $|\mathcal{U}| = 3^5$, $|A| = |B| = |C| = 2^5$, $|A \cap B| =
|A \cap C| = |B \cap C| = 1^5 = 1$ and $|A \cap B \cap C| = 0$, so by the preceding formula there are
$|\overline{A} \cap \overline{B} \cap \overline{C}| = 3^5 - 3 \cdot 2^5 + 3 \cdot 1^5 - 0 = 150$ ways the student can select his daily games
during a school week and play each type of game at least once.
This example can be expressed in an equivalent distribution form, since we are seeking
the number of ways to distribute five distinct objects (Monday, Tuesday, . .., Friday) among
three distinct containers (the computer games) with no container left empty. More will be
said about this in Chapter 5.
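Because the universe in Example 3.27 has only $3^5 = 243$ elements, the count can also be confirmed by listing every possible week. The sketch below does this by brute force; the names `GAMES`, `weeks`, and `all_three` are ours.

```python
from itertools import product

GAMES = ("Laser Man", "Millipede", "Space Conquerors")

# Every way to choose one game for each of the five school days.
weeks = list(product(GAMES, repeat=5))

# Keep the weeks in which each of the three games appears at least once.
all_three = [w for w in weeks if set(w) == set(GAMES)]

# The inclusion-exclusion count from the example.
formula = 3**5 - 3 * 2**5 + 3 * 1**5 - 0

print(len(weeks), len(all_three), formula)
```

Enumeration of this kind stops being practical quickly as the number of days or games grows, whereas the formula does not.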
EXERCISES 3.3
1. During freshman orientation at a small liberal arts college,
two showings of the latest James Bond movie were presented.
Among the 600 freshmen, 80 attended the first showing and 125
attended the second showing, while 450 didn’t make it to either
showing. How many of the 600 freshmen attended twice?
2. A manufacturer of 2000 automobile batteries is concerned
about defective terminals and defective plates. If 1920 of her
batteries have neither defect, 60 have defective plates, and 20
have both defects, how many batteries have defective terminals?
3. A binary string of length 12 is made up of 12 bits (that is,
12 symbols, each of which is a 0 or a 1). How many such strings
either start with three 1’s or end in four 0’s?
4. Determine $|A \cup B \cup C|$ when $|A| = 50$, $|B| = 500$, and
$|C| = 5000$, if (a) $A \subseteq B \subseteq C$; (b) $A \cap B = A \cap C = B \cap C =
\emptyset$; and (c) $|A \cap B| = |A \cap C| = |B \cap C| = 3$ and
$|A \cap B \cap C| = 1$.
5. How many permutations of the digits 0, 1, 2, ..., 9 either
start with a 3 or end with a 7?
6. A professor has two dozen introductory textbooks on computer
science and is concerned about their coverage of the topics
(A) compilers, (B) data structures, and (C) operating systems.
The following data are the numbers of books that contain material
on these topics:
$|A| = 8$    $|B| = 13$    $|C| = 13$
$|A \cap B| = 5$    $|A \cap C| = 3$    $|B \cap C| = 6$
$|A \cap B \cap C| = 2$
(a) How many of the textbooks include material on exactly one
of these topics? (b) How many do not deal with any of the
topics? (c) How many have no material on compilers?
7. How many permutations of the 26 different letters of the
alphabet contain (a) either the pattern "OUT" or the pattern
"DIG"? (b) neither the pattern "MAN" nor the pattern "ANT"?
8. A six-character variable name in a certain version of ANSI
FORTRAN starts with a letter of the alphabet. Each of the other
five characters can be either a letter or a digit. (Repetitions are
allowed.) How many six-character variable names contain the
pattern "FUN" or the pattern "TIP"?
9. How many arrangements of the letters in MISCELLANEOUS
have no pair of consecutive identical letters?
10. How many arrangements of the letters in CHEMIST have
H before E, or E before T, or T before M? (Here "before" means
anywhere before, not just immediately before.)
3.4
A First Word on Probability
When one performs an experiment such as tossing a single fair coin, rolling a single fair
die, or selecting two students at random from a class of 20 to work on a project, a set of all
possible outcomes for each situation is called a sample space. Consequently, {H, T} serves
as a sample space for the first experiment mentioned and {1, 2, 3, 4, 5, 6} is a sample space
for the roll of a single fair die. Moreover, $\{\{a_i, a_j\} \mid 1 \le i \le 20, 1 \le j \le 20, i \ne j\}$ can be
used for the last experiment, with $a_i$ denoting the $i$th student, for each $1 \le i \le 20$.
In dealing with the sample space $\mathcal{S} = \{1, 2, 3, 4, 5, 6\}$ for the roll of a single fair die, we
feel that each of the six possible outcomes has the same, or equal, likelihood of occurrence.
Using this assumption of equal likelihood, we shall start our study of probability theory with
a definition for probability that was first given by the French mathematician Pierre-Simon
de Laplace (1749-1827) in his Analytic Theory of Probability.
Under the assumption of equal likelihood, let $\mathcal{S}$ be the sample space for an experiment
$\mathcal{E}$. Each subset $A$ of $\mathcal{S}$, including the empty subset, is called an event. Each element of
$\mathcal{S}$ determines an outcome, so if $|\mathcal{S}| = n$ and $a \in \mathcal{S}$, $A \subseteq \mathcal{S}$, then
$Pr(\{a\})$ = The probability that $\{a\}$ (or, $a$) occurs $= \frac{|\{a\}|}{|\mathcal{S}|} = \frac{1}{n}$, and
$Pr(A)$ = The probability that $A$ occurs $= \frac{|A|}{|\mathcal{S}|} = \frac{|A|}{n}$.
[Note: We often write $Pr(a)$ for $Pr(\{a\})$.]
We demonstrate these ideas in the following four examples.
EXAMPLE 3.28 When Daphne tosses a fair coin, what is the probability she gets a head? Here the sample
space $\mathcal{S} = \{H, T\}$ with $A = \{H\}$ and we find that
$Pr(A) = \frac{|A|}{|\mathcal{S}|} = \frac{1}{2}$.
EXAMPLE 3.29 If Dillon rolls a fair die, what is the probability he gets (a) a 5 or a 6, (b) an even number?
For either part the sample space $\mathcal{S} = \{1, 2, 3, 4, 5, 6\}$. In part (a) we have event $A = \{5, 6\}$
and $Pr(A) = \frac{|A|}{|\mathcal{S}|} = \frac{2}{6} = \frac{1}{3}$. For part (b) we consider event $B = \{2, 4, 6\}$ and find that
$Pr(B) = \frac{|B|}{|\mathcal{S}|} = \frac{3}{6} = \frac{1}{2}$.
Furthermore we also notice here that
i) $Pr(\mathcal{S}) = \frac{|\mathcal{S}|}{|\mathcal{S}|} = \frac{6}{6} = 1$ (after all, the occurrence of the event $\mathcal{S}$ is a certainty); and
EXAMPLE 3.30 There are 20 students enrolled in Mrs. Arnold’s fourth-grade class. Hence, if she wants to
select two of her students, at random, to take care of the class rabbit, she may make her
selection in $\binom{20}{2} = 190$ ways, so $|\mathcal{S}| = 190$.
Now suppose that Kyle and Kody are two of the 20 students in the class and we let $A$
be the event that Kyle is one of the students selected and $B$ be the event that the selection
includes Kody. Consequently, upon choosing the students, at random, the probability that
Mrs. Arnold selects
a) both Kyle and Kody is $Pr(A \cap B) = \binom{2}{2} / \binom{20}{2} = 1/190$;
b) neither Kyle nor Kody is $Pr(\overline{A} \cap \overline{B}) = \binom{18}{2} / \binom{20}{2} = 153/190$;
c) Kyle but not Kody is $Pr(A \cap \overline{B}) = \binom{1}{1}\binom{18}{1} / \binom{20}{2} = 18/190 = 9/95$.
EXAMPLE 3.31 Consider drawing five cards from a standard deck of 52 cards. This can be done in $\binom{52}{5} =
2{,}598{,}960$ ways. Now suppose that Tanya draws five cards, at random, from a standard
deck. What is the probability she gets (a) three aces and two jacks; (b) three aces and a pair;
(c) a full house (that is, three of one kind and a pair)?
In all three cases we have $|\mathcal{S}| = 2{,}598{,}960$.
a) There are $\binom{4}{3} = 4$ ways in which one can select three aces and $\binom{4}{2} = 6$ ways in which
two jacks may be selected. Consequently, if $A$ is the event where Tanya draws three
aces and two jacks, then $|A| = \binom{4}{3}\binom{4}{2} = 4 \cdot 6 = 24$ and $Pr(A) = 24/2{,}598{,}960 \approx
0.000009234$.
b) Once again there are $\binom{4}{3} = 4$ ways to select the aces, and there are $\binom{4}{2} = 6$ ways to
select a pair of deuces, or a pair of threes, ..., or a pair of tens, or a pair of jacks, ...,
or a pair of kings. So the pair can be selected in $\binom{12}{1}\binom{4}{2} = 12 \cdot 6 = 72$ ways. If $B$ is
the event where three aces and a pair are drawn, then $Pr(B) = (4 \cdot 72)/2{,}598{,}960 \approx
0.000110814$.
c) From part (b) we know there are $4 \cdot 72 = 288$ full houses with three aces. Likewise,
there are 288 full houses with three deuces, 288 with three threes, ..., and 288 with
three kings. So the probability that Tanya draws a full house is $\binom{13}{1}\binom{4}{3}\binom{12}{1}\binom{4}{2} / \binom{52}{5} =
3744/2{,}598{,}960 \approx 0.001440576$.
If these three probabilities appear on the slim side, consider the chances of Tanya drawing a
royal flush, that is, the ten, jack, queen, king, and ace of one given suit. For this five-card
hand the probability is only $4/\binom{52}{5} = 4/2{,}598{,}960 \approx 0.000001539$.
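The counts in Example 3.31 are products of binomial coefficients, so they can be recomputed with the standard-library function `math.comb`. The variable names in the sketch below are our own.

```python
from math import comb

hands = comb(52, 5)  # all five-card hands

three_aces_two_jacks = comb(4, 3) * comb(4, 2)
three_aces_and_pair = comb(4, 3) * comb(12, 1) * comb(4, 2)
full_house = comb(13, 1) * comb(4, 3) * comb(12, 1) * comb(4, 2)
royal_flush = 4  # one per suit

for name, count in [
    ("three aces, two jacks", three_aces_two_jacks),
    ("three aces and a pair", three_aces_and_pair),
    ("full house", full_house),
    ("royal flush", royal_flush),
]:
    print(f"{name}: {count} hands, probability {count / hands:.9f}")
```

Each probability is simply the size of the event divided by the size of the sample space, under the equal-likelihood assumption stated earlier in this section.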
To study some additional sample spaces we need to introduce the idea of the ordered
pair. This arises in the following structure.
Definition 3.11 For sets $A$, $B$, the Cartesian product, or cross product, of $A$ and $B$ is denoted by $A \times B$
and equals $\{(a, b) \mid a \in A, b \in B\}$.
We call the elements of $A \times B$ ordered pairs. For $(a, b), (c, d) \in A \times B$, we have
$(a, b) = (c, d)$ if and only if $a = c$ and $b = d$.†
EXAMPLE 3.32 If $A = \{1, 2, 3\}$ and $B = \{x, y\}$, then $A \times B = \{(1, x), (1, y), (2, x), (2, y), (3, x),
(3, y)\}$ while $B \times A = \{(x, 1), (x, 2), (x, 3), (y, 1), (y, 2), (y, 3)\}$. Here $(1, x) \in A \times B$
but $(1, x) \notin B \times A$, although $(x, 1) \in B \times A$. So $A \times B \ne B \times A$, but $|A \times B| = 6 =
2 \cdot 3 = |A||B| = |B||A| = |B \times A|$.
Now let us see how the Cartesian product can arise in a probability problem.
EXAMPLE 3.33 Suppose Concetta rolls two fair dice. This experiment can be decomposed as follows. Let $\mathcal{E}_1$
be the experiment where the first die is rolled, with sample space $\mathcal{S}_1 = \{1, 2, 3, 4, 5, 6\}$.
Likewise we let $\mathcal{E}_2$ account for the second die rolled, also with sample space $\mathcal{S}_2 =
\{1, 2, 3, 4, 5, 6\}$. (To keep the two dice distinct we can imagine the first die rolled with the
left hand and the second with the right. Or we can have the first die colored red and the
second green, in order to distinguish them.) Consequently, when Concetta rolls these dice
the sample space
$\mathcal{S} = \mathcal{S}_1 \times \mathcal{S}_2$ = {(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6),
(2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6),
(3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6),
(4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6),
(5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6),
(6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6)}
$= \{(x, y) \mid x, y = 1, 2, 3, 4, 5, 6\}$.
† More about ordered pairs and the Cartesian product is given in Section 5.1.
Now consider the following events:
A: Concetta rolls a 6 (that is, the top faces of the dice sum to 6);
B: The sum of the dice is at least 7;
C: Concetta rolls an even sum; and
D: The sum of the dice is 6 or less.
a) Here
i) $A$ = {(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)} with $Pr(A) = |A|/|\mathcal{S}| = 5/36$;
ii) $B$ = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1), (2, 6), (3, 5), (4, 4),
(5, 3), (6, 2), (3, 6), (4, 5), (5, 4), (6, 3), (4, 6), (5, 5), (6, 4), (5, 6),
(6, 5), (6, 6)} $= \{(x, y) \mid x, y = 1, 2, 3, 4, 5, 6;\ x + y \ge 7\}$ with
$Pr(B) = |B|/|\mathcal{S}| = 21/36 = 7/12$;
iii) $C$ = {(1, 1), (1, 3), (2, 2), (3, 1), (1, 5), (2, 4), (3, 3), (4, 2), (5, 1),
(2, 6), (3, 5), (4, 4), (5, 3), (6, 2), (4, 6), (5, 5), (6, 4), (6, 6)} with
$Pr(C) = |C|/|\mathcal{S}| = 18/36 = 1/2$; and
iv) $D$ = {(1, 1), (1, 2), (2, 1), (1, 3), (2, 2), (3, 1), (1, 4), (2, 3), (3, 2), (4, 1),
(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)} with $Pr(D) = |D|/|\mathcal{S}| = 15/36 = 5/12$.
b) We notice the following:
i) $A \cup B = \{(x, y) \mid x, y = 1, 2, 3, 4, 5, 6;\ x + y \ge 6\}$, so $|A \cup B| = 26$ and
$Pr(A \cup B) = |A \cup B|/|\mathcal{S}| = \frac{26}{36} = \frac{5}{36} + \frac{21}{36} = Pr(A) + Pr(B)$;
ii) $C \cup D$ = {(1, 1), (1, 2), (2, 1), (1, 3), (2, 2), (3, 1), (1, 4), (2, 3), (3, 2),
(4, 1), (1, 5), (2, 4), (3, 3), (4, 2), (5, 1), (2, 6), (3, 5), (4, 4), (5, 3),
(6, 2), (4, 6), (5, 5), (6, 4), (6, 6)} so $|C \cup D| = 24$ and $Pr(C \cup D) =
|C \cup D|/|\mathcal{S}| = 24/36 = 2/3$.
Here, however,
$Pr(C \cup D) = 24/36 \ne 33/36 = 18/36 + 15/36 = Pr(C) + Pr(D)$, although
$Pr(C \cup D) = 24/36 = 18/36 + 15/36 - 9/36 = Pr(C) + Pr(D) - Pr(C \cap D)$.
The result here and that in part (i) [of (b)] mirror the ideas we saw earlier in the formulas
following Example 3.25.
iii) Finally, $Pr(\overline{B}) = Pr(D) = 15/36 = 1 - 21/36 = 1 - Pr(B)$.
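The sample space for two dice is small enough to enumerate, so the probabilities in Example 3.33 can be checked with exact fractions. In the sketch below the helper `pr` and the use of `fractions.Fraction` are our own choices for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered pairs (first die, second die).
S = set(product(range(1, 7), repeat=2))


def pr(event):
    """Probability of an event under the equal-likelihood assumption."""
    return Fraction(len(event), len(S))


A = {p for p in S if sum(p) == 6}      # the dice sum to 6
B = {p for p in S if sum(p) >= 7}      # the sum is at least 7
C = {p for p in S if sum(p) % 2 == 0}  # the sum is even
D = {p for p in S if sum(p) <= 6}      # the sum is 6 or less

print(pr(A), pr(B), pr(C), pr(D))
print(pr(A | B) == pr(A) + pr(B))               # A and B are disjoint
print(pr(C | D) == pr(C) + pr(D) - pr(C & D))   # C and D overlap
```

The two comparisons on the last lines correspond to the counting formulas that follow Example 3.25, with probabilities in place of cardinalities.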
Let us consider a second example where the Cartesian product is used. This time we’ll
also learn about another important structure.
EXAMPLE 3.34 An experiment $\mathcal{E}$ is conducted as follows: A single die is rolled and its outcome noted, and
then a coin is flipped and its outcome noted. Determine a sample space $\mathcal{S}$ for $\mathcal{E}$.
Let $\mathcal{E}_1$ denote the first part of experiment $\mathcal{E}$, and let $\mathcal{S}_1 = \{1, 2, 3, 4, 5, 6\}$ be a sample
space for $\mathcal{E}_1$. Likewise let $\mathcal{S}_2 = \{H, T\}$ be a sample space for $\mathcal{E}_2$, the second part of the
experiment. Then $\mathcal{S} = \mathcal{S}_1 \times \mathcal{S}_2$ is a sample space for $\mathcal{E}$.
This sample space can be represented pictorially with a tree diagram that exhibits all
the possible outcomes of experiment $\mathcal{E}$. In Fig. 3.14 we have such a tree diagram, which
proceeds from left to right. From the left-most endpoint, six branches originate for the six
outcomes of the first stage of the experiment $\mathcal{E}$. From each point, numbered 1, 2, ..., 6,
two branches indicate the subsequent outcomes for tossing the coin. The 12 ordered pairs
at the right endpoints constitute the sample space $\mathcal{S}$.
Figure 3.14
Now for this experiment $\mathcal{E}$ consider the events
A: A head appears when the coin is tossed.
B: A 3 appears when the die is rolled.
Then $A$ = {(1, H), (2, H), (3, H), (4, H), (5, H), (6, H)} and $B$ = {(3, H), (3, T)}. So
$Pr(A) = |A|/|\mathcal{S}| = 6/12 = 1/2$, $Pr(B) = |B|/|\mathcal{S}| = 2/12 = 1/6$, and
$Pr(A \cup B) = \frac{7}{12} = \frac{6}{12} + \frac{2}{12} - \frac{1}{12} = Pr(A) + Pr(B) - Pr(A \cap B)$.
Before we continue let us look back at Examples 3.33 and 3.34. We may not realize
it, but we have been making a certain assumption. In Example 3.33 we assumed that the
outcome for the first die had no influence on the outcome for the second die. Likewise, in
Example 3.34 we assumed that the outcome for the die had no bearing on the outcome for
the coin. This concept of independence will be examined more closely in Section 3.6.
In our next example we extend the idea of the Cartesian (or, cross) product to more than
two sets.
EXAMPLE 3.35
If Charles tosses a fair coin four times, what is the probability that he gets two heads and
two tails?
Here the sample space for the first toss is $\mathcal{S}_1 = \{H, T\}$. Likewise, for the second, third, and
fourth tosses, we have $\mathcal{S}_2 = \mathcal{S}_3 = \mathcal{S}_4 = \{H, T\}$. So, for this experiment of tossing a fair coin
four times, we have the sample space $\mathcal{S} = \mathcal{S}_1 \times \mathcal{S}_2 \times \mathcal{S}_3 \times \mathcal{S}_4$, where a typical element of
$\mathcal{S}$ is an ordered quadruple. For example, one such ordered quadruple is (H, T, T, T) (which
may also be denoted HTTT). In this problem $|\mathcal{S}| = |\mathcal{S}_1||\mathcal{S}_2||\mathcal{S}_3||\mathcal{S}_4| = 2^4 = 16$. The event
A we are concerned about contains all arrangements of H, H, T, T, so $|A| = 4!/(2!\,2!) = 6$.
Consequently, $\Pr(A) = |A|/|\mathcal{S}| = 6/16 = 3/8$.
(Comparable to Examples 3.33 and 3.34, here the result of each toss is independent of
the outcome of any previous toss.)
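As a check on Example 3.35, the 16 ordered quadruples can be generated and the favorable ones counted. This is only a sketch under the example's assumption of a fair coin; the variable names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb

# All ordered quadruples over {H, T}: the four-fold Cartesian product.
tosses = list(product("HT", repeat=4))

# Event A: exactly two heads (and therefore two tails).
two_heads = [t for t in tosses if t.count("H") == 2]

probability = Fraction(len(two_heads), len(tosses))
print(len(tosses), len(two_heads), probability)  # 16 6 3/8
print(comb(4, 2))                                # 6, the same count as 4!/(2! 2!)
```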
The next example also requires some of the formulas developed in Chapter 1 for ar-
rangements.
EXAMPLE 3.36
The acronym WYSIWYG (for, What you see is what you get!) is used to describe a
user-interface. This user-interface presents material on a VDT (Video-display terminal) in
precisely the same format the material appears on hard copy.
There are 7!/(2!2!) = 1260 ways in which the letters in the acronym WYSIWYG can
be arranged. Of these, 120(= 5!) arrangements have both consecutive W’s and consecutive
Y’s. Consequently, if the letters for this acronym are arranged in a random manner, then we
find the probability that the arrangement has both consecutive W’s and consecutive Y’s is
120/1260 = 0.0952.
The probability that a random arrangement of these seven letters starts and ends with the
letter W is [(5!/2!)]/[(7!/(2! 2!))] = 60/1260 = 0.0476.
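The three counts in Example 3.36 are small enough to confirm by brute force. The sketch below generates every distinct arrangement of the seven letters (collapsing duplicates with a set) and filters; the names used are illustrative only.

```python
from fractions import Fraction
from itertools import permutations

# A set removes the repeats caused by the two W's and the two Y's.
arrangements = {"".join(p) for p in permutations("WYSIWYG")}

both_adjacent = {a for a in arrangements if "WW" in a and "YY" in a}
w_at_ends = {a for a in arrangements if a[0] == "W" and a[-1] == "W"}

print(len(arrangements), len(both_adjacent), len(w_at_ends))  # 1260 120 60
print(Fraction(len(both_adjacent), len(arrangements)))         # 2/21
print(Fraction(len(w_at_ends), len(arrangements)))             # 1/21
```

The fractions 2/21 and 1/21 are the exact values behind the decimals 0.0952 and 0.0476 quoted in the example.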
In our final example we shall use the concept of a Venn diagram.
EXAMPLE 3.37
In a survey of 120 passengers, an airline found that 48 enjoyed wine with their meals,
78 enjoyed mixed drinks, and 66 enjoyed iced tea. In addition, 36 enjoyed any given pair
of these beverages and 24 passengers enjoyed them all. If two passengers are selected at
random from the survey sample of 120, what is the probability that
a) (Event A) they both want only iced tea with their meals?
b) (Event B) they both enjoy exactly two of the three beverage offerings?
From the information provided, we construct the Venn diagram shown in Fig. 3.15. The
sample space $\mathcal{S}$ consists of the pairs of passengers we can select from the sample of 120, so
$|\mathcal{S}| = \binom{120}{2} = 7140$. The Venn diagram indicates that there are 18 passengers who drink only
iced tea, so $|A| = \binom{18}{2}$ and $\Pr(A) = 51/2380$. The reader should verify that $\Pr(B) = 3/34$.
Figure 3.15
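The region counts behind the Venn diagram of Example 3.37, and the verification left to the reader, can be sketched as follows. The variable names are our own; the arithmetic assumes, as the example states, that every pair of beverages is enjoyed by exactly 36 passengers.

```python
from fractions import Fraction
from math import comb

total = 120
tea = 66
pair = 36        # passengers who enjoy any given pair of beverages
all_three = 24   # passengers who enjoy all three

# Only iced tea: remove the two pairwise overlaps, then restore the triple overlap.
only_tea = tea - 2 * pair + all_three
# Exactly two beverages: each of the three pairwise regions minus the triple region.
exactly_two = 3 * (pair - all_three)

pairs_of_passengers = comb(total, 2)
pr_a = Fraction(comb(only_tea, 2), pairs_of_passengers)
pr_b = Fraction(comb(exactly_two, 2), pairs_of_passengers)

print(only_tea, exactly_two, pairs_of_passengers)  # 18 36 7140
print(pr_a, pr_b)                                  # 51/2380 3/34
```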
EXERCISES 3.4
1. The sample space for an experiment is $\mathcal{S} = \{a, b, c, d, e, f, g, h\}$, where each outcome is equally likely. If event A = {a, b, c} and event B = {a, c, e, g}, determine (a) $\Pr(A)$; (b) $\Pr(B)$; (c) $\Pr(A \cap B)$; (d) $\Pr(A \cup B)$; (e) $\Pr(\overline{A})$; (f) $\Pr(\overline{A} \cup B)$; and (g) $\Pr(A \cap \overline{B})$.
2. Joshua draws two ping-pong balls from a bowl of twenty ping-pong balls numbered 1 to 20. Provide a sample space for this experiment if
a) the first ball drawn is replaced before the second ball is drawn.
b) the first ball drawn is not replaced before the second ball is drawn.
3. A sample space $\mathcal{S}$ (for an experiment $\mathcal{E}$) contains 25 equally likely outcomes. If an event A (for this experiment $\mathcal{E}$) is such that Pr(A) = 0.24, how many outcomes are there in A?
4. A sample space $\mathcal{S}$ (for an experiment $\mathcal{E}$) contains n equally likely outcomes. If an event A (for this experiment $\mathcal{E}$) contains 7 of these outcomes and Pr(A) = 0.14, what is n?
5. The Tuesday night dance club is made up of six married couples and two of these twelve members must be chosen to find a dance hall for an upcoming fund raiser. (a) If the two members are selected at random, what is the probability they are both women? (b) If Joan and Douglas are one of the couples in the club, what is the probability at least one of them is among the two who are chosen?
6. If two integers are selected, at random and without replacement, from {1, 2, 3, ..., 99, 100}, what is the probability the integers are consecutive?
7. Two integers are selected, at random and without replacement, from {1, 2, 3, ..., 99, 100}. What is the probability their sum is even?
8. If three integers are selected, at random and without replacement, from {1, 2, 3, ..., 99, 100}, what is the probability their sum is even?
9. Jerry tosses a fair coin six times. What is the probability he gets (a) all heads; (b) one head; (c) two heads; (d) an even number of heads; and (e) at least four heads?
10. Twenty-five slips of paper, numbered 1, 2, 3, ..., 25, are placed in a box. If Amy draws six of these slips, without replacement, what is the probability that (a) the second smallest number drawn is 5? (b) the fourth largest number drawn is 15? (c) the second smallest number drawn is 5 and the fourth largest number drawn is 15?
11. Darci rolls a fair die three times. What is the probability that (a) her second and third rolls are both larger than her first roll? (b) the result of her second roll is greater than that of her first roll and the result of her third roll is greater than the second?
12. In selecting a new server for its computing center, a college examines 15 different models, paying attention to the following considerations: (A) cartridge tape drive, (B) DVD Burner, and (C) SCSI RAID Array (a type of failure-tolerant disk-storage device). The numbers of servers with any or all of these features are as follows: $|A| = |B| = |C| = 6$, $|A \cap B| = |B \cap C| = 1$, $|A \cap C| = 2$, $|A \cap B \cap C| = 0$. (a) How many of the models have exactly one of the features being considered? (b) How many have none of the features? (c) If a model is selected at random, what is the probability that it has exactly two of these features?
13. At the Gamma Kappa Phi sorority the 15 sisters who are seniors line up in a random manner for a graduation picture. Two of these sisters are Columba and Piret. What is the probability that this graduation picture will find (a) Piret at the center position in the line? (b) Piret and Columba standing next to each other? (c) exactly five sisters standing between Columba and Piret?
14. The freshman class of a private engineering college has 300 students. It is known that 180 can program in Java, 120 in Visual BASIC†, 30 in C++, 12 in Java and C++, 18 in Visual BASIC and C++, 12 in Java and Visual BASIC, and 6 in all three languages.
a) A student is selected at random. What is the probability that she can program in exactly two languages?
b) Two students are selected at random. What is the probability that they can (i) both program in Java? (ii) both program only in Java?
15. An integer is selected at random from 3 through 17 inclusive. If A is the event that a number divisible by 3 is chosen and B is the event that the number exceeds 10, determine $\Pr(A)$, $\Pr(B)$, $\Pr(A \cap B)$, and $\Pr(A \cup B)$. How is $\Pr(A \cup B)$ related to $\Pr(A)$, $\Pr(B)$, and $\Pr(A \cap B)$?
16. a) If the letters in the acronym WYSIWYG are arranged in a random manner, what is the probability the arrangement starts and ends with the same letter?
b) What is the probability that a randomly generated arrangement of the letters in WYSIWYG has no pair of consecutive identical letters?
†Visual BASIC is a trademark of the Microsoft Corporation.
3.5
The Axioms of Probability (Optional)
In Section 3.4 our typical experiment had a sample space where each outcome had the same
likelihood, or probability, of occurrence. If this does not happen, what do we do? Let us
start by considering the following examples.
EXAMPLE 3.38
Suppose Trudy tosses a single coin but it is not fair; for instance, suppose this coin is loaded
to come up heads twice as often as it comes up tails. Here the sample space $\mathcal{S} = \{H, T\}$,
as in Example 3.28, but unlike that example where $\Pr(H)^{*} = \Pr(T)$, in this situation
we have $\Pr(H) \neq \Pr(T)$. With H, T as the only outcomes, we have $1 = \Pr(\mathcal{S}) =
\Pr(\{H\} \cup \{T\}) = \Pr(H) + \Pr(T)$. Since $\Pr(H) = 2\Pr(T)$, it follows that $1 =
\Pr(H) + \Pr(T) = 2\Pr(T) + \Pr(T)$, so $\Pr(T) = 1/3$ and $\Pr(H) = 2/3$.
EXAMPLE 3.39
A warehouse contains 10 motors, three of which are defective (D). The other seven are
in good (G) working condition. A first inspector enters the warehouse and selects (and
inspects) one of the motors. For this experiment $\mathcal{E}_1$, we have the sample space $\mathcal{S}_1 = \{D, G\}$
where Pr(D) = 3/10 and Pr(G) = 7/10. The next day a second inspector enters this same
warehouse and selects (and inspects) a motor. For this second experiment, call it $\mathcal{E}_2$, we
likewise have $\mathcal{S}_2 = \{D, G\}$. But how do we define Pr(D), Pr(G) in this case? The answer
depends on whether the first motor selected remained in the warehouse, or was removed.
Figure 3.16 [Two tree diagrams for the successive selection of two motors: (a) Without Replacement; (b) With Replacement. Each path D or G, then D or G, is labeled with the product of its branch probabilities.]
The tree diagrams in Fig. 3.16 deal with the two possibilities. For part (a) of the figure
consider, for example, the case where the first motor selected is defective (D), with prob-
ability 3/10, and then the second motor selected is also defective (D). Since motors are
not replaced here, when selecting the second motor the inspector is dealing with nine mo-
tors — two defective (D) and seven in good (G) working condition. Hence the probability
of selecting a defective motor here is 2/9, not 3/10. So this situation, as shown by the top
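branching, has probability $\frac{3}{10} \cdot \frac{2}{9} = \binom{3}{2} / \binom{10}{2} = \frac{6}{90} = \frac{1}{15}$. The comparable case in part (b) of
the figure has probability $\frac{3}{10} \cdot \frac{3}{10} = \frac{9}{100}$.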
* Recall that when an event consists of a single outcome — say a, we may abbreviate Pr({a}) as Pr(a).
When selecting two motors, either with or without replacement, the sample space is
$\mathcal{S}$ = {DD, DG, GD, GG} where, for instance, DG is used to abbreviate (D, G). Yet in neither
situation do the outcomes have the same likelihood of occurrence. If the selections are done
without replacement [as in Fig. 3.16(a)], then $\Pr(DD) = \frac{6}{90}$, $\Pr(DG) = \frac{21}{90}$, $\Pr(GD) = \frac{21}{90}$,
$\Pr(GG) = \frac{42}{90}$, with $\frac{6}{90} + \frac{21}{90} + \frac{21}{90} + \frac{42}{90} = 1 = \Pr(\mathcal{S})$. When the first motor is replaced
[as in Fig. 3.16(b)], we have $\Pr(DD) = \frac{9}{100}$, $\Pr(DG) = \frac{21}{100}$, $\Pr(GD) = \frac{21}{100}$, $\Pr(GG) = \frac{49}{100}$,
with $\frac{9}{100} + \frac{21}{100} + \frac{21}{100} + \frac{49}{100} = 1 = \Pr(\mathcal{S})$.
From this point on we’ll deal exclusively with the case where the two selections are
made without replacement. Consider the following events:
A: One (that is exactly one) motor is defective: {DG, GD};
B: At least one motor is defective: {DG, GD, DD};
C: Both motors are defective: {DD};
E: Both motors are in good working condition: {GG}.
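Here
$$\Pr(A) = \frac{21}{90} + \frac{21}{90} = \frac{42}{90} = \frac{7}{15}, \qquad \Pr(B) = \frac{21}{90} + \frac{21}{90} + \frac{6}{90} = \frac{48}{90} = \frac{8}{15},$$
$$\Pr(C) = \frac{6}{90} = \frac{1}{15}, \qquad \Pr(E) = \frac{42}{90} = \frac{7}{15}.$$
Further, (i) $\overline{B} = E$ and $\Pr(\overline{B}) = \Pr(E) = \frac{7}{15} = 1 - \frac{8}{15} = 1 - \Pr(B)$; and (ii) $A \cup C = B$
with $A \cap C = \emptyset$, so $\Pr(A \cup C) = \Pr(B) = \frac{8}{15} = \frac{7}{15} + \frac{1}{15} = \Pr(A) + \Pr(C)$.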
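The two tree diagrams of Example 3.39 can be reproduced by multiplying branch probabilities. The sketch below is one way to do so; the function `two_selections` and the helper `pr` are hypothetical names introduced for illustration.

```python
from fractions import Fraction

def two_selections(defective, good, replace):
    """Return {outcome: probability} for two successive motor selections."""
    total = defective + good
    dist = {}
    for first, count1 in (("D", defective), ("G", good)):
        p_first = Fraction(count1, total)
        # Without replacement, the motor just selected leaves the warehouse.
        d_left = defective - (0 if replace or first != "D" else 1)
        g_left = good - (0 if replace or first != "G" else 1)
        for second, count2 in (("D", d_left), ("G", g_left)):
            dist[first + second] = p_first * Fraction(count2, d_left + g_left)
    return dist

def pr(event, dist):
    """Probability of an event as the sum of its outcome probabilities."""
    return sum((dist[outcome] for outcome in event), Fraction(0))

without = two_selections(3, 7, replace=False)
with_repl = two_selections(3, 7, replace=True)

A = {"DG", "GD"}          # exactly one defective
B = {"DG", "GD", "DD"}    # at least one defective
C = {"DD"}                # both defective
E = {"GG"}                # both good

print(without)                                            # DD 1/15, DG 7/30, GD 7/30, GG 7/15
print(pr(A, without), pr(B, without), pr(C, without), pr(E, without))  # 7/15 8/15 1/15 7/15
print(with_repl["DD"], with_repl["GG"])                   # 9/100 49/100
```

Note that the outcomes are not equally likely here, so an event's probability is obtained by adding the probabilities of its outcomes rather than by counting them.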
What we did in the latter part of Example 3.39 now motivates our next observation. This
observation extends our earlier results in Section 3.4 where each outcome of the sample
space had the same likelihood, or probability, of occurring.
Let $\mathcal{S}$ be the sample space for an experiment $\mathcal{E}$. Each element $a \in \mathcal{S}$ is called an outcome,
or elementary event, and we let $\Pr(\{a\}) = \Pr(a)$ denote the probability that this outcome
occurs. Each nonempty subset A of $\mathcal{S}$ is still called an event. If event $A = \{a_1, a_2, \ldots, a_n\}$,
where $a_i$ is an outcome, for all $1 \leq i \leq n$, then $\Pr(A) = \sum_{i=1}^{n} \Pr(a_i)$. (Note: When $A = \emptyset$
we assign Pr(A) = 0, a result we shall actually establish later in this section.)
However, before we get to our axioms of probability, there is a point that needs to be
clarified. We know that when a fair die is rolled, the sample space $\mathcal{S} = \{1, 2, 3, 4, 5, 6\}$,
where each outcome has the same likelihood, or probability, of occurrence — namely, 1/6.
However, if this die is rolled six times we should not expect to see one occurrence of each
of the possible outcomes 1, 2, ... , 6. Should this die be rolled 60 times we want each roll
(after the first) to be unaffected by any previous roll — that is, each roll (after the first) is
to be independent of any previous roll. Further, we cannot expect each of the six possible
outcomes to occur ten times. In fact, if the 1 comes up 20 times and this die is then rolled
60 more times we cannot expect to see 1 come up 20 times again. So what can we expect?
If, in rolling this fair die n times, the outcome of 1 occurs m times, then as n grows larger
we expect the relative frequency m/n to approach 1/6.
So far this discussion has dealt with a sample space where each outcome has the same
likelihood, or probability, of occurrence. However, the idea is still appropriate if we consider
any sample space — for example, the sample space of Example 3.38. Equally important is
how one can use the idea of relative frequency in modeling an experiment. For suppose we
have a coin that we believe to be biased— perhaps because it is heavier than other similar
coins that we have weighed. In tossing this coin the sample space is S = {H, T}, but how
can we determine Pr(H), Pr(T)? We might toss the coin n times, assuming the outcome
of each toss (after the first) is not affected by any previous outcome. If H comes up m
times, then we can assign Pr(H) = m/n and Pr(T) = (n — m)/n = 1 — (m/n), where the
accuracy of these assigned probabilities improves as n grows larger.
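The relative-frequency idea can be explored with a short simulation. The following is only an illustrative sketch: it uses Python's pseudorandom generator as a stand-in for a fair die, and the function name `relative_frequency` and the seed are assumptions of ours. The printed values depend on the generator, so none are quoted here.

```python
import random

def relative_frequency(n, seed=0):
    """Roll a simulated fair die n times; return m/n, where m counts the 1's."""
    rng = random.Random(seed)  # a fixed seed makes the run repeatable
    m = sum(1 for _ in range(n) if rng.randint(1, 6) == 1)
    return m / n

for n in (60, 600, 6000, 60000):
    print(n, relative_frequency(n), "target:", 1 / 6)
```

For small n the ratio m/n can be noticeably different from 1/6; as n grows, we expect it to settle near 1/6, which is the sense in which a probability is assigned from observed frequencies.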
Having addressed the issue of probabilities as relative frequencies, now it is time to
focus on the topic of this section — namely, the axioms of probability. One should find these
axioms rather intuitive, especially when we look back at some of the results in Example
3.29 and part (b) of Example 3.33. The axioms were first introduced in 1933 by Andrei
Kolmogorov and they apply to the case when the sample space $\mathcal{S}$ is finite.
The Axioms of Probability
Let $\mathcal{S}$ be the sample space for an experiment $\mathcal{E}$. If A, B are any events, that is,
$\emptyset \subseteq A, B \subseteq \mathcal{S}$ (so we now allow the empty set to be an event), then
1) $\Pr(A) \geq 0$
2) $\Pr(\mathcal{S}) = 1$
3) if A, B are disjoint (or, mutually disjoint) then $\Pr(A \cup B) = \Pr(A) + \Pr(B)$.†
Using these axioms we shall now establish a number of applicable results.
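THEOREM 3.7 The Rule of Complement. Let $\mathcal{S}$ be the sample space for an experiment $\mathcal{E}$. If A is an event
(that is, $A \subseteq \mathcal{S}$), then
$$\Pr(\overline{A}) = 1 - \Pr(A).$$
Proof: We know that $\mathcal{S} = A \cup \overline{A}$ with $A \cap \overline{A} = \emptyset$. So from axioms (2) and (3) it follows that
$1 = \Pr(\mathcal{S}) = \Pr(A \cup \overline{A}) = \Pr(A) + \Pr(\overline{A})$, and $\Pr(\overline{A}) = 1 - \Pr(A)$.
Note that when $A = \emptyset$ in Theorem 3.7 we have $1 = \Pr(\mathcal{S}) = \Pr(\overline{A}) = 1 - \Pr(A)$, so
$\Pr(\emptyset) = \Pr(A) = 0$, in agreement with our earlier assignment.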
The result of Theorem 3.7 can help cut down on our calculations in solving certain
probability problems. This is demonstrated in the next two examples.
EXAMPLE 3.40
Suppose the letters in the word PROBABILITY are arranged in a random manner.
Determine Pr(A) for the event
A: The arrangement begins with one letter and ends in a different letter.
†Although our major concern in this chapter (if not the entire text) deals with $\mathcal{S}$ finite, when $\mathcal{S}$ is infinite
Kolmogorov provided the fourth axiom:
4) if $A_1, A_2, A_3, \ldots$ are events (taken from $\mathcal{S}$) and $A_i \cap A_j = \emptyset$ for all $1 \leq i < j$, then
$$\Pr\left(\bigcup_{n=1}^{\infty} A_n\right) = \sum_{n=1}^{\infty} \Pr(A_n).$$
We consider four cases:
1) Start with the situation where neither B nor I appears at the start or finish. There
are seven remaining (distinct) letters. Any one of them can be used at the start of
the arrangement and there are six choices then for the last letter. For the nine letters
in between there are $\frac{9!}{2!\,2!}$ arrangements. So for this case there are $(7)\left(\frac{9!}{2!\,2!}\right)(6) =
3{,}810{,}240$ possibilities.
2) Now suppose that B is used as the first or last letter (but not in both positions) and I
only appears among the nine letters in the center. With one B so placed there are seven
other (distinct) letters that can be used at the opposite end of the arrangement. The
nine remaining letters in between can be arranged in $\frac{9!}{2!}$ ways, so this case accounts
for $(2)(7)\frac{9!}{2!} = 2{,}540{,}160$ arrangements.
3) If we use one of the I's and none of the B's to start or end an arrangement, then there
are again 2,540,160 arrangements, as we had in case (2).
4) Finally, if one of B, I is used at the start and the other letter at the end, we can arrange
the remaining nine letters in between in 9! ways. So here we have the final 2(9!) =
725,760 arrangements.
Here $|\mathcal{S}| = \frac{11!}{2!\,2!} = 9{,}979{,}200$, so $\Pr(A) = \frac{9{,}616{,}320}{9{,}979{,}200} = \frac{53}{55}$.
This result took quite a lot of calculations. So instead of the event A let us consider the
event $\overline{A}$, that is, the event where the arrangement begins and ends with the same letter.
How many such arrangements are there? Say we use the letter B at the start and finish of the
arrangement. Then the other nine letters in between can be arranged in $\frac{9!}{2!}$ ways. If I is used
in place of B another $\frac{9!}{2!}$ arrangements result. So $|\overline{A}| = 9!$ and $\Pr(\overline{A}) = \frac{9!}{11!/(2!\,2!)} = \frac{2}{55}$.
With much less effort Theorem 3.7 shows us that $\Pr(A) = 1 - \Pr(\overline{A}) = \frac{53}{55}$.
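Both routes in Example 3.40 can be checked without listing all 9,979,200 arrangements. The sketch below fixes each possible (first letter, last letter) pair and counts the arrangements of the nine letters left in the middle with the multinomial formula of Chapter 1; the helper name `multiset_permutations` is our own.

```python
from collections import Counter
from fractions import Fraction
from math import factorial

def multiset_permutations(counts):
    """Number of distinct arrangements of a multiset given its letter counts."""
    result = factorial(sum(counts.values()))
    for c in counts.values():
        result //= factorial(c)
    return result

letters = Counter("PROBABILITY")
total = multiset_permutations(letters)

same_ends = 0
different_ends = 0
for first in letters:
    for last in letters:
        remaining = letters.copy()
        remaining[first] -= 1
        remaining[last] -= 1
        if min(remaining.values()) < 0:
            continue  # this letter is not available for both ends
        middle = multiset_permutations(remaining)
        if first == last:
            same_ends += middle
        else:
            different_ends += middle

print(total, different_ends, same_ends)    # 9979200 9616320 362880
print(Fraction(different_ends, total))     # 53/55
print(1 - Fraction(same_ends, total))      # 53/55, by the Rule of Complement
```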
EXAMPLE 3.41
Due to an intense preseason workout schedule, Coach Davis has honed her volleyball
team into a major contender. Consequently, the probability her team will win any given
tournament is 0.7, regardless of any previous win or loss. Suppose the team is slated to play
eight tournaments.
a) The probability the women will win all eight tournaments is $(0.7)^8 = 0.057648$. Could
they possibly lose all eight tournaments? Yes, with probability $(0.3)^8 = 0.000066$.
b) What is the probability the team wins exactly five of the eight tournaments? One way
this can happen is if the team wins the first and second tournaments, loses the next three,
and then wins the last three. We represent this by WWLLLWWW. The probability for
this outcome is $(0.7)^2(0.3)^3(0.7)^3 = (0.7)^5(0.3)^3$. Another possibility that results in
five tournament wins can be represented by WWLLWWLW. The probability here is
$(0.7)^2(0.3)^2(0.7)^2(0.3)(0.7) = (0.7)^5(0.3)^3$. At this point we see that the probability
Coach Davis's team wins five of the eight tournaments is
(The number of arrangements of five W's and three L's) $\times\, (0.7)^5(0.3)^3$.
From the material in Sections 1.2 and 1.3, especially Example 1.22, we know that there
are $\frac{8!}{5!\,3!} = \binom{8}{5}$ ways to arrange five W's and three L's. Consequently, the probability
the team wins five tournaments is
$$\binom{8}{5}(0.7)^5(0.3)^3 = 0.254122.$$
c) Finally, what is the probability the team wins at least one tournament? Let us not do
here what we did in Example 3.40. If we let A be the given event, then $\Pr(A) =
\sum_{i=1}^{8} \binom{8}{i}(0.7)^i(0.3)^{8-i}$. But Pr(A) is more readily determined as $1 - \Pr(\overline{A})$, where
$\Pr(\overline{A})$ = the probability the team loses all eight tournaments = $(0.3)^8 = 0.000066$
[as in part (a)]. Consequently, $\Pr(A) = 1 - (0.3)^8 = 0.999934$.
Before we go on we want to examine the structure of the answer at the end of part (b)
of Example 3.41. Each tournament in the example results in either a win (success) or
loss (failure). Further, after the first tournament, the outcome of each later tournament is
independent of the outcome of any previous tournament. Such a two-outcome occurrence
is called a Bernoulli trial. If there are n such trials and each trial has probability p of
success and probability q (= 1 - p) of failure, then the probability that there are (exactly)
k successes among these n trials is
$$\binom{n}{k} p^k q^{n-k}, \qquad 0 \leq k \leq n.$$
(We shall come upon this idea again in Section 16.5 when we study the application of
Abelian groups in coding theory.)
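The Bernoulli-trial formula translates directly into a few lines of code. The following sketch uses floating-point arithmetic and the hypothetical function name `binomial_pmf`; it recomputes the values from parts (b) and (c) of Example 3.41.

```python
from math import comb

def binomial_pmf(n, k, p):
    """Probability of exactly k successes in n independent Bernoulli trials."""
    q = 1 - p
    return comb(n, k) * p**k * q**(n - k)

wins_exactly_five = binomial_pmf(8, 5, 0.7)
wins_at_least_one = 1 - binomial_pmf(8, 0, 0.7)

# Summing over k = 1, ..., 8 gives the same answer as the Rule of Complement.
direct_sum = sum(binomial_pmf(8, k, 0.7) for k in range(1, 9))

print(round(wins_exactly_five, 6))  # 0.254122
print(round(wins_at_least_one, 6))  # 0.999934
print(round(direct_sum, 6))         # 0.999934
```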
Returning now to the axioms of probability, we know from axiom (3) that, for $A, B \subseteq \mathcal{S}$,
if $A \cap B = \emptyset$ then $\Pr(A \cup B) = \Pr(A) + \Pr(B)$. But what can we say if $A \cap B \neq \emptyset$?
Figure 3.17
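For the Venn diagram in Fig. 3.17 the interior of the rectangle represents the universe,
here the sample space $\mathcal{S}$. The shaded region in the diagram denotes the event $A - B =
A \cap \overline{B}$. Further,
i) the events $A \cap \overline{B}$ and B are disjoint, since $(A \cap \overline{B}) \cap B = A \cap (\overline{B} \cap B) = A \cap \emptyset =
\emptyset$; and
ii) $(A \cap \overline{B}) \cup B = (A \cup B) \cap (\overline{B} \cup B) = (A \cup B) \cap \mathcal{S} = A \cup B$.
From these two observations and axiom (3) it follows that
(*) $\Pr(A \cup B) = \Pr((A \cap \overline{B}) \cup B) = \Pr(A \cap \overline{B}) + \Pr(B)$.
Next note that $A = A \cap \mathcal{S} = A \cap (B \cup \overline{B}) = (A \cap B) \cup (A \cap \overline{B})$ where $(A \cap B) \cap
(A \cap \overline{B}) = (A \cap A) \cap (B \cap \overline{B}) = A \cap \emptyset = \emptyset$. So once again axiom (3) gives us
$\Pr(A) = \Pr(A \cap B) + \Pr(A \cap \overline{B})$, or
(**) $\Pr(A \cap \overline{B}) = \Pr(A) - \Pr(A \cap B)$.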
The results in Eqs. (*) and (**) now establish the following.
THEOREM 3.8 The Additive Rule. If $\mathcal{S}$ is the sample space for an experiment $\mathcal{E}$, and $A, B \subseteq \mathcal{S}$, then
$$\Pr(A \cup B) = \Pr(A \cap \overline{B}) + \Pr(B) = \Pr(A) + \Pr(B) - \Pr(A \cap B).$$
At this point we use the result in Theorem 3.8 in the following two examples.
EXAMPLE 3.42
Yosi selects a card from a well-shuffled standard deck. What is the probability his card is a
club or a card whose face value is between 3 and 7 inclusive?
Start by defining the events A, B as follows:
A: The card drawn is a club.
B: The face value of the card drawn is between 3 and 7 inclusive.
The answer to the problem is Pr(A U B).
Here $\Pr(A) = 13/52$ and $\Pr(B) = 20/52$. Also $\Pr(A \cap B) = 5/52$, for the 3 of
clubs, 4 of clubs, ..., and 7 of clubs. Consequently, by Theorem 3.8, we have
$$\Pr(A \cup B) = \Pr(A) + \Pr(B) - \Pr(A \cap B) = \frac{13}{52} + \frac{20}{52} - \frac{5}{52} = \frac{28}{52} = \frac{7}{13}.$$
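Example 3.42 can also be confirmed by building the 52-card deck as a Cartesian product. This is a sketch only; the rank and suit labels below are our own encoding of a standard deck.

```python
from fractions import Fraction
from itertools import product

suits = ["clubs", "diamonds", "hearts", "spades"]
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
deck = set(product(ranks, suits))

def pr(event):
    """Probability of an event when each card is equally likely."""
    return Fraction(len(event), len(deck))

A = {card for card in deck if card[1] == "clubs"}
B = {card for card in deck if card[0] in {"3", "4", "5", "6", "7"}}

print(pr(A), pr(B), pr(A & B))      # 1/4 5/13 5/52
print(pr(A | B))                    # 7/13
print(pr(A) + pr(B) - pr(A & B))    # 7/13, the Additive Rule
```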
EXAMPLE 3.43
Diane inspects 120 cast aluminum rods and classifies the diameter and surface finish of
each rod as adequate or superior. Her findings are summarized in Table 3.4.
Table 3.4
                            Diameter
                      adequate    superior
Surface   adequate       10          18
Finish    superior       12          80
Define the events A, B as follows:
A: The diameter of the rod is classified as superior.
B: The surface finish of the rod is classified as superior.
Then
Pr(A) = (18 + 80)/120 = 98/120 = 49/60 = 0.816667
Pr(B) = (12 + 80)/120 = 92/120 = 23/30 = 0.766667
$\Pr(A \cap B) = 80/120 = 2/3 = 0.666667$.
By Theorem 3.8
$$\Pr(A \cup B) = \Pr(A) + \Pr(B) - \Pr(A \cap B) = \frac{98}{120} + \frac{92}{120} - \frac{80}{120} = \frac{110}{120} = \frac{11}{12} = 0.916667.$$
So 110 [= (11/12)(120)] of these 120 rods have either a superior diameter or a
superior surface finish, or perhaps both.
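In addition,
$\Pr(\overline{A})$ = the probability the diameter of the rod is classified as adequate $= \frac{10 + 12}{120} =
\frac{22}{120} = 1 - \frac{98}{120} = 1 - \Pr(A)$, and
$\Pr(\overline{B})$ = the probability the surface finish of the rod is classified as adequate $= \frac{10 + 18}{120} =
\frac{28}{120} = 1 - \frac{92}{120} = 1 - \Pr(B)$.
Using DeMorgan's Laws we also find that $\Pr(\overline{A} \cup \overline{B}) = \Pr(\overline{A \cap B}) = 1 - \Pr(A \cap B) =
1 - \frac{2}{3} = \frac{1}{3}$, and $\Pr(\overline{A} \cap \overline{B}) = \Pr(\overline{A \cup B}) = 1 - \Pr(A \cup B) = 1 - \frac{11}{12} = \frac{1}{12}$.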
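The computations for Table 3.4 can be organized around the four cell counts. In this sketch each key is a (diameter, surface finish) pair; the dictionary layout and the helper `pr` are illustrative assumptions, not part of the text.

```python
from fractions import Fraction

# counts[(diameter, surface_finish)] taken from Table 3.4
counts = {
    ("adequate", "adequate"): 10,
    ("superior", "adequate"): 18,
    ("adequate", "superior"): 12,
    ("superior", "superior"): 80,
}
total = sum(counts.values())

def pr(predicate):
    """Probability that a randomly chosen rod satisfies predicate(diameter, finish)."""
    favorable = sum(n for (d, f), n in counts.items() if predicate(d, f))
    return Fraction(favorable, total)

pr_a = pr(lambda d, f: d == "superior")                       # event A
pr_b = pr(lambda d, f: f == "superior")                       # event B
pr_a_and_b = pr(lambda d, f: d == "superior" and f == "superior")
pr_a_or_b = pr(lambda d, f: d == "superior" or f == "superior")
pr_neither = pr(lambda d, f: d == "adequate" and f == "adequate")

print(pr_a, pr_b, pr_a_and_b)   # 49/60 23/30 2/3
print(pr_a_or_b, pr_neither)    # 11/12 1/12
```

The last value agrees with DeMorgan's Laws: the rods that are adequate in both respects are exactly those outside $A \cup B$.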
Now we want to extend the result of Theorem 3.8 to more than two events. The following
theorem deals with three events and suggests the pattern for four or more.
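THEOREM 3.9 Let $\mathcal{S}$ be the sample space for an experiment $\mathcal{E}$. For events $A, B, C \subseteq \mathcal{S}$,
$$\Pr(A \cup B \cup C) = \Pr(A) + \Pr(B) + \Pr(C) - \Pr(A \cap B) - \Pr(A \cap C) - \Pr(B \cap C) + \Pr(A \cap B \cap C).$$
Proof: The Laws of Set Theory from Section 3.2 validate what follows:
$$\begin{aligned}
\Pr(A \cup B \cup C) &= \Pr((A \cup B) \cup C) = \Pr(A \cup B) + \Pr(C) - \Pr((A \cup B) \cap C) \\
&= \Pr(A) + \Pr(B) - \Pr(A \cap B) + \Pr(C) - \Pr((A \cap C) \cup (B \cap C)) \\
&= \Pr(A) + \Pr(B) + \Pr(C) - \Pr(A \cap B) \\
&\qquad - [\Pr(A \cap C) + \Pr(B \cap C) - \Pr((A \cap C) \cap (B \cap C))] \\
&= \Pr(A) + \Pr(B) + \Pr(C) - \Pr(A \cap B) \\
&\qquad - \Pr(A \cap C) - \Pr(B \cap C) + \Pr(A \cap B \cap C).
\end{aligned}$$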
Note that the last equality follows because $(A \cap C) \cap (B \cap C) = A \cap B \cap C$ by the
Associative, Commutative, and Idempotent Laws of Intersection. Also note the similarity
between the formula for $\Pr(A \cup B \cup C)$ and that for $|A \cup B \cup C|$ (given prior to
Example 3.27).
Further, we see that the formula for $\Pr(A \cup B \cup C)$ involves 7 ($= 2^3 - 1$) summands.
For four events we would have 15 ($= 2^4 - 1$) summands: (i) $4 = \binom{4}{1}$ summands, one
for each single event; (ii) $6 = \binom{4}{2}$ summands, one for each pair of events; (iii) $4 = \binom{4}{3}$
summands, one for each triple of events; and (iv) $1 = \binom{4}{4}$ summand for all four of
the events. When dealing with n events, $A_1, A_2, \ldots, A_n$, where $n \geq 2$, the formula for
$\Pr(A_1 \cup A_2 \cup \cdots \cup A_n)$ has a total of $\sum_{r=1}^{n} \binom{n}{r} = \sum_{r=0}^{n} \binom{n}{r} - \binom{n}{0} = 2^n - 1$ summands,
by Corollary 1.1. For $1 \leq r \leq n$, there are $\binom{n}{r}$ summands, one for each way we can select
r of the n events. Each of these summands is preceded by a plus sign, for r odd, or a minus
sign, for r even.
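The pattern just described, with one signed summand for each nonempty selection of events, can be sketched in code. The function `pr_union` below is a hypothetical helper that assumes equally likely outcomes, and the events used to exercise it (multiples of 2, 3, 5, and 7 among the integers 1 to 30) are our own illustration rather than an example from the text.

```python
from fractions import Fraction
from itertools import combinations

def pr_union(events, sample_space):
    """Return (Pr of the union, number of summands) using the signed-sum pattern."""
    total = Fraction(0)
    summands = 0
    for r in range(1, len(events) + 1):
        sign = 1 if r % 2 == 1 else -1   # plus for r odd, minus for r even
        for group in combinations(events, r):
            common = set.intersection(*group)
            total += sign * Fraction(len(common), len(sample_space))
            summands += 1
    return total, summands

sample_space = set(range(1, 31))
A = {n for n in sample_space if n % 2 == 0}
B = {n for n in sample_space if n % 3 == 0}
C = {n for n in sample_space if n % 5 == 0}
D = {n for n in sample_space if n % 7 == 0}

print(pr_union([A, B, C], sample_space))     # (Fraction(11, 15), 7)
print(pr_union([A, B, C, D], sample_space))  # 15 summands for four events
```

The first result can be compared with a direct count: 22 of the 30 integers are divisible by 2, 3, or 5, and 22/30 = 11/15.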
We’ll see more formulas like the one in Theorem 3.9 in Section 8.1. For now let us apply
the result of this theorem in the following example.
EXAMPLE 3.44
The game of Roulette is played by initially spinning a small white ball on a circular wheel that
is divided into 38 sections of equal area. These sections are labeled 00, 0, 1, 2, 3, ..., 36.
As the wheel slows down, the number of the section where the ball comes to rest is the
outcome for that one play of the game.
The numbers on the wheel are colored as follows.
Green:  00  0
Red:     1  3  5  7  9 12 14 16 18
        19 21 23 25 27 30 32 34 36
Black:   2  4  6  8 10 11 13 15 17
        20 22 24 26 28 29 31 33 35
A player may place bets in various ways, such as (i) odd, even (here 00 and 0 are considered
neither even nor odd); (ii) low (1-18), high (19-36); or (iii) red, black.
Gary enjoys Roulette and decides to place bets according to the events.
A: The outcome is low. B: The outcome is red. C: The outcome is odd.
What is the probability Gary wins at least one of his bets, that is, what is $\Pr(A \cup B \cup C)$?
Here $\Pr(A) = \Pr(B) = \Pr(C) = 18/38$, $\Pr(A \cap B) = \Pr(A \cap C) = 9/38$,
$\Pr(B \cap C) = 10/38$, $\Pr(A \cap B \cap C) = 5/38$, and by Theorem 3.9
$$\Pr(A \cup B \cup C) = \frac{18}{38} + \frac{18}{38} + \frac{18}{38} - \frac{9}{38} - \frac{9}{38} - \frac{10}{38} + \frac{5}{38} = \frac{31}{38} = 0.815789.$$
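The seven probabilities used in Example 3.44 can be read off from the color table by machine. In the sketch below the two green sections are encoded as the strings "00" and "0" and the numbered sections as integers; this encoding is an assumption of ours.

```python
from fractions import Fraction

sample_space = {"00", "0"} | set(range(1, 37))
red = {1, 3, 5, 7, 9, 12, 14, 16, 18, 19, 21, 23, 25, 27, 30, 32, 34, 36}

A = set(range(1, 19))                          # low
B = red                                        # red
C = {n for n in range(1, 37) if n % 2 == 1}    # odd (00 and 0 are neither)

def pr(event):
    """Probability of an event; the 38 sections have equal area."""
    return Fraction(len(event), len(sample_space))

by_theorem = (pr(A) + pr(B) + pr(C)
              - pr(A & B) - pr(A & C) - pr(B & C)
              + pr(A & B & C))

print(pr(A & B), pr(A & C), pr(B & C), pr(A & B & C))  # 9/38 9/38 5/19 5/38
print(by_theorem, pr(A | B | C))                        # 31/38 31/38
print(round(float(by_theorem), 6))                      # 0.815789
```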
In closing this section we need to make one more point. The examples we’ve seen here
and in the previous section have all dealt with finite sample spaces. Yet it is possible to have
situations where a sample space is infinite. For instance, suppose a man takes a driver’s test
until he passes it. If he passes the test on his first try, we write P for this outcome. Should
he need three attempts to pass the test, then we write FFP to denote the first and second
failures followed by his passing of the test. Hence the sample space may be given here as
$\mathcal{S}$ = {P, FP, FFP, FFFP, ...}, an example of a countably infinite† set.
When dealing with sample spaces that are finite or countably infinite, we call the sample
space discrete. The coverage here in Chapter 3 deals strictly with discrete sample spaces
that are finite. However, in Section 9.2, we’ll consider an example where the sample space
is countably infinite.
Finally, suppose an experiment calls for a technician to record the temperature, in degrees
Fahrenheit, of a heated iron rod. Theoretically, the sample space here could comprise an
open interval of real numbers; for instance, $\mathcal{S} = \{t \mid 180^{\circ}\mathrm{F} < t < 190^{\circ}\mathrm{F}\}$. Here the sample
space is again infinite, but this time it is uncountably‡ infinite. In this case the sample space
is called continuous and now one needs calculus to solve the related probability problems.
We will not pursue this here but will direct the interested reader to the chapter references —
especially, the text by J. J. Kinney [7].
†The interested reader can find more on countable sets in Appendix 3.
‡More on uncountable sets can be found in Appendix 3.
EXERCISES 3.5
1. Let $\mathcal{S}$ be the sample space for an experiment $\mathcal{E}$ and let A, B be events from $\mathcal{S}$, where Pr(A) = 0.4, Pr(B) = 0.3, and $\Pr(A \cap B) = 0.2$. Determine $\Pr(\overline{A})$, $\Pr(\overline{B})$, $\Pr(A \cup B)$, $\Pr(\overline{A \cup B})$, $\Pr(A \cap \overline{B})$, $\Pr(\overline{A} \cap B)$, $\Pr(A \cup \overline{B})$, and $\Pr(\overline{A} \cup B)$.
2. Ashley tosses a fair coin eight times. What is the probability she gets (a) six heads; (b) at least six heads; (c) two heads; and (d) at most two heads?
3. Ten ping-pong balls labeled 1 to 10 are placed in a box. Two of these balls are then drawn, in succession and without replacement, from the box.
a) Find the sample space for this experiment.
b) Find the probability that the label on the second ball drawn is smaller than the label on the first.
c) Find the probability that the label on one ball is even while the label on the other is odd.
4. Russell draws one card from a standard deck. If A, B, C denote the events
A: The card is a spade.
B: The card is red.
C: The card is a picture card (that is, a jack, queen, or king).
Find $\Pr(A \cup B \cup C)$.
5. Let $\mathcal{S}$ be the sample space for an experiment $\mathcal{E}$. If A, B are disjoint events from $\mathcal{S}$ with Pr(A) = 0.3 and $\Pr(A \cup B) = 0.7$, what is Pr(B)?
6. If $\mathcal{S}$ is the sample space for an experiment and $A, B \subseteq \mathcal{S}$, how is $\Pr(A \,\triangle\, B)$ related to $\Pr(A)$, $\Pr(B)$, and $\Pr(A \cap B)$? [Note: $\Pr(A \,\triangle\, B)$ is the probability that exactly one of the events A, B occurs.]
7. A die is loaded so that the probability a given number turns up is proportional to that number. So, for example, the outcome 4 is twice as likely as the outcome 2, and the outcome 3 is three times as likely as that of 1. If this die is rolled, what is the probability the outcome is (a) 5 or 6; (b) even; (c) odd?
8. Suppose we have two dice, each loaded as described in the previous exercise. If these dice are rolled, what is the probability the outcome is (a) 10; (b) at least 10; (c) a double?
9. Juan tosses a fair coin five times. What is the probability the number of heads always exceeds the number of tails as each outcome is observed?
10. Three types of foam are tested to see if they meet specifications. Table 3.5 summarizes the results for the 125 samples tested.
Table 3.5
                 Specifications Are Met
                     No       Yes
Foam     1            5        60
Type     2            7        30
         3            8        15
Let A, B denote the events
A: The sample has foam type 1.
B: The sample meets specifications.
Determine $\Pr(A)$, $\Pr(B)$, $\Pr(A \cap B)$, $\Pr(A \cup B)$, $\Pr(\overline{A})$, $\Pr(\overline{B})$, $\Pr(\overline{A \cup B})$, $\Pr(\overline{A \cap B})$, $\Pr(A \,\triangle\, B)$.
11. Consider the game of Roulette as described in Example 3.44.
a) If the game is played once, what is the probability the outcome is (i) high or odd; (ii) low or black?
b) If the game is played twice, what is the probability (i) both outcomes are black; (ii) one outcome is red and the other green?
12. Let $\mathcal{S}$ be the sample space for an experiment $\mathcal{E}$ and let $A, B \subseteq \mathcal{S}$. If $\Pr(A) = \Pr(B)$, $\Pr(A \cap B) = 1/5$, and $\Pr(\overline{A \cup B}) = 1/5$, determine $\Pr(A \cup B)$, $\Pr(A)$, $\Pr(A - B)$, $\Pr(A \,\triangle\, B)$.
13. The following data give the age and gender of 14 science professors at a small junior college.
25 M   39 F   27 F   53 M   36 F   37 F   30 M
29 F   32 M   31 M   38 F   26 M   24 F   40 F
One professor will be chosen at random to represent the faculty on the board of trustees. What is the probability that the professor chosen is a man or over 35?
14. The nine members of a coed intramural volleyball team are to be randomly selected from nine college men and ten college women. To be classified as coed the team must include at least one player of each gender. What is the probability the selected team includes more women than men?
15. While traveling through Pennsylvania, Ann decides to buy a lottery ticket for which she selects seven integers from 1 to 80 inclusive. The state lottery commission then selects 11 of these 80 integers. If Ann's selection matches seven of these 11 integers she is a winner. What is the probability Ann is a winner?
16. Let $\mathcal{S}$ be the sample space for an experiment $\mathcal{E}$ and let $A, B \subseteq \mathcal{S}$ with $A \subseteq B$. Prove that $\Pr(A) \leq \Pr(B)$.
17. Let $\mathcal{S}$ be the sample space for an experiment $\mathcal{E}$, and let $A, B \subseteq \mathcal{S}$. If Pr(A) = 0.7 and Pr(B) = 0.5, prove that $\Pr(A \cap B) \geq 0.2$.
3.6
Conditional Probability: Independence
(Optional)
Throughout Sections 3.4 and 3.5 (especially prior to and at the end of Example 3.35, as
well as in and after Example 3.41) we mentioned the idea of the independence of outcomes.
There we questioned whether the occurrence of a certain outcome might somehow affect the
occurrence of another outcome. In this section we extend this idea from a single outcome to
an event and make it more mathematically precise. To do so we proceed with the following.
EXAMPLE 3.45
Vincent rolls a pair of fair dice. The sample space $\mathcal{S}$ for this experiment is shown in Fig. 3.18,
along with the events
A: The sum (on the faces) is at least 9.
B: A double is rolled.
Figure 3.18 [The 36 ordered pairs (1, 1), (1, 2), ..., (6, 6) of the sample space, with the outcomes belonging to events A and B marked.]
We see that $Pr(A) = \frac{10}{36} = \frac{5}{18}$, $Pr(B) = \frac{6}{36} = \frac{1}{6}$, and $Pr(A \cap B) = Pr(B \cap A) = \frac{2}{36} = \frac{1}{18}$.
But now, instead of just asking about the probability of the occurrence of event B, we
go one step further. Here we want to determine the probability of the occurrence of event
B given the condition that event A has occurred. This conditional probability is denoted by
Pr(B|A) and may be determined as follows.
The occurrence of event A reduces the sample space from the 36 equally likely ordered pairs in $\mathcal{S}$ to the 10 equally likely ordered pairs in A. Among the ordered pairs in A, two are also doubles, namely (5, 5) and (6, 6). Consequently, the probability of B given A is
$Pr(B|A) = \frac{2}{10}$,
and we notice that $\frac{2}{10} = \frac{2/36}{10/36} = \frac{Pr(B \cap A)}{Pr(A)}$.
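The counting in Example 3.45 can be checked by brute-force enumeration. The following is a minimal Python sketch, not part of the text; the names `sample_space` and `pr` are illustrative assumptions, and the helper `pr` is valid only because all 36 outcomes are equally likely.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered pairs (x, y) for two fair dice.
sample_space = list(product(range(1, 7), repeat=2))

def pr(event):
    """Probability of an event (a set of outcomes) when outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

A = {(x, y) for (x, y) in sample_space if x + y >= 9}  # the sum is at least 9
B = {(x, y) for (x, y) in sample_space if x == y}      # a double is rolled

# Reduced sample space: count the doubles among the outcomes in A.
pr_B_given_A_by_counting = Fraction(len(A & B), len(A))

# Ratio form: Pr(B ∩ A) / Pr(A).
pr_B_given_A_by_ratio = pr(A & B) / pr(A)

print(pr(A), pr(B), pr(A & B))    # 5/18 1/6 1/18
print(pr_B_given_A_by_counting)   # 1/5
```

Both routes give 2/10 = 1/5, which is the agreement the example points out.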
Before we suggest the result at the end of Example 3.45 as a general formula, let us
consider a second example — one where the outcomes are not equally likely.
EXAMPLE 3.46
Lindsay has a coin that is biased with $Pr(H) = \frac{2}{3}$ and $Pr(T) = \frac{1}{3}$. She tosses this coin three times, where the result of each toss is independent of any preceding result. The eight possible outcomes in the sample space have the following probabilities:
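$Pr(HHH) = \left(\frac{2}{3}\right)^3 = \frac{8}{27}$
$Pr(HHT) = Pr(HTH) = Pr(THH) = \left(\frac{2}{3}\right)^2 \left(\frac{1}{3}\right) = \frac{4}{27}$
$Pr(HTT) = Pr(THT) = Pr(TTH) = \left(\frac{2}{3}\right) \left(\frac{1}{3}\right)^2 = \frac{2}{27}$
$Pr(TTT) = \left(\frac{1}{3}\right)^3 = \frac{1}{27}$
[Note that the sum of these probabilities is $\frac{8}{27} + 3\left(\frac{4}{27}\right) + 3\left(\frac{2}{27}\right) + \frac{1}{27} = \frac{27}{27} = 1$.]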
Consider the events
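A: The first toss results in a head [so A = {HTT, HTH, HHT, HHH} and
$Pr(A) = \frac{2}{27} + 2\left(\frac{4}{27}\right) + \frac{8}{27} = \frac{18}{27}$].
B: The number of heads is even [so B = {TTT, HHT, HTH, THH} and
$Pr(B) = \frac{1}{27} + 3\left(\frac{4}{27}\right) = \frac{13}{27}$].
Furthermore, $A \cap B$ = {HTH, HHT} and $Pr(B \cap A) = Pr(A \cap B) = \frac{4}{27} + \frac{4}{27} = \frac{8}{27}$.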
To determine the conditional probability of B given A, that is, $Pr(B|A)$, we’ll make A our new sample space and redefine the probability of the four outcomes in A as follows:
$Pr'(HTT) = \frac{Pr(HTT)}{Pr(A)} = \frac{2/27}{18/27} = \frac{1}{9}$        $Pr'(HTH) = \frac{Pr(HTH)}{Pr(A)} = \frac{4/27}{18/27} = \frac{2}{9}$
$Pr'(HHT) = \frac{Pr(HHT)}{Pr(A)} = \frac{4/27}{18/27} = \frac{2}{9}$        $Pr'(HHH) = \frac{Pr(HHH)}{Pr(A)} = \frac{8/27}{18/27} = \frac{4}{9}$
(We see that $Pr'(HTT) + Pr'(HTH) + Pr'(HHT) + Pr'(HHH) = \frac{1}{9} + \frac{2}{9} + \frac{2}{9} + \frac{4}{9} = 1$.)
Among the four outcomes in A, two of them satisfy the condition given in event B, namely HTH and HHT, the outcomes in $B \cap A$. Consequently,
$Pr(B|A) = Pr'(HTH) + Pr'(HHT) = \frac{2}{9} + \frac{2}{9} = \frac{4}{9} = \frac{8}{18} = \frac{8/27}{18/27} = \frac{Pr(B \cap A)}{Pr(A)}$.
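When outcomes are not equally likely, counting no longer suffices and each outcome must carry its own probability. The sketch below (an illustration only; `P_TOSS`, `prob`, `pr`, and `rescaled` are assumed names, not notation from the text) rebuilds Example 3.46 in Python with exact fractions and compares the rescaled sample space with the ratio Pr(B ∩ A)/Pr(A).

```python
from fractions import Fraction
from itertools import product

# Biased coin of Example 3.46.
P_TOSS = {"H": Fraction(2, 3), "T": Fraction(1, 3)}

# Probability of each of the eight outcomes; the tosses are independent,
# so the per-toss probabilities multiply.
prob = {}
for tosses in product("HT", repeat=3):
    p = Fraction(1)
    for t in tosses:
        p *= P_TOSS[t]
    prob["".join(tosses)] = p

def pr(event):
    """Probability of an event: the sum of the probabilities of its outcomes."""
    return sum((prob[o] for o in event), Fraction(0))

A = {o for o in prob if o[0] == "H"}             # the first toss is a head
B = {o for o in prob if o.count("H") % 2 == 0}   # the number of heads is even

# Route 1: rescale every outcome in A by Pr(A), then add those also in B.
rescaled = {o: prob[o] / pr(A) for o in A}
pr_B_given_A_rescaled = sum(rescaled[o] for o in A & B)

# Route 2: the ratio Pr(B ∩ A) / Pr(A).
pr_B_given_A_ratio = pr(A & B) / pr(A)

print(pr(A), pr(B), pr(A & B))                    # 2/3 13/27 8/27
print(pr_B_given_A_rescaled, pr_B_given_A_ratio)  # 4/9 4/9
```

Note that `Fraction` prints 18/27 in lowest terms as 2/3.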
Motivated by the final result in each of the last two examples, we now summarize the
underlying general procedure. We want a formula for Pr(B|A), the conditional probability
of the occurrence of event B given the occurrence of event A. Further, this formula should
help us avoid unnecessary calculations such as those in Example 3.46, where we recalculated
the probability of each outcome in A.
Now once we know that the event A has occurred, the sample space $\mathcal{S}$ shrinks to the
outcomes in A. If we divide the probability of each outcome in A by Pr(A), as in Example
3.46, these new probabilities sum to 1, so A can serve as the new sample
space. Further, suppose $e_1, e_2$ are two outcomes in $\mathcal{S}$ with $Pr(e_2) = k\,Pr(e_1)$, where k is
a constant. If $e_1, e_2 \in A$, then within the new sample space A the probability of $e_2$ is still k
times that of $e_1$.
To calculate Pr(B|A) we now consider those outcomes in event A that are in event B.
This gives us the outcomes in event $B \cap A$ and leads us to the following.
If $\mathcal{S}$ is the sample space for an experiment $\mathcal{E}$ and $A, B \subseteq \mathcal{S}$, then the conditional probability of B given A is
$Pr(B|A) = \frac{Pr(B \cap A)}{Pr(A)}$,
so long as $Pr(A) \neq 0$.
Further,
$Pr(B \cap A) = Pr(A \cap B) = Pr(A)Pr(B|A)$,
and upon changing the roles of A and B we have
$Pr(A \cap B) = Pr(B \cap A) = Pr(B)Pr(A|B)$.
The result
$Pr(A)Pr(B|A) = Pr(A \cap B) = Pr(B)Pr(A|B)$
is often called the multiplicative rule.
Without realizing it, we actually used the multiplicative rule in Example 3.39 — in the
case where the motors were not replaced after inspection. The first part of our next example
now reinforces how we use this rule.
EXAMPLE 3.47
A cooler contains seven cans of cola and three cans of root beer. Without looking at the contents, Gustavo reaches in and withdraws one can for his friend Jody. Then he reaches in again to get a can for himself.
Let A, B denote the events
A: The first selection is a can of cola.
B: The second selection is a can of cola.
a) Using the multiplicative rule, the probability that Gustavo chooses two cans of cola is
$Pr(A \cap B) = Pr(A)Pr(B|A) = \left(\frac{7}{10}\right)\left(\frac{6}{9}\right) = \frac{7}{15}$.
[Here $Pr(B|A) = 6/9$ because after the first can of cola is removed, the cooler then contains six cans of cola and three of root beer.]
b) The multiplicative rule and the additive rule (of Theorem 3.8) tell us that the probability Gustavo selects two cans of cola or two cans of root beer is
$Pr(A \cap B) + Pr(\overline{A} \cap \overline{B}) = Pr(A)Pr(B|A) + Pr(\overline{A})Pr(\overline{B}|\overline{A})$
$= \left(\frac{7}{10}\right)\left(\frac{6}{9}\right) + \left(\frac{3}{10}\right)\left(\frac{2}{9}\right) = \frac{8}{15}$.
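c) Finally, let us determine Pr(B). To do so we develop a new formula with the help of the Venn diagram (for a sample space $\mathcal{S}$ and events A, B) in Fig. 3.19. From the figure (and the laws of set theory) we see that $B = B \cap \mathcal{S} = B \cap (A \cup \overline{A}) = (B \cap A) \cup (B \cap \overline{A})$, where $(B \cap A) \cap (B \cap \overline{A}) = B \cap (A \cap \overline{A}) = B \cap \emptyset = \emptyset$. Consequently,
$Pr(B) = Pr(B \cap A) + Pr(B \cap \overline{A})$
$= Pr(A)Pr(B|A) + Pr(\overline{A})Pr(B|\overline{A})$
$= \left(\frac{7}{10}\right)\left(\frac{6}{9}\right) + \left(\frac{3}{10}\right)\left(\frac{7}{9}\right) = \frac{7}{10}$.
Figure 3.19 [Venn diagram for the sample space and the events A, B]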
The result at the end of Example 3.47, namely, for $A, B \subseteq \mathcal{S}$,
$Pr(B) = Pr(A)Pr(B|A) + Pr(\overline{A})Pr(B|\overline{A})$,
is referred to as the Law of Total Probability. Our next example shows how this result can
be generalized.
EXAMPLE 3.48
Emilio is a system integrator for personal computers. As such he finds himself using keyboards from three companies. Company 1 supplies 60% of the keyboards, company 2
supplies 30% of the keyboards, and the remaining 10% comes from company 3. From past
experience Emilio knows that 2% of company 1’s keyboards are defective, while the per-
centages of defective keyboards for companies 2, 3 are 3% and 5%, respectively. If one of
Emilio’s computers is selected, at random, and then tested, what is the probability it has a
defective keyboard?
Let A denote the event
A: The keyboard comes from company 1.
Events B, C are defined similarly for companies 2, 3, respectively. Event D, meanwhile, is
D: The keyboard is defective.
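Here we are interested in Pr(D). Guided by the Venn diagram in Fig. 3.20, we see that $D = D \cap \mathcal{S} = D \cap (A \cup B \cup C) = (D \cap A) \cup (D \cap B) \cup (D \cap C)$. But here $A \cap B = A \cap C = B \cap C = \emptyset$. So now, for example, the Laws of Set Theory show us that $(D \cap A) \cap (D \cap B) = D \cap (A \cap B) = D \cap \emptyset = \emptyset$. Likewise, $(D \cap A) \cap (D \cap C) = (D \cap B) \cap (D \cap C) = \emptyset$, and $(D \cap A) \cap (D \cap B) \cap (D \cap C) = \emptyset$. Consequently, by Theorem 3.9, we have
$Pr(D) = Pr(D \cap A) + Pr(D \cap B) + Pr(D \cap C)$
$= Pr(A)Pr(D|A) + Pr(B)Pr(D|B) + Pr(C)Pr(D|C)$.
(Here we have the Law of Total Probability for three sets; that is, the sample space $\mathcal{S}$ is the union of three sets, any two of which are disjoint.)
Figure 3.20 [Venn diagram for the sample space and the events A, B, C, D]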
From the information given at the start of this example we know that
Pr(A) = 0.6 Pr(B) = 0.3 Pr(C) = 0.1
Pr(D|A) = 0.02 Pr(D|B) = 0.03 Pr(D|C) = 0.05.
So Pr(D) = (0.6)(0.02) + (0.3)(0.03) + (0.1)(0.05) = 0.026, and this tells us that 2.6%
of the personal computers integrated by Emilio will have defective keyboards.
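The Law of Total Probability is a weighted sum, which makes it easy to evaluate mechanically. Here is a minimal Python sketch of the computation in Example 3.48; the dictionary names `prior` and `defect_rate` are illustrative assumptions, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

# Pr(A), Pr(B), Pr(C): the share of keyboards supplied by each company.
prior = {
    "company 1": Fraction(60, 100),
    "company 2": Fraction(30, 100),
    "company 3": Fraction(10, 100),
}

# Pr(D|A), Pr(D|B), Pr(D|C): the defect rate for each company.
defect_rate = {
    "company 1": Fraction(2, 100),
    "company 2": Fraction(3, 100),
    "company 3": Fraction(5, 100),
}

# The three events must partition the sample space.
assert sum(prior.values()) == 1

# Pr(D) = sum over the partition of Pr(company) * Pr(D | company).
pr_D = sum(prior[c] * defect_rate[c] for c in prior)

print(pr_D, float(pr_D))  # 13/500 0.026
```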
The next example takes us back to the situation in Example 3.48 and introduces us to Bayes’ Theorem. As with the Law of Total Probability, the situation here likewise generalizes; that is, when appropriate, Bayes’ Theorem may be applied to any sample space $\mathcal{S}$ that is decomposed into two or more events that are disjoint in pairs.
EXAMPLE 3.49
Referring back to the information in the preceding example, now we ask the question “If one of Emilio’s personal computers is found to have a defective keyboard, what is the probability that keyboard came from company 3?”
Using the notation in Example 3.48 we see that here the given condition is D and that
we want to find Pr(C|D).
$Pr(C|D) = \frac{Pr(C \cap D)}{Pr(D)} = \frac{Pr(C)Pr(D|C)}{Pr(A)Pr(D|A) + Pr(B)Pr(D|B) + Pr(C)Pr(D|C)}$
$= \frac{(0.1)(0.05)}{(0.6)(0.02) + (0.3)(0.03) + (0.1)(0.05)} = \frac{0.005}{0.026} = \frac{5}{26} \approx 0.192308$.
[Before leaving this example let us observe a small point. Since we have a choice on how to rewrite the numerator of $\frac{Pr(C \cap D)}{Pr(D)}$, do we know we’ve made the correct choice? Yes! The other choice, namely, $Pr(C \cap D) = Pr(D)Pr(C|D)$, would tell us that $Pr(C|D) = \frac{Pr(C \cap D)}{Pr(D)} = \frac{Pr(D)Pr(C|D)}{Pr(D)} = Pr(C|D)$, a correct but not very useful result.]
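Bayes’ Theorem can be applied to every event of the partition at once. The Python sketch below is illustrative only: the function name `posterior` and the dictionary keys are assumptions, not the book’s notation. It returns Pr(company | D) for each of the three companies in Example 3.49.

```python
from fractions import Fraction

def posterior(prior, likelihood):
    """Bayes' Theorem over a partition.

    prior[h]      -- Pr(h) for each event h of the partition
    likelihood[h] -- Pr(D | h) for the observed event D
    Returns Pr(h | D) for every h.
    """
    # Denominator: Pr(D), by the Law of Total Probability.
    evidence = sum(prior[h] * likelihood[h] for h in prior)
    return {h: prior[h] * likelihood[h] / evidence for h in prior}

prior = {"company 1": Fraction(6, 10), "company 2": Fraction(3, 10), "company 3": Fraction(1, 10)}
likelihood = {"company 1": Fraction(2, 100), "company 2": Fraction(3, 100), "company 3": Fraction(5, 100)}

post = posterior(prior, likelihood)
print(post["company 3"], round(float(post["company 3"]), 6))  # 5/26 0.192308
```

The three posterior probabilities (6/13, 9/26, and 5/26) sum to 1, as they must, since a defective keyboard came from exactly one of the three companies.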
Having dealt with the Law of Total Probability and Bayes’ Theorem, it is now time to
settle the issue of independence. In our work on conditional probability we learned earlier
that for events A, B, taken from a sample space $\mathcal{S}$, $Pr(A \cap B) = Pr(A)Pr(B|A)$. Should
the occurrence of event A have no effect on that of B, we have $Pr(B|A) = Pr(B)$, and
so event B is independent of event A. These considerations now guide us to the following.
Definition 3.12 Given a sample space $\mathcal{S}$ with events $A, B \subseteq \mathcal{S}$, we call A, B independent when
$Pr(A \cap B) = Pr(A)Pr(B)$.
For $A, B \subseteq \mathcal{S}$, the general situation has $Pr(B)Pr(A|B) = Pr(B \cap A) = Pr(A \cap B) = Pr(A)Pr(B|A)$. Using this and the result in Definition 3.12 we now have three ways to decide when A, B are independent:
1) $Pr(A \cap B) = Pr(A)Pr(B)$;
2) Pr(A|B) = Pr(A); or
3) Pr(B|A) = Pr(B).
We also realize that A is independent of B if and only if B is independent of A.
Our next example uses the preceding discussion to decide whether two events are inde-
pendent.
EXAMPLE 3.50
Suppose Arantxa tosses a fair coin three times. Here the sample space $\mathcal{S}$ = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}, where each outcome has probability $\frac{1}{8}$.
Consider the events
A: The first toss is H: A = {HHH, HHT, HTH, HTT} and $Pr(A) = \frac{1}{2}$;
B: The second toss is H: B = {HHH, HHT, THH, THT} and $Pr(B) = \frac{1}{2}$;
C: There are at least two H’s: C = {HHT, HTH, THH, HHH} and $Pr(C) = \frac{1}{2}$.
a) $A \cap B$ = {HHH, HHT}, so $Pr(A \cap B) = \frac{1}{4} = \left(\frac{1}{2}\right)\left(\frac{1}{2}\right) = Pr(A)Pr(B)$. Consequently, the events A, B are independent.
b) $A \cap C$ = {HHH, HHT, HTH}, so $Pr(A \cap C) = \frac{3}{8} \neq \left(\frac{1}{2}\right)\left(\frac{1}{2}\right) = Pr(A)Pr(C)$. Therefore, the events A, C are not independent.
c) Likewise, $Pr(B \cap C) = \frac{3}{8} \neq \left(\frac{1}{2}\right)\left(\frac{1}{2}\right) = Pr(B)Pr(C)$, so B, C are also not independent.
d) The event $\overline{B}$ = {TTT, TTH, HTT, HTH} and $Pr(\overline{B}) = \frac{1}{2}$. Further, $A \cap \overline{B}$ = {HTH, HTT} with $Pr(A \cap \overline{B}) = \frac{1}{4} = \left(\frac{1}{2}\right)\left(\frac{1}{2}\right) = Pr(A)Pr(\overline{B})$. So not only are the events A, B independent but the events A, $\overline{B}$ are also independent.
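The four checks in Example 3.50 all apply the same test, Pr(E ∩ F) = Pr(E)Pr(F). A short Python sketch of that test follows, assuming equally likely outcomes; the helper names `pr` and `independent` are illustrative, not from the text.

```python
from fractions import Fraction
from itertools import product

# The eight equally likely outcomes of three tosses of a fair coin.
sample_space = {"".join(t) for t in product("HT", repeat=3)}

def pr(event):
    return Fraction(len(event), len(sample_space))

def independent(E, F):
    """Definition 3.12: E, F are independent when Pr(E ∩ F) = Pr(E)Pr(F)."""
    return pr(E & F) == pr(E) * pr(F)

A = {o for o in sample_space if o[0] == "H"}         # the first toss is H
B = {o for o in sample_space if o[1] == "H"}         # the second toss is H
C = {o for o in sample_space if o.count("H") >= 2}   # at least two H's
B_complement = sample_space - B

print(independent(A, B))             # True
print(independent(A, C))             # False
print(independent(B, C))             # False
print(independent(A, B_complement))  # True
```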
The first part of the following theorem shows us that what has happened here in parts
(a) and (d) is not an isolated instance.
THEOREM 3.10 Let A, B be events taken from a sample space $\mathcal{S}$. If A, B are independent, then (a) A, $\overline{B}$ are independent; (b) $\overline{A}$, B are independent; and (c) $\overline{A}$, $\overline{B}$ are independent.
Proof: [We shall prove part (a) and leave the proofs of parts (b), (c) for the Section Exercises.]
Since $A = A \cap \mathcal{S} = A \cap (B \cup \overline{B}) = (A \cap B) \cup (A \cap \overline{B})$ and $(A \cap B) \cap (A \cap \overline{B}) = A \cap (B \cap \overline{B}) = A \cap \emptyset = \emptyset$, we have $Pr(A) = Pr(A \cap B) + Pr(A \cap \overline{B})$. With A, B independent, it follows that $Pr(A \cap B) = Pr(A)Pr(B)$. The last two equations imply that
$Pr(A \cap \overline{B}) = Pr(A) - Pr(A \cap B) = Pr(A) - Pr(A)Pr(B) = Pr(A)[1 - Pr(B)] = Pr(A)Pr(\overline{B})$.
Consequently, from Definition 3.12 we know that A, $\overline{B}$ are independent.
Our next example will help motivate the idea of independence for three events.
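EXAMPLE 3.51
Tino and Monica each roll a fair die. If we let x denote the result of Tino’s roll and y that of Monica’s, then once again $\mathcal{S} = \{(x, y) \mid 1 \le x, y \le 6\}$. Now consider the events A, B, C:
A: Tino rolls a 1, 2, or 6.
B: Monica rolls a 3, 4, 5, or 6.
C: The sum of Tino’s and Monica’s rolls is 7.
Here $Pr(A) = \frac{3}{6} = \frac{1}{2}$, $Pr(B) = \frac{4}{6} = \frac{2}{3}$, and $Pr(C) = \frac{6}{36} = \frac{1}{6}$. Further,
$A \cap B = \{(a, b) \mid a \in \{1, 2, 6\},\ b \in \{3, 4, 5, 6\}\}$, so $|A \cap B| = 12$ and $Pr(A \cap B) = \frac{12}{36} = \frac{1}{3} = \left(\frac{1}{2}\right)\left(\frac{2}{3}\right) = Pr(A)Pr(B)$, so A, B are independent;
$A \cap C = \{(1, 6), (2, 5), (6, 1)\}$ and $Pr(A \cap C) = \frac{3}{36} = \frac{1}{12} = \left(\frac{1}{2}\right)\left(\frac{1}{6}\right) = Pr(A)Pr(C)$, making A, C independent;
$B \cap C = \{(4, 3), (3, 4), (2, 5), (1, 6)\}$ and $Pr(B \cap C) = \frac{4}{36} = \frac{1}{9} = \left(\frac{2}{3}\right)\left(\frac{1}{6}\right) = Pr(B)Pr(C)$, so B, C are also independent.
Finally,
$A \cap B \cap C = \{(1, 6), (2, 5)\}$ and $Pr(A \cap B \cap C) = \frac{2}{36} = \frac{1}{18} = \left(\frac{1}{2}\right)\left(\frac{2}{3}\right)\left(\frac{1}{6}\right) = Pr(A)Pr(B)Pr(C)$.
What has happened in Example 3.51 leads us to the following.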
Definition 3.13 For a sample space $\mathcal{S}$ and events $A, B, C \subseteq \mathcal{S}$, we say that A, B, C are independent if
1) $Pr(A \cap B) = Pr(A)Pr(B)$;
2) $Pr(A \cap C) = Pr(A)Pr(C)$;
3) $Pr(B \cap C) = Pr(B)Pr(C)$; and
4) $Pr(A \cap B \cap C) = Pr(A)Pr(B)Pr(C)$.
Looking back now at Example 3.51 we see that there we verified the independence of the
events A, B, C. But did we do too much? In particular, do we really need condition (4) in
Definition 3.13? Perhaps we may feel that the first three conditions are enough to insure the
fourth condition. But, perhaps, they are not enough. The next example will help us settle
this issue.
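EXAMPLE 3.52
Adira tosses a fair coin four times. So in this case the sample space $\mathcal{S} = \{x_1x_2x_3x_4 \mid x_i \in \{H, T\},\ 1 \le i \le 4\}$.
Let $A, B, C \subseteq \mathcal{S}$ be the events:
A: Adira’s first toss is a tail (T);
B: Adira’s last toss is a tail (T); and
C: The four tosses yield two heads and two tails.
For these events we find that $Pr(A) = \frac{8}{16} = \frac{1}{2}$, $Pr(B) = \frac{8}{16} = \frac{1}{2}$, and $Pr(C) = \frac{1}{16}\binom{4}{2} = \frac{6}{16} = \frac{3}{8}$.
In addition,
$Pr(A \cap B) = \frac{4}{16} = \frac{1}{4} = \left(\frac{1}{2}\right)\left(\frac{1}{2}\right) = Pr(A)Pr(B)$,
$Pr(A \cap C) = \frac{3}{16} = \left(\frac{1}{2}\right)\left(\frac{3}{8}\right) = Pr(A)Pr(C)$, and
$Pr(B \cap C) = \frac{3}{16} = \left(\frac{1}{2}\right)\left(\frac{3}{8}\right) = Pr(B)Pr(C)$.
However, $A \cap B \cap C$ = {THHT} and $Pr(A \cap B \cap C) = \frac{1}{16} \neq \frac{3}{32} = \left(\frac{1}{2}\right)\left(\frac{1}{2}\right)\left(\frac{3}{8}\right) = Pr(A)Pr(B)Pr(C)$. So while the three events in Example 3.51 are independent, the three events in this example are (mutually) independent in pairs, but not independent.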
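The contrast between Examples 3.51 and 3.52 can be reproduced by testing the product condition for every subcollection of two or more events. The Python sketch below is an illustration under the assumption of equally likely outcomes; the function name `failing_conditions` is hypothetical. It returns the index tuples of the conditions that fail, so an empty list means the events are independent.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod  # available in Python 3.8+

def pr(event, space):
    return Fraction(len(event), len(space))

def failing_conditions(events, space):
    """Check Pr(intersection) == product of Pr's for every subcollection
    of two or more events; return the index tuples where this fails."""
    failures = []
    for r in range(2, len(events) + 1):
        for idx in combinations(range(len(events)), r):
            intersection = set(space)
            for i in idx:
                intersection &= events[i]
            if pr(intersection, space) != prod(pr(events[i], space) for i in idx):
                failures.append(idx)
    return failures

# Example 3.51: Tino's roll x and Monica's roll y.
dice = set(product(range(1, 7), repeat=2))
events_351 = [
    {(x, y) for (x, y) in dice if x in (1, 2, 6)},
    {(x, y) for (x, y) in dice if y in (3, 4, 5, 6)},
    {(x, y) for (x, y) in dice if x + y == 7},
]

# Example 3.52: four tosses of a fair coin.
tosses = {"".join(t) for t in product("HT", repeat=4)}
events_352 = [
    {o for o in tosses if o[0] == "T"},
    {o for o in tosses if o[3] == "T"},
    {o for o in tosses if o.count("H") == 2},
]

print(failing_conditions(events_351, dice))    # []
print(failing_conditions(events_352, tosses))  # [(0, 1, 2)]
```

For Example 3.52 only the triple condition fails: the events are independent in pairs but not independent.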
In closing this section we provide a summary of the probability rules and laws we have
learned in this and the preceding section.
Summary of Probability Rules and Laws
1) The Rule of Complement: $Pr(\overline{A}) = 1 - Pr(A)$
2) The Additive Rule: $Pr(A \cup B) = Pr(A) + Pr(B) - Pr(A \cap B)$.
When A, B are disjoint, $Pr(A \cup B) = Pr(A) + Pr(B)$.
3) Conditional Probability: $Pr(A|B) = \frac{Pr(A \cap B)}{Pr(B)}$, $Pr(B) \neq 0$
4) Multiplicative Rule: $Pr(A)Pr(B|A) = Pr(A \cap B) = Pr(B)Pr(A|B)$.
When A, B are independent, $Pr(A \cap B) = Pr(A)Pr(B)$.
5) The Law of Total Probability: $Pr(B) = Pr(A)Pr(B|A) + Pr(\overline{A})Pr(B|\overline{A})$
6) The Law of Total Probability (Extended Version): If $A_1, A_2, \ldots, A_n \subseteq \mathcal{S}$, where $n \ge 3$, $A_i \cap A_j = \emptyset$ for all $1 \le i < j \le n$, and $\mathcal{S} = \bigcup_{i=1}^{n} A_i$, then for any event B,
$Pr(B) = Pr(A_1)Pr(B|A_1) + \cdots + Pr(A_n)Pr(B|A_n) = \sum_{i=1}^{n} Pr(A_i)Pr(B|A_i)$.
7) Bayes’ Theorem: $Pr(A|B) = \frac{Pr(A \cap B)}{Pr(B)} = \frac{Pr(A)Pr(B|A)}{Pr(A)Pr(B|A) + Pr(\overline{A})Pr(B|\overline{A})}$
8) Bayes’ Theorem (Extended Version): If $A_1, A_2, \ldots, A_n \subseteq \mathcal{S}$, where $n \ge 3$, $A_i \cap A_j = \emptyset$ for all $1 \le i < j \le n$, and $\mathcal{S} = \bigcup_{i=1}^{n} A_i$, then for any event B, and each $1 \le k \le n$,
$Pr(A_k|B) = \frac{Pr(A_k \cap B)}{Pr(B)} = \frac{Pr(A_k)Pr(B|A_k)}{Pr(A_1)Pr(B|A_1) + \cdots + Pr(A_n)Pr(B|A_n)} = \frac{Pr(A_k)Pr(B|A_k)}{\sum_{i=1}^{n} Pr(A_i)Pr(B|A_i)}$
EXERCISES 3.6
1. Recall that in a standard deck of 52 cards there are 12 picture cards: four each of jacks, queens, and kings. Kevin draws one card from the deck. Find the probability his card is a king if we know that the card drawn is an ace or a picture card.
2. Let A, B be events taken from a sample space $\mathcal{S}$. If Pr(A) = 0.6, Pr(B) = 0.4, and $Pr(A \cup B) = 0.7$, find Pr(A|B) and Pr(A|B).
3. If Coach Mollet works his football team throughout August, then the probability the team will be the division champion is 0.75. The probability the coach will work his team throughout August is 0.80. What is the probability Coach Mollet works his team throughout August and the team finishes as the division champion?
4. The 420 freshmen at an engineering college take either calculus or discrete mathematics (but not both). Further, both courses are offered providing either an introduction to a CAS (computer algebra system) or using such a system extensively throughout the course. The results in Table 3.6 summarize how the 420 freshmen are distributed.
Table 3.6
                        CAS (Introduction)    CAS (Extensive Coverage)
Calculus                170                   120
Discrete Mathematics    80                    50
a) If Sandrine is taking calculus, what is the probability her class is only being introduced to the use of a CAS?
b) Derek’s class is making extensive use of the CAS. What is the probability Derek is taking discrete mathematics?
5. Let $\mathcal{S}$ be the sample space for an experiment $\mathcal{E}$ and let A, B be events from $\mathcal{S}$. If A, B are independent, prove that
$Pr(A \cup B) = Pr(A) + Pr(\overline{A})Pr(B) = Pr(B) + Pr(\overline{B})Pr(A)$.
6. Ceilia tosses a fair coin five times. What is the probability she gets three heads, if the first toss results in (a) a head; (b) a tail?
7. One bag contains 15 identical (in shape) coins: nine of silver and six of gold. A second bag contains 16 more of these coins: six silver and 10 gold. Bruno reaches in and selects one coin from the first bag and then places it in the second bag. Then Madeleine selects one coin from this second bag.
a) What is the probability Madeleine selected a gold coin?
b) If Madeleine’s coin is gold, what is the probability Bruno had selected a gold coin?
8. A coin is loaded so that Pr(H) = 2/3 and Pr(T) = 1/3. Todd tosses this coin twice. Let A, B be the events
A: The first toss is a tail.
B: Both tosses are the same.
Are A, B independent?
9. Suppose that A, B are independent with $Pr(A \cup B) = 0.6$ and Pr(A) = 0.3. Find Pr(B).
10. Alice tosses a fair coin seven times. Find the probability she gets four heads given that (a) her first toss is a head; (b) her first and last tosses are heads.
11. Paulo tosses a fair coin five times. If A, B denote the events
A: Paulo gets an odd number of tails.
B: Paulo’s first toss is a tail.
are A, B independent?
12. The probability that a certain mechanical component fails when first used is 0.05. If the component does not fail immediately, the probability it will function correctly for at least one year is 0.98. What is the probability that a new component functions correctly for at least one year?
13. Paul has two coolers. The first contains eight cans of cola and three cans of lemonade. The second cooler contains five cans of cola and seven cans of lemonade. Paul randomly selects one can from the first cooler and puts it into the second cooler. Five minutes later Betty randomly selects two cans from the second cooler. If both of Betty’s selections are cans of cola, what is the probability Paul initially selected a can of lemonade?
14. Let $\mathcal{S}$ be the sample space for an experiment $\mathcal{E}$ and let $A, B, C \subseteq \mathcal{S}$. If events A, B are independent, events A, C are disjoint, and events B, C are independent, find Pr(B) if Pr(A) = 0.2, Pr(C) = 0.4, and $Pr(A \cup B \cup C) = 0.8$.
15. An electronic system is made up of two components connected in parallel. Consequently, the system fails only when both of the components fail. The probability the first component fails is 0.05 and, when this happens, the probability the second component fails is 0.02. What is the probability the electronic system fails?
16. Gayla has a bag of 19 marbles of the same size. Nine of these marbles are red, six blue, and four white. She randomly selects three of the marbles, without replacement, from the bag. What is the probability Gayla has withdrawn more red than white marbles?
17. Let A, B, C be independent events taken from a sample space $\mathcal{S}$. If Pr(A) = 1/8, Pr(B) = 1/4, and $Pr(A \cup B \cup C) = 1/2$, find Pr(C).
18. A company involved in the integration of personal computers gets its graphics cards from three sources. The first source provides 20% of the cards, the second source 35%, and the third source 45%. Past experience has shown that 5% of the cards from the first source are found to be defective, while those from the second and third sources are found to be defective 3% and 2%, respectively, of the time.
a) What percentage of the company’s graphics cards are defective?
b) If a graphics card is selected and found to be defective, what is the probability it was provided by the third source?
19. Gustavo tosses a fair coin twice. For this experiment consider the following events:
A: The first toss is a head.
B: The second toss is a tail.
C: The tosses result in one head and one tail.
Are the events A, B, and C independent?
20. Three missiles are fired at an enemy arsenal. The probabilities the individual missiles will hit the arsenal are 0.75, 0.85, and 0.9. Find the probability that at least two of the missiles hit the arsenal.
21. Dustin and Jennifer each toss three fair coins. What is the probability (a) each of them gets the same number of heads? (b) Dustin gets more heads than Jennifer? (c) Jennifer gets more heads than Dustin?
22. Tiffany and four of her cousins play the game of “odd person out” to determine who will rake up the leaves at their grandmother Mary Lou’s home. Each cousin tosses a fair coin. If the outcome for one cousin is different from that of the other four, then this cousin has to rake the leaves. What is the probability that a “lucky” cousin is determined after the coins are flipped only once?
23. Ninety percent of new airport-security personnel have had prior training in weapon detection. During their first month on the job, personnel without prior training fail to detect a weapon 3% of the time, while those with prior training fail only 0.5% of the time. What is the probability a new airport-security employee, who fails to detect a weapon during the first month on the job, has had prior training in weapon detection?
24. The binary string 101101, where the string is unchanged upon reversing order, is called a palindrome (of length 6). Suppose a binary string of length 6 is randomly generated, with 0, 1 equally likely for each of the six positions in the string. What is the probability the string is a palindrome if the first and sixth bits (a) are both 1; (b) are the same?
25. In defining the notion of independence for three events we found (in Definition 3.13) that we had to check four conditions. If there are four events, say $E_1, E_2, E_3, E_4$, then we have to check 11 conditions: six of the form $Pr(E_i \cap E_j) = Pr(E_i)Pr(E_j)$, $1 \le i < j \le 4$; four of the form $Pr(E_i \cap E_j \cap E_k) = Pr(E_i)Pr(E_j)Pr(E_k)$, $1 \le i < j < k \le 4$; and $Pr(E_1 \cap E_2 \cap E_3 \cap E_4) = Pr(E_1)Pr(E_2)Pr(E_3)Pr(E_4)$. (a) How many conditions need to be checked for the independence of five events? (b) How many for n events, where $n \ge 2$?
26. Let A, B be events taken from a sample space $\mathcal{S}$. If $Pr(A \cap B) = 0.1$ and $Pr(\overline{A} \cap \overline{B}) = 0.3$, what is $Pr(A \,\triangle\, B \mid A \cup B)$?
27. Urn 1 contains 14 envelopes (of the same size): six each contain $1 and the other eight each contain $5. Urn 2 contains eight envelopes (of the same size as those in urn 1): three each contain $1 and the other five each contain $5. Three envelopes are randomly selected from urn 1 and transferred to urn 2. If Carmen now draws one envelope from urn 2, what is the probability her selection contains $1?
28. Let A, B be events taken from a sample space $\mathcal{S}$ (with Pr(A) > 0 and Pr(B) > 0). If Pr(B|A) < Pr(B), prove that Pr(A|B) < Pr(A).
29. Let A, B be events taken from a sample space $\mathcal{S}$. If Pr(A) = 0.5, Pr(B) = 0.3, and Pr(A|B) + Pr(B|A) = 0.8, what is $Pr(A \cap B)$?
30. Let $\mathcal{S}$ be the sample space for an experiment $\mathcal{E}$, with events $A, B \subseteq \mathcal{S}$. If $Pr(A|B) = Pr(A \,\triangle\, B) = 0.5$ and $Pr(A \cup B) = 0.7$, determine Pr(A) and Pr(B).
3.7 Discrete Random Variables (Optional)
In this section we introduce a fundamental idea in the study of probability and statistics —
namely, the random variable. Since we are dealing exclusively with discrete sample spaces,
we shall deal only with discrete random variables. Consequently, whenever the term ran-
dom variable arises, it is understood that it is a discrete random variable — that is, a random
variable defined for a discrete sample space. [Those interested in continuous random vari-
ables should consult the chapter references. Chapter 3 of the text by John J. Kinney [7] is
an excellent starting point.]
We introduce the concept of a random variable in an informal way. The following
example will help us do this.
EXAMPLE 3.53
If Keshia tosses a fair coin four times, the sample space for this random experiment may be given as
$\mathcal{S}$ = {HHHH,
HHHT, HHTH, HTHH, THHH,
HHTT, HTHT, HTTH, THHT, THTH, TTHH,
HTTT, THTT, TTHT, TTTH,
TTTT}.
Now, for each of the 16 strings of H’s and T’s in $\mathcal{S}$, we define the random variable X as follows:
For $x_1x_2x_3x_4 \in \mathcal{S}$, $X(x_1x_2x_3x_4)$ counts the number of H’s that appear among the four components $x_1, x_2, x_3, x_4$. Consequently,
X (HHHH) = 4,
X (HHHT) = X (HHTH) = X (HTHH) = X (THHH) = 3,
X (HHTT) = X (HTHT) = X(HTTH) = X(THHT) = X(THTH) = X(TTHH) = 2,
X (HTTT) = X(THTT) = X(TTHT) = X(TTTH) = 1, and
X(TTTT) = 0.
We see that X associates† each of the 16 strings of H’s and T’s in $\mathcal{S}$ with one of the nonnegative integers in {0, 1, 2, 3, 4} (a subset of R). This allows us to think of an outcome in $\mathcal{S}$ in terms of a real number. Further, suppose we are interested in the event
A: the four tosses result in two H’s and two T’s.
† This association by X between the strings in $\mathcal{S}$ and the nonnegative integers 0, 1, 2, 3, 4 is an example of a function, an idea to be covered in detail in Chapter 5. In general, a random variable is a function from the sample space $\mathcal{S}$ of an experiment $\mathcal{E}$ to R, the set of real numbers. The domain of any random variable X is $\mathcal{S}$ and the codomain is always R. The range in this case is {0, 1, 2, 3, 4}. (The concepts of domain, codomain, and range are formally defined in Section 5.2.)
In our earlier work we might have described this event by writing
A = {HHTT, HTHT, HTTH, THHT, THTH, TTHH}.
Now we can summarize the six outcomes in this event by writing $A = \{x_1x_2x_3x_4 \mid X(x_1x_2x_3x_4) = 2\}$, and this may be abbreviated to $A = \{x_1x_2x_3x_4 \mid X = 2\}$. Also,
we express Pr(A), in terms of the random variable X, as Pr(X = 2). So here we have
Pr(A) = Pr(X = 2) = 6/16 = 3/8. Similarly, it follows that Pr(X = 4) = 1/16 since
there is only one outcome for this case — namely, HHHH.
The following provides what we call the probability distribution for this particular ran-
dom variable X.
x Pr(X =x)
0 1/16
1 4/16 = 1/4
2 6/16 = 3/8
3 4/16 = 1/4
4 1/16
Observe how $\sum_{x=0}^{4} Pr(X = x) = 1$, in agreement with axiom (2) of Section 3.5. Also, it
is understood that Pr(X = x) = 0 for $x \neq 0, 1, 2, 3, 4$.
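Since a random variable is a function on the sample space, its probability distribution can be tabulated by applying that function to every outcome and grouping equal values. A minimal Python sketch for Example 3.53 follows; the names `X`, `counts`, and `distribution` are illustrative assumptions, and equal likelihood of the 16 outcomes is used.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# The 16 equally likely strings of H's and T's.
sample_space = ["".join(t) for t in product("HT", repeat=4)]

def X(outcome):
    """The random variable: the number of H's in the outcome."""
    return outcome.count("H")

# Group the outcomes by the value X assigns to them.
counts = Counter(X(o) for o in sample_space)
distribution = {x: Fraction(c, len(sample_space)) for x, c in sorted(counts.items())}

for x, p in distribution.items():
    print(x, p)
# 0 1/16
# 1 1/4
# 2 3/8
# 3 1/4
# 4 1/16
```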
Let us now reinforce what we have learned by considering a second example.
EXAMPLE 3.54
Suppose Giorgio rolls a pair of fair dice. This experiment was examined earlier, for instance, in Examples 3.33 and 3.45. The sample space here comprises 36 ordered pairs and may be expressed as $\mathcal{S} = \{(x, y) \mid 1 \le x \le 6,\ 1 \le y \le 6\}$.
We define the random variable X, for each ordered pair (x, y) in $\mathcal{S}$, by X((x, y)) = x + y, the sum of the numbers that appear on the (tops of) two fair dice. Then X takes on
the following values:
X((1, 1)) = 2
X((1, 2)) = X((2, 1)) = 3
X((1, 3)) = X((2, 2)) = X((3, 1)) = 4
X((1, 4)) = X((2, 3)) = X((3, 2)) = X((4, 1)) = 5
X((1, 5)) = X((2, 4)) = X((3, 3)) = X((4, 2)) = X((5, 1)) = 6
X((1, 6)) = X((2, 5)) = X((3, 4)) = X((4, 3)) = X((5, 2)) = X((6, 1)) = 7
X((2, 6)) = X((3, 5)) = X((4, 4)) = X((5, 3)) = X((6, 2)) = 8
X((3, 6)) = X((4, 5)) = X((5, 4)) = X((6, 3)) = 9
X((4, 6)) = X((5, 5)) = X((6, 4)) = 10
X((5, 6)) = X((6, 5)) = 11
X((6, 6)) = 12
The probability distribution for X is as follows:
Pr(X = 2) = 1/36 Pr(X = 6) = 5/36 Pr(X = 10) = 3/36
Pr(X = 3) = 2/36 Pr(X = 7) = 6/36 Pr(X = 11) = 2/36
Pr(X = 4) = 3/36 Pr(X = 8) = 5/36 Pr(X = 12) = 1/36
Pr(X =5) = 4/36 Pr(X = 9) = 4/36
This can be abbreviated somewhat by
$Pr(X = x) = \begin{cases} \dfrac{x - 1}{36}, & x = 2, 3, 4, 5, 6, 7 \\[6pt] \dfrac{12 - (x - 1)}{36}, & x = 8, 9, 10, 11, 12. \end{cases}$
Note that $\sum_{x=2}^{12} Pr(X = x) = 1$.
Having finished with describing X and its probability distribution, now let us consider
the events:
B: Giorgio rolls an 8 — that is, the sum of the two dice is 8.
C: Giorgio rolls at least a 10.
The event B = {(2, 6), (3, 5), (4, 4), (5, 3), (6, 2)} and Pr(B) = Pr(X = 8) = 5/36.
Meanwhile C = {(4, 6), (5, 5), (6, 4), (5, 6), (6, 5), (6, 6)} and $Pr(C) = 6/36 = 3/36 + 2/36 + 1/36 = Pr(X = 10) + Pr(X = 11) + Pr(X = 12) = Pr(10 \le X \le 12) = \sum_{x=10}^{12} Pr(X = x)$.
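The abbreviated two-case formula for Pr(X = x) can be compared against direct enumeration of the 36 ordered pairs. The Python sketch below does this and also recomputes Pr(C) as a sum over x = 10, 11, 12; the function names `pr_X_equals` and `closed_form` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

sample_space = list(product(range(1, 7), repeat=2))

def X(outcome):
    """The random variable: the sum of the two dice."""
    x, y = outcome
    return x + y

def pr_X_equals(value):
    """Pr(X = value) by counting equally likely ordered pairs."""
    favorable = [o for o in sample_space if X(o) == value]
    return Fraction(len(favorable), len(sample_space))

def closed_form(x):
    """The two-case formula from the example."""
    if 2 <= x <= 7:
        return Fraction(x - 1, 36)
    if 8 <= x <= 12:
        return Fraction(12 - (x - 1), 36)
    return Fraction(0)

formula_matches = all(pr_X_equals(x) == closed_form(x) for x in range(2, 13))
pr_C = sum(pr_X_equals(x) for x in range(10, 13))  # Pr(10 <= X <= 12)

print(formula_matches, pr_X_equals(8), pr_C)  # True 5/36 1/6
```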
The preceding two examples have shown us how a random variable may be described by
its probability distribution. Now we shall see how a random variable can be characterized
by means of two measures —its expected value, a measure of central tendency, and its
variance, a measure of dispersion.
When a fair coin is tossed 10 times, our intuition may suggest that we expect to get
five heads and five tails. Yet we know that we could actually see 10 heads, although the
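probability for this outcome is only $\binom{10}{10}\left(\frac{1}{2}\right)^{10} = \frac{1}{1024} \approx 0.000977$, while the probability
for five heads and five tails is substantially higher as $\binom{10}{5}\left(\frac{1}{2}\right)^{5}\left(\frac{1}{2}\right)^{5} = \frac{252}{1024} \approx 0.246094$.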
Similarly, we may want to know how many times we might expect to see a 6 when a fair
die is rolled 50 times. To deal with such concerns we introduce the following idea.
Definition 3.14 Let X be a random variable defined for the outcomes in a sample space $\mathcal{S}$. The mean, or expected value, of X is
$E(X) = \sum_{x} x \cdot Pr(X = x)$,
where the sum is taken over all the values x determined by the random variable X.†
The following example deals with E(X) in several different situations.
† One finds the terms mean (value) and expectation also used to describe E(X), as well as the alternate notation $\mu_X$. Further, although our discussion deals solely with finite sample spaces, the above formula is valid for countably infinite sample spaces, so long as the infinite sum converges.
EXAMPLE 3.55
a) If a fair coin is tossed once and X counts the number of heads that appear, then $\mathcal{S}$ = {H, T}, X(H) = 1, X(T) = 0, $Pr(X = 0) = Pr(X = 1) = \frac{1}{2}$, and
$E(X) = \sum_{x=0}^{1} x \cdot Pr(X = x) = 0 \cdot \frac{1}{2} + 1 \cdot \frac{1}{2} = \frac{1}{2}$.
Note that E(X) is neither 0 nor 1.
b) If one fair die is rolled, then $\mathcal{S}$ = {1, 2, 3, 4, 5, 6}. Further, for each $1 \le i \le 6$, we have X(i) = i and Pr(X = i) = 1/6. So here
$E(X) = \sum_{x=1}^{6} x \cdot Pr(X = x) = 1 \cdot \frac{1}{6} + 2 \cdot \frac{1}{6} + 3 \cdot \frac{1}{6} + 4 \cdot \frac{1}{6} + 5 \cdot \frac{1}{6} + 6 \cdot \frac{1}{6} = \frac{1}{6}(1 + 2 + \cdots + 6) = \frac{21}{6} = \frac{7}{2}$.
Note, once again, that E(X) is not among the values determined by the random variable X.
c) Suppose now we have a loaded die, where the probability of rolling the number i is proportional to i. As in part (b), $\mathcal{S}$ = {1, 2, 3, 4, 5, 6} and X(i) = i for $1 \le i \le 6$. However, here, if p is the probability of rolling 1, then ip is the probability of rolling i, for each of the other five outcomes i, where $2 \le i \le 6$. From axiom (2), $1 = \sum_{i=1}^{6} ip = p(1 + 2 + \cdots + 6) = 21p$, so p = 1/21 and Pr(X = i) = i/21, $1 \le i \le 6$. Consequently,
$E(X) = \sum_{x=1}^{6} x \cdot Pr(X = x) = 1 \cdot \frac{1}{21} + 2 \cdot \frac{2}{21} + \cdots + 6 \cdot \frac{6}{21} = \frac{1 + 4 + 9 + 16 + 25 + 36}{21} = \frac{91}{21} = \frac{13}{3}$.
d) Consider the random variable X in Example 3.53, where a fair coin was tossed four times. Then here
$E(X) = \sum_{x=0}^{4} x \cdot Pr(X = x) = 0 \cdot \frac{1}{16} + 1 \cdot \frac{4}{16} + 2 \cdot \frac{6}{16} + 3 \cdot \frac{4}{16} + 4 \cdot \frac{1}{16} = \frac{0 + 4 + 12 + 12 + 4}{16} = \frac{32}{16} = 2$.
In this case E(X) is found to be among the values determined by the random variable X.
e) Finally, for Example 3.54, where Giorgio rolled a pair of fair dice, we find that
$E(X) = 2 \cdot \frac{1}{36} + 3 \cdot \frac{2}{36} + \cdots + 7 \cdot \frac{6}{36} + 8 \cdot \frac{5}{36} + \cdots + 11 \cdot \frac{2}{36} + 12 \cdot \frac{1}{36} = \frac{252}{36} = 7$.
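All five parts of Example 3.55 apply the same formula from Definition 3.14 to different distributions. The following Python sketch is illustrative only; the function name `expected_value` and the dictionary names are assumptions. It evaluates the sum exactly for each part.

```python
from fractions import Fraction
from math import comb

def expected_value(distribution):
    """E(X): the sum of x * Pr(X = x) over the values x taken by X.

    `distribution` maps each value x to Pr(X = x).
    """
    return sum(x * p for x, p in distribution.items())

one_coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}                       # part (a)
fair_die = {i: Fraction(1, 6) for i in range(1, 7)}                     # part (b)
loaded_die = {i: Fraction(i, 21) for i in range(1, 7)}                  # part (c)
four_tosses = {x: Fraction(comb(4, x), 16) for x in range(5)}           # part (d)
two_dice = {s: Fraction(6 - abs(s - 7), 36) for s in range(2, 13)}      # part (e)

print(expected_value(one_coin))     # 1/2
print(expected_value(fair_die))     # 7/2
print(expected_value(loaded_die))   # 13/3
print(expected_value(four_tosses))  # 2
print(expected_value(two_dice))     # 7
```

The expression `6 - abs(s - 7)` is a compact way to count the ordered pairs with sum s; it agrees with the two-case formula of Example 3.54.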
Before continuing let us recall from Section 3.5 that a Bernoulli trial is an experiment with exactly two outcomes: success, with probability p, and failure, with probability q = 1 − p. When such an experiment is performed n times, and the outcome of any one trial is independent of the outcomes of any previous trials, then the probability that there are (exactly) k successes among the n trials is $\binom{n}{k} p^k q^{n-k}$, $0 \le k \le n$.
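Now if we consider the sample space of all $2^n$ possibilities for the n outcomes of these n Bernoulli trials, then we can define the random variable X, where X counts the number of successes among the n trials. Under these circumstances X is called a binomial random variable and
$Pr(X = x) = \binom{n}{x} p^x q^{n-x}$, $x = 0, 1, 2, \ldots, n$.
This probability distribution is called the binomial probability distribution and it is completely determined by the values of n and p. Further, it is precisely the type of probability distribution that occurs in Example 3.53, where we regard an H as a success and find that
$Pr(X = 0) = \frac{1}{16} = \binom{4}{0}\left(\frac{1}{2}\right)^0\left(\frac{1}{2}\right)^4$        $Pr(X = 3) = \frac{4}{16} = \binom{4}{3}\left(\frac{1}{2}\right)^3\left(\frac{1}{2}\right)^1$
$Pr(X = 1) = \frac{4}{16} = \binom{4}{1}\left(\frac{1}{2}\right)^1\left(\frac{1}{2}\right)^3$        $Pr(X = 4) = \frac{1}{16} = \binom{4}{4}\left(\frac{1}{2}\right)^4\left(\frac{1}{2}\right)^0$
$Pr(X = 2) = \frac{6}{16} = \binom{4}{2}\left(\frac{1}{2}\right)^2\left(\frac{1}{2}\right)^2$
The five previous results can be summarized by
$Pr(X = x) = \binom{4}{x}\left(\frac{1}{2}\right)^x\left(\frac{1}{2}\right)^{4-x}$, $x = 0, 1, \ldots, 4$.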
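The binomial probability distribution translates directly into a one-line function. A minimal Python sketch follows, where the function name `binomial_pmf` is an illustrative assumption and `Fraction` is used so the results are exact.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p, x):
    """Pr(X = x) = C(n, x) * p^x * q^(n - x), where q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

half = Fraction(1, 2)

# The distribution of Example 3.53: n = 4, p = 1/2.
four_tosses = [binomial_pmf(4, half, x) for x in range(5)]
print([str(p) for p in four_tosses])  # ['1/16', '1/4', '3/8', '1/4', '1/16']

# Five heads in 10 tosses of a fair coin.
print(binomial_pmf(10, half, 5))      # 63/256
```

Here 63/256 is 252/1024 in lowest terms, about 0.246094.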
But why should we be bringing all of this up here in the discussion on the expected value of a random variable? At this point notice that in part (d) of Example 3.55 we found E(X) = 2, where X is the binomial random variable described above. For this binomial random variable X we have n = 4 and p = 1/2. Is it just a coincidence here that E(X) = 2 = (4)(1/2) = np?
Suppose we were to roll a fair die 12 times and ask for the number of times we expect to see a 5 come up. Here the binomial random variable X would count the number of times a 5 is rolled among the 12 rolls. Our intuition might suggest the answer is 2 = (12)(1/6) = np. But is this once again E(X) for this binomial random variable X? Instead of verifying this result directly, by using the formula in Definition 3.14, we shall obtain the result from the following theorem.
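THEOREM 3.11 Let X be the binomial random variable that counts the number of successes, each with
probability p, among n Bernoulli trials. Then E(X) = np.
Proof: From Definition 3.14 we have
\[
E(X) = \sum_{x=0}^{n} x \cdot \Pr(X = x) = \sum_{x=0}^{n} x \binom{n}{x} p^x q^{n-x},
\]
where $q = 1 - p$. Since $x \binom{n}{x} p^x q^{n-x} = 0$ when x = 0, it follows that
\[
\begin{aligned}
E(X) &= \sum_{x=1}^{n} x \binom{n}{x} p^x q^{n-x} = \sum_{x=1}^{n} x \cdot \frac{n!}{x!\,(n-x)!}\, p^x q^{n-x} \\
&= \sum_{x=1}^{n} \frac{n!}{(x-1)!\,(n-x)!}\, p^x q^{n-x} = np \sum_{x=1}^{n} \frac{(n-1)!}{(x-1)!\,(n-x)!}\, p^{x-1} q^{n-x} \\
&= np \sum_{y=0}^{n-1} \frac{(n-1)!}{y!\,(n-(y+1))!}\, p^{y} q^{n-(y+1)}, \quad \text{upon substituting } y = x - 1 \text{ and realizing that } y \text{ varies from } 0 \text{ to } n-1 \text{ when } x \text{ varies from } 1 \text{ to } n \\
&= np \sum_{y=0}^{n-1} \binom{n-1}{y} p^{y} q^{(n-1)-y} = np\,(p+q)^{n-1}, \quad \text{by the binomial theorem} \\
&= np, \quad \text{since } p + q = 1.
\end{aligned}
\]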
As a result of Theorem 3.11 we now know that upon rolling a fair die 12 times the number
of 5’s we expect to see is (12)(1/6) = 2, as our intuition suggested earlier. Better still,
should we roll this fair die 1200 times and let the random variable Y count the number
of 5’s that appear, then Y is a binomial random variable with n = 1200, p = 1/6, and
$\Pr(Y = y) = \binom{1200}{y}\left(\frac{1}{6}\right)^{y}\left(\frac{5}{6}\right)^{1200-y}$, y = 0, 1, 2, ..., 1200. Further, instead of trying to
determine E(Y) by actually calculating $\sum_{y=0}^{1200} y \binom{1200}{y}\left(\frac{1}{6}\right)^{y}\left(\frac{5}{6}\right)^{1200-y}$, we obtain E(Y) =
np = (1200)(1/6) = 200, quite readily from Theorem 3.11.
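The claim E(X) = np can also be checked numerically. The following Python sketch is an illustration added here, not part of the original text; the function names are hypothetical. It evaluates the sum in Definition 3.14 with exact rational arithmetic for the two die-rolling cases above and compares each result with np.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p, x):
    """Pr(X = x) for a binomial random variable with parameters n and p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def expected_value_by_definition(n, p):
    """E(X) computed directly as the sum of x * Pr(X = x), x = 0, ..., n."""
    return sum(x * binomial_pmf(n, p, x) for x in range(n + 1))

p = Fraction(1, 6)  # probability of rolling a 5 with a fair die
for n in (12, 1200):
    direct = expected_value_by_definition(n, p)
    # Exact rational arithmetic, so the comparison with n*p is exact.
    print(n, direct, direct == n * p)
```

Because `Fraction` keeps every term exact, agreement with np here is an equality rather than a floating-point approximation.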
Having dealt with the concept of the mean, or expected value, of a random variable X,
we turn now to the variance of X, a measure of how widely the values determined by
X are dispersed or spread out. If X is the random variable defined on the sample space
$\mathcal{S}_X = \{a, b, c\}$, where X(a) = −1, X(b) = 0, X(c) = 1, and Pr(X = x) = 1/3, for x =
−1, 0, 1, then E(X) = 0. But then if Y is the random variable defined on the sample
space $\mathcal{S}_Y = \{r, s, t, u, v\}$, where Y(r) = −4, Y(s) = −2, Y(t) = 0, Y(u) = 2, Y(v) = 4,
and Pr(Y = y) = 1/5, for y = −4, −2, 0, 2, 4, we get the same mean; that is, E(Y) = 0.
However, although E(X) = E(Y), we can see that the values determined by Y are more
spread out about the mean of 0 than the values determined by X. To measure this notion of
dispersion we introduce the following.
Definition 3.15 Let $\mathcal{S}$ be the sample space for an experiment $\mathcal{E}$ and let X be a random variable defined on
the outcomes in $\mathcal{S}$. Suppose further that E(X) is the mean, or expected value, of X. Then
the variance of X, denoted $\sigma_X^2$, or Var(X), is defined by
\[
\sigma_X^2 = \operatorname{Var}(X) = E(X - E(X))^2 = \sum_{x} (x - E(X))^2 \cdot \Pr(X = x),
\]
where the sum is taken over all the values of x determined by the random variable X.
The standard deviation of X, denoted $\sigma_X$, is defined by
\[
\sigma_X = \sqrt{\operatorname{Var}(X)}.
\]
Now let us apply Definition 3.15 in the following.
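EXAMPLE 3.56 Let X be the random variable defined on the outcomes of the sample space $\mathcal{S} = \{a, b, c, d\}$,
with X(a) = 1, X(b) = 3, X(c) = 4, and X(d) = 6.
Suppose the probability distribution for X is

x    Pr(X = x)
1    1/5
3    2/5
4    1/5
6    1/5

Then
\[
E(X) = 1\cdot\frac{1}{5} + 3\cdot\frac{2}{5} + 4\cdot\frac{1}{5} + 6\cdot\frac{1}{5} = \frac{17}{5}
\]
and
\[
\begin{aligned}
\operatorname{Var}(X) = E(X - E(X))^2
&= \left(1 - \tfrac{17}{5}\right)^2\left(\tfrac{1}{5}\right) + \left(3 - \tfrac{17}{5}\right)^2\left(\tfrac{2}{5}\right) + \left(4 - \tfrac{17}{5}\right)^2\left(\tfrac{1}{5}\right) + \left(6 - \tfrac{17}{5}\right)^2\left(\tfrac{1}{5}\right) \\
&= \left(\tfrac{144}{25}\right)\left(\tfrac{1}{5}\right) + \left(\tfrac{4}{25}\right)\left(\tfrac{2}{5}\right) + \left(\tfrac{9}{25}\right)\left(\tfrac{1}{5}\right) + \left(\tfrac{169}{25}\right)\left(\tfrac{1}{5}\right) = \frac{330}{125} = \frac{66}{25},
\end{aligned}
\]
so
\[
\sigma_X = \sqrt{\operatorname{Var}(X)} = \sqrt{\frac{66}{25}} = \frac{1}{5}\sqrt{66} \approx 1.624808.
\]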
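Our next result provides a second way by which we can compute Var(X).
THEOREM 3.12 If X is a random variable defined on the outcomes of a sample space
$\mathcal{S}$, then
\[
\operatorname{Var}(X) = E(X^2) - [E(X)]^2.
\]
Proof: From Definition 3.15 we know that
\[
\operatorname{Var}(X) = E(X - E(X))^2 = \sum_{x} (x - E(X))^2 \cdot \Pr(X = x).
\]
Expanding within the summation we have
\[
\begin{aligned}
\operatorname{Var}(X) &= \sum_{x} \left(x^2 - 2xE(X) + [E(X)]^2\right) \cdot \Pr(X = x) \\
&= \sum_{x} x^2 \Pr(X = x) - 2E(X) \sum_{x} x \cdot \Pr(X = x) + [E(X)]^2 \sum_{x} \Pr(X = x), \quad \text{because } E(X) \text{ is a constant} \\
&= E(X^2) - 2E(X)E(X) + [E(X)]^2, \quad \text{because } \sum_{x} x \cdot \Pr(X = x) = E(X) \text{ and } \sum_{x} \Pr(X = x) = 1 \\
&= E(X^2) - [E(X)]^2.
\end{aligned}
\]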
Let us check the result for Var(X) in Example 3.56 by using Theorem 3.12.
EXAMPLE 3.57 The information in Example 3.56 provides the following:

x    x²    Pr(X = x)
1    1     1/5
3    9     2/5
4    16    1/5
6    36    1/5

So $E(X^2) = \sum_{x} x^2 \cdot \Pr(X = x) = (1)\left(\frac{1}{5}\right) + (9)\left(\frac{2}{5}\right) + (16)\left(\frac{1}{5}\right) + (36)\left(\frac{1}{5}\right) = \frac{71}{5}$.
Earlier in Example 3.56 we learned that E(X) = 17/5. Consequently, from Theorem
3.12 we have $\operatorname{Var}(X) = E(X^2) - [E(X)]^2 = \frac{71}{5} - \left(\frac{17}{5}\right)^2 = \left(\frac{1}{25}\right)(355 - 289) = \frac{66}{25}$, as we
found earlier.
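The agreement between Definition 3.15 and Theorem 3.12 can be confirmed mechanically. The following Python sketch is an added illustration with hypothetical helper names, not part of the original text. It computes the variance of the distribution in Example 3.56 both ways, using exact fractions.

```python
from fractions import Fraction
from math import sqrt

# Probability distribution from Example 3.56: value x -> Pr(X = x)
dist = {1: Fraction(1, 5), 3: Fraction(2, 5), 4: Fraction(1, 5), 6: Fraction(1, 5)}

def mean(dist):
    """E(X): the sum of x * Pr(X = x)."""
    return sum(x * pr for x, pr in dist.items())

def variance_by_definition(dist):
    """Var(X) as the sum of (x - E(X))^2 * Pr(X = x) (Definition 3.15)."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * pr for x, pr in dist.items())

def variance_by_shortcut(dist):
    """Var(X) as E(X^2) - [E(X)]^2 (Theorem 3.12)."""
    second_moment = sum(x * x * pr for x, pr in dist.items())
    return second_moment - mean(dist) ** 2

print(mean(dist))                          # 17/5
print(variance_by_definition(dist))        # 66/25
print(variance_by_shortcut(dist))          # 66/25
print(sqrt(variance_by_definition(dist)))  # standard deviation, about 1.6248
```

The shortcut formula needs only one pass over the distribution once E(X) is known, which is why it is usually the more convenient of the two for hand calculation.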
We'll use the formula of Theorem 3.12 a second time in the following.
EXAMPLE 3.58 In Example 3.53 we studied the random variable X, which counted the number of heads
that result when a fair coin is tossed four times. Soon thereafter we learned that X was
a binomial random variable with n = 4, p = 1/2, q = 1 − p = 1/2, and
$\Pr(X = x) = \binom{4}{x}\left(\frac{1}{2}\right)^{x}\left(\frac{1}{2}\right)^{4-x}$, x = 0, 1, 2, 3, 4. Further, in part (d) of Example 3.55 we found that
E(X) = 2 (= np, as we learned later in Theorem 3.11). To compute Var(X) we use the
formula in Theorem 3.12, but first we consider the following.

x    x²    Pr(X = x)
0    0     1/16
1    1     4/16
2    4     6/16
3    9     4/16
4    16    1/16

Using these results we find that
\[
E(X^2) = \sum_{x=0}^{4} x^2 \cdot \Pr(X = x) = 0\cdot\frac{1}{16} + 1\cdot\frac{4}{16} + 4\cdot\frac{6}{16} + 9\cdot\frac{4}{16} + 16\cdot\frac{1}{16} = \frac{80}{16} = 5.
\]
So $\operatorname{Var}(X) = E(X^2) - [E(X)]^2 = 5 - (2)^2 = 1 = 4\left(\frac{1}{2}\right)\left(\frac{1}{2}\right) = npq$,
a result that is true in general. Further, for this random variable, the standard deviation
$\sigma_X = 1$.
As we mentioned, the preceding example contains an instance of a more general result.
We state that result now in our next theorem and outline a proof for this theorem in the
Section Exercises.
THEOREM 3.13 Let X be the binomial random variable that counts the number of successes, each with
probability p, among n independent Bernoulli trials. Then Var(X) = npq and $\sigma_X = \sqrt{npq}$,
where q = 1 − p.
As a result of Theorems 3.11 and 3.13 we now find that our next example requires little
calculation.
EXAMPLE 3.59 Due to top-notch recruiting, Coach Jenkins’ baseball team has probability 0.85 of winning
each of the 12 baseball games it will play during the spring semester. (Here the outcome of
each game is independent of the outcome of any previous game.)
Let X be the random variable that counts the number of games Coach Jenkins’
team wins during the spring semester. Then $\Pr(X = x) = \binom{12}{x}(0.85)^{x}(0.15)^{12-x}$,
x = 0, 1, 2, ..., 12. Further, with n = 12 and p = 0.85, we readily see that
\[
E(X) = \sum_{x=0}^{12} x \binom{12}{x}(0.85)^{x}(0.15)^{12-x} = np = 12(0.85) = 10.2
\]
and
\[
\operatorname{Var}(X) = \sum_{x=0}^{12} (x - 10.2)^2 \binom{12}{x}(0.85)^{x}(0.15)^{12-x} = \sum_{x=0}^{12} x^2 \binom{12}{x}(0.85)^{x}(0.15)^{12-x} - (10.2)^2 = npq = (12)(0.85)(0.15) = 1.53.
\]
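For readers who want to see the three sums in Example 3.59 actually evaluated, the sketch below does so in Python. It is an added illustration, not part of the original text, and the variable names are assumptions. It writes 0.85 as the exact fraction 85/100 so that the comparison with np and npq is free of rounding error.

```python
from fractions import Fraction
from math import comb

n = 12
p = Fraction(85, 100)  # probability of winning a single game
q = 1 - p

# pmf[x] = Pr(X = x) for the binomial random variable X
pmf = [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]

mean_x = sum(x * pr for x, pr in enumerate(pmf))
var_by_definition = sum((x - mean_x) ** 2 * pr for x, pr in enumerate(pmf))
var_by_shortcut = sum(x * x * pr for x, pr in enumerate(pmf)) - mean_x**2

print(float(mean_x), mean_x == n * p)                      # compare with np
print(float(var_by_definition), var_by_definition == n * p * q)
print(var_by_definition == var_by_shortcut)
```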
A word of warning! The preceding example shows how easy it is to compute E(X)
and Var(X) for a binomial random variable X, once we know the values of n and p. But
remember, the formulas in Theorems 3.11 and 3.13 are valid only when the random variable
X is binomial.
Before we introduce the last idea for this section we shall consider an example in order
to motivate and illustrate the idea.
EXAMPLE 3.60 Referring back to Example 3.59, at this point we want to determine a lower bound for
the probability that the random variable X is within k standard deviations $\sigma_X$ of the mean
E(X), for k = 2, 3. When k = 2 we find that $\Pr(E(X) - 2\sigma_X \le X \le E(X) + 2\sigma_X) = \Pr(|X - E(X)| \le 2\sigma_X)$.
From the calculations in Example 3.59 we know that E(X) =
10.2 and Var(X) = 1.53, so $\sigma_X = \sqrt{1.53} \approx 1.236932$. Consequently,
\[
\begin{aligned}
\Pr(|X - E(X)| \le 2\sigma_X) &= \Pr(10.2 - 2(1.236932) \le X \le 10.2 + 2(1.236932)) = \Pr(7.726136 \le X \le 12.673864) \\
&= \Pr(X = 8) + \Pr(X = 9) + \cdots + \Pr(X = 12) = \sum_{x=8}^{12} \binom{12}{x}(0.85)^{x}(0.15)^{12-x} \\
&= 0.068284 + 0.171976 + 0.292358 + 0.301218 + 0.142242 = 0.976078.
\end{aligned}
\]
Likewise, for k = 3,
\[
\Pr(|X - E(X)| \le 3\sigma_X) = \Pr(6.489204 \le X \le 13.910796) = \Pr(X = 7) + \Pr(X = 8) + \cdots + \Pr(X = 12) = 0.019280 + 0.068284 + \cdots + 0.142242 = 0.995358.
\]
But where is this lower bound that we mentioned at the start of our discussion? Looking
at the results for k = 2, 3 once more, we see that $\Pr(|X - E(X)| \le 2\sigma_X) = 0.976078 > \frac{3}{4} = 1 - \frac{1}{4}$
and $\Pr(|X - E(X)| \le 3\sigma_X) = 0.995358 > \frac{8}{9} = 1 - \frac{1}{9}$. So
\[
\Pr(|X - E(X)| \le k\sigma_X) \ge 1 - \frac{1}{k^2}, \qquad \text{for } k = 2, 3.
\]
Further, although this lower bound is on the crude side, our next result will show that it is
true for any positive real number k. In addition, the result is true for any random variable
X, not just a binomial random variable like the one we have used here.
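The comparison in Example 3.60 can be reproduced with a few lines of Python. This sketch is an added illustration rather than part of the original text, and `prob_within` is a hypothetical helper name. It sums the binomial probabilities of the integer values lying within k standard deviations of the mean and sets the result beside the bound 1 − 1/k².

```python
from math import comb, sqrt

n, p = 12, 0.85
q = 1 - p
mean_x = n * p             # E(X) = np = 10.2
sigma = sqrt(n * p * q)    # standard deviation, sqrt(1.53)

def pmf(x):
    """Pr(X = x) for the binomial random variable of Example 3.59."""
    return comb(n, x) * p**x * q**(n - x)

def prob_within(k):
    """Pr(|X - E(X)| <= k*sigma), summed over the values 0..n that X can take."""
    return sum(pmf(x) for x in range(n + 1) if abs(x - mean_x) <= k * sigma)

for k in (2, 3):
    exact = prob_within(k)
    bound = 1 - 1 / k**2
    print(k, round(exact, 6), round(bound, 6), exact >= bound)
```

The exact probabilities sit well above the bound, which illustrates the remark that the bound is on the crude side.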
THEOREM 3.14 Chebyshev’s Inequality. Let $\mathcal{S}$ be the sample space for an experiment $\mathcal{E}$ and let X be a
random variable defined on the outcomes in $\mathcal{S}$. If E(X) is the mean of X and $\sigma_X$ its
standard deviation, then for any k > 0,
\[
\Pr(E(X) - k\sigma_X \le X \le E(X) + k\sigma_X) = \Pr(|X - E(X)| \le k\sigma_X) \ge 1 - \frac{1}{k^2}.
\]
[Here, as in Example 3.60, X accounts for those x values where x = X(s) for some $s \in \mathcal{S}$
and $|x - E(X)| \le k\sigma_X$.]
Proof: The proof presented here is for X discrete.† However, the result is also true for
continuous random variables.
†The proof presented here is valid for the case where the sample space is countably infinite, so long as all the
summations converge.
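Let A, B be the following subsets of R.
\[
A = \{x \mid |x - E(X)| > k\sigma_X\} \qquad B = \{x \mid |x - E(X)| \le k\sigma_X\}
\]
(Note that A, B are not necessarily events for they need not be subsets of $\mathcal{S}$. They are
subsets of the set of real numbers determined by the random variable X.)
We know that
\[
\begin{aligned}
\operatorname{Var}(X) = \sigma_X^2 &= \sum_{x} (x - E(X))^2 \Pr(X = x) \\
&= \sum_{x \in A} (x - E(X))^2 \Pr(X = x) + \sum_{x \in B} (x - E(X))^2 \Pr(X = x) \\
&\ge \sum_{x \in A} (x - E(X))^2 \Pr(X = x), \quad \text{as } \sum_{x \in B} (x - E(X))^2 \Pr(X = x) \ge 0.
\end{aligned}
\]
For $x \in A$, $|x - E(X)| > k\sigma_X$ and so it follows that here $|x - E(X)|^2 > k^2\sigma_X^2$. Since
$(x - E(X))^2 = |x - E(X)|^2$ we now have
\[
\sigma_X^2 \ge \sum_{x \in A} |x - E(X)|^2 \Pr(X = x) \ge k^2\sigma_X^2 \sum_{x \in A} \Pr(X = x), \quad \text{and}
\]
\[
\begin{aligned}
\sigma_X^2 \ge k^2\sigma_X^2 \sum_{x \in A} \Pr(X = x) &\Rightarrow \sigma_X^2 \ge k^2\sigma_X^2 \Pr(|X - E(X)| > k\sigma_X) \\
&\Rightarrow \frac{1}{k^2} \ge \Pr(|X - E(X)| > k\sigma_X) \Rightarrow -\frac{1}{k^2} \le -\Pr(|X - E(X)| > k\sigma_X) \\
&\Rightarrow 1 - \frac{1}{k^2} \le 1 - \Pr(|X - E(X)| > k\sigma_X) \\
&\Rightarrow 1 - \frac{1}{k^2} \le \Pr(|X - E(X)| \le k\sigma_X).
\end{aligned}
\]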
Our last example for this section shows how one might apply Chebyshev’s Inequality.
EXAMPLE 3.61 Angelica is selling boxes of candy for her choir’s Christmas fund raiser. The pieces of candy
are packed into each box so that the mean number of pieces is 125 with a standard deviation
of 5 pieces. To find a lower bound on the probability that a box of Angelica’s candy contains
between 118 and 132 pieces we proceed as follows.
Here the random variable X counts the number of pieces of candy in a box, with E(X) =
125 and $\sigma_X = 5$. Applying Chebyshev’s Inequality we have
\[
\begin{aligned}
\Pr(118 \le X \le 132) &= \Pr(118 - 125 \le X - 125 \le 132 - 125) \\
&= \Pr(-7 \le X - 125 \le 7) = \Pr(|X - 125| \le 7) \\
&= \Pr\left(|X - E(X)| \le \left(\tfrac{7}{5}\right)\sigma_X\right) \ge 1 - \frac{1}{(7/5)^2} = 1 - \frac{25}{49} = \frac{24}{49}.
\end{aligned}
\]
Consequently, the probability that a box of Angelica’s candy contains between 118 and
132 pieces is at least 24/49 ≈ 0.489796. (Note here that the value of k in Chebyshev’s
Inequality is 7/5, which is not an integer.)
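The steps of Example 3.61 generalize to any interval that is symmetric about the mean. The Python sketch below is an added illustration, not part of the original text; `chebyshev_lower_bound` is a hypothetical name, and the symmetry check and the clamping at 0 are design choices made for this sketch. It turns the interval into a value of k and returns the bound 1 − 1/k².

```python
from fractions import Fraction

def chebyshev_lower_bound(mean, sigma, low, high):
    """Lower bound on Pr(low <= X <= high) from Chebyshev's Inequality.

    The interval must be symmetric about the mean, so that it can be
    written as |X - mean| <= k*sigma with k = (high - mean)/sigma.
    """
    if sigma <= 0 or high <= mean:
        raise ValueError("need sigma > 0 and high > mean")
    if high - mean != mean - low:
        raise ValueError("interval must be symmetric about the mean")
    k = Fraction(high - mean) / Fraction(sigma)
    # For k < 1 the expression 1 - 1/k^2 is negative, so 0 is the useful bound.
    return max(Fraction(0), 1 - 1 / k**2)

bound = chebyshev_lower_bound(125, 5, 118, 132)
print(bound)         # 24/49
print(float(bound))  # about 0.4898
```

Since the inequality uses only the mean and standard deviation, the same call applies whatever the distribution of X happens to be.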
EXERCISES 3.7

1. Let X be a random variable with the following probability distribution.

x          | 0    1    2    3    4
Pr(X = x)  | 1/8  1/4  1/4  1/4  1/8

Determine (a) Pr(X = 3); (b) Pr(X < 4); (c) Pr(X > 0); (d) Pr(1 < X < 3); (e) Pr(X = 2 | X < 3); and (f) Pr(X < 1 or X = 4).

2. The probability distribution for a random variable X is given by Pr(X = x) = (3x + 1)/22, x = 0, 1, 2, 3. Determine (a) Pr(X = 3); (b) Pr(X < 1); (c) Pr(1 < X < 3); (d) Pr(X > −2); and (e) Pr(X = 1 | X < 2).

3. A shipment of 120 graphics cards contains 10 that are defective. Serena selects five of these cards, without replacement, and inspects them to see which, if any, are defective. If the random variable X counts the number of defective graphics cards in Serena’s selection, determine (a) Pr(X = x), x = 0, 1, 2, ..., 5; (b) Pr(X = 4); (c) Pr(X > 4); and (d) Pr(X = 1 | X < 2).

4. Connie tosses a fair coin three times. If $X = X_1 - X_2$, where $X_1$ counts the number of heads that result and $X_2$ counts the number of tails that result, determine (a) the probability distributions for $X_1$, $X_2$, and X; and (b) the means $E(X_1)$, $E(X_2)$, and E(X).

5. Let X be the random variable where Pr(X = x) = 1/6 for x = 1, 2, 3, ..., 6. (Here X is a uniform discrete random variable.) Determine (a) Pr(X > 3); (b) Pr(2 < X < 5); (c) Pr(X = 4 | X > 3); (d) E(X); and (e) Var(X).

6. A computer dealer finds that the number of laptop computers her dealership sells each day is a random variable X where the probability distribution for X is given by
\[
\Pr(X = x) = \begin{cases} \dfrac{c x^2}{x!}, & x = 1, 2, 3, 4, 5 \\ 0, & \text{otherwise}, \end{cases}
\]
where c is a constant. Determine (a) the value of c; (b) Pr(X > 3); (c) Pr(X = 4 | X > 3); (d) E(X); and (e) Var(X).

7. A random variable X has probability distribution given by
\[
\Pr(X = x) = \begin{cases} c(6 - x), & x = 1, 2, 3, 4, 5 \\ 0, & \text{otherwise}, \end{cases}
\]
where c is a constant. Determine (a) the value of c; (b) Pr(X < 2); (c) E(X); and (d) Var(X).

8. Wayne tosses an unfair coin, one that is biased so that a head is three times as likely to occur as a tail. How many heads should Wayne expect to see if he tosses the coin 100 times?

9. Suppose that X is a binomial random variable where $\Pr(X = x) = \binom{n}{x} p^x (1 - p)^{n-x}$, x = 0, 1, 2, ..., n. If E(X) = 70 and Var(X) = 45.5, determine n, p.

10. A carnival game invites a player to select one card from a standard deck of 52 cards. If the card is a seven or a Jack the player is given five dollars. For a king or an ace the player is given eight dollars. The other 36 cards result in the player losing. How much should one be willing to pay to play this game so that it is fair, that is, so that the expected value of the player’s net winnings is 0?

11. The route that Jackie follows to school each day includes eight stoplights. When she reaches each stoplight, the probability that the stoplight is red is 0.25 and it is assumed that the stoplights are spaced far enough apart so as to operate independently. If the random variable X counts the number of red stoplights Jackie encounters one particular day on her ride to school, determine (a) Pr(X = 0); (b) Pr(X = 3); (c) Pr(X > 6); (d) Pr(X > 6 | X > 4); (e) E(X); and (f) Var(X).

12. Suppose that a random variable X has mean E(X) = 17 and variance Var(X) = 9, but its probability distribution is unknown. Use Chebyshev’s Inequality to estimate a lower bound for (a) Pr(11 < X < 23); (b) Pr(10 < X < 24); and (c) Pr(8 < X < 26).

13. Suppose that a random variable X has mean E(X) = 15 and variance Var(X) = 4, but its probability distribution is unknown. Use Chebyshev’s Inequality to find the value of the constant c where Pr(|X − 15| < c) > 0.96.

14. Fred rolls a fair die 20 times. If X is the random variable that counts the number of 6’s that come up during the 20 rolls, determine E(X) and Var(X).

15. A carton contains 20 computer chips, four of which are defective. Isaac tests these chips, one at a time and without replacement, until he either finds a defective chip or has tested three chips. If the random variable X counts the number of chips Isaac tests, find (a) the probability distribution for X; (b) Pr(X < 2); (c) Pr(X = 1 | X < 2); (d) E(X); and (e) Var(X).

16. Suppose that X is a random variable defined on a sample space $\mathcal{S}$ and that a, b are constants. Show that (a) E(aX + b) = aE(X) + b and (b) Var(aX + b) = a²Var(X).

17. Let X be a binomial random variable with $\Pr(X = x) = \binom{n}{x} p^x q^{n-x}$, x = 0, 1, 2, ..., n, where n (> 2) is the number of Bernoulli trials, p is the probability of success for each trial, and q = 1 − p.
a) Show that $E(X(X - 1)) = n^2p^2 - np^2$.
b) Using the fact that $E(X(X - 1)) = E(X^2 - X) = E(X^2) - E(X)$ and that E(X) = np, show that Var(X) = npq.

18. In alpha testing a new software package, a software engineer finds that the number of defects per 100 lines of code is a random variable X with probability distribution:

x          | 1    2    3    4
Pr(X = x)  | 0.4  0.3  0.2  0.1

Find (a) Pr(X > 1); (b) Pr(X = 3 | X > 2); (c) E(X); and (d) Var(X).

19. In Mario Puzo’s novel The Godfather, at the wedding reception for his daughter Constanzia, Don Vito Corleone discusses with his godson Johnny Fontane how he will deal with the movie mogul Jack Woltz. And in this context he speaks the famous line

“I’ll make him an offer he can’t refuse.”

If we let the random variable X count the number of letters and apostrophes in a randomly selected word (from the above quotation) and we assume that each of the eight words has the same probability of being selected, determine (a) the probability distribution for X; (b) E(X); and (c) Var(X).

20. An assembly comprises three electrical components that operate independently. The probabilities that these components function according to specifications are 0.95, 0.9, and 0.88. If the random variable X counts the number of components that function according to specifications, determine (a) the probability distribution for X; (b) Pr(X > 2 | X > 1); (c) E(X); and (d) Var(X).

21. An urn contains five chips numbered 1, 2, 3, 4, and 5. When two chips are drawn (without replacement) from the urn, the random variable X records the higher value. Find E(X) and $\sigma_X$.
3.8
Summary and Historical Review
In this chapter we introduced some of the fundamentals of set theory, together with certain
relationships to enumeration problems and probability theory.
The algebra of set theory evolved during the nineteenth and early twentieth centuries.
In England, George Peacock (1791—1858) was a pioneer in mathematical reforms and was
among the first, in his Treatise on Algebra, to revolutionize the entire conception of algebra
and arithmetic. His ideas were further developed by Duncan Gregory (1813-1844), William
Rowan Hamilton (1805-1865), and Augustus DeMorgan (1806-1871), who attempted to
remove ambiguity from elementary algebra and cast it in the strict postulational form.
Not until 1854, however, when Boole published his Investigation of the Laws of Thought,
was an algebra dealing with sets and logic formalized and the work of Peacock and his
contemporaries extended.
The presentation here is primarily concerned with finite sets. However, the investigation
of infinite sets and their cardinalities has occupied the minds of many mathematicians and
philosophers. (More about this can be found in Appendix 3. However, the reader may
want to learn more about functions —as presented in Chapter 5 — before looking into the
material in this appendix.) The intuitive approach to set theory was taken until the time of
the Russian-born mathematician Georg Cantor (1845-1918), who defined a set, in 1895,
in a way comparable to the “gut feeling” we mentioned at the start of Section 3.1. His
definition, however, was one of the obstacles he was never able to entirely remove from his
theory of sets.
In the 1870s, when Cantor was researching trigonometric series and series of real num-
bers, he needed a device to compare the sizes of infinite sets of numbers. His treatment of
the infinite as an actuality, on the same level as the finite, was quite revolutionary. Some of
his work was rejected because it proved to be much more abstract than what many mathe-
maticians of his time were accustomed to. However, his work won wide enough acceptance
so that by 1890 the theory of sets, both finite and infinite, was considered a branch of
mathematics in its own right.
By the turn of the century the theory was widely accepted, but in 1901 the paradox
now known as Russell’s paradox (which was discussed in Exercise 27 of Section 3.1)
showed that set theory, as originally proposed, was internally inconsistent. The difficulty
seemed to be in the unrestricted way in which sets could be defined; the idea of a set’s being a
member of itself was considered particularly suspect. In their work Principia Mathematica,
the British mathematicians Lord Bertrand Arthur William Russell (1872-1970) and Alfred
North Whitehead (1861-1947) developed a hierarchy in the theory of sets known as the
theory of types. This axiomatic set theory, among other twentieth-century formulations,
avoided the Russell paradox. In addition to his work in mathematics, Lord Russell wrote
books dealing with philosophy, physics, and his political views. His remarkable literary
talent was recognized in 1950 when he was awarded the Nobel prize for literature.
Georg Cantor (1845-1918)
Reproduced courtesy of The Granger Collection, New York
Lord Bertrand Arthur William Russell (1872-1970)
The discovery of Russell’s paradox, even though it could be remedied, had a profound
impact on the mathematical community, for many began to wonder if other contradictions
were still lurking. Then in 1931 the Austrian-born mathematician (and logician)
Kurt Gödel (1906-1978) formulated that “under a specified consistency condition, any
sufficiently strong formal axiomatic system must contain a proposition such that neither it
nor its negation is provable and that any consistency proof for the system must use ideas and
methods beyond those of the system itself.” And unfortunately, from this we learn that we
cannot establish, in a mathematically rigorous manner, that there are no contradictions
in mathematics. Yet despite “Gödel’s proof,” mathematical research continues on, in fact,
to the point where the amount of research since 1931 has surpassed that in any other period
in history.
The use of the set membership symbol ∈ (a stylized form of the Greek letter epsilon)
was introduced in 1889 by the Italian mathematician Giuseppe Peano (1858-1932). The
symbol “∈” is an abbreviation for the Greek word “ἐστί” meaning “is.”
The Venn diagrams of Section 3.2 were introduced by the English logician John Venn
(1834-1923) in 1881. In his book Symbolic Logic, Venn clarified ideas previously devel-
oped by his countryman George Boole (1815-1864). Furthermore, Venn contributed to the
development of probability theory — as described in the widely read textbook he wrote on
this subject. The Gray code, which we used in Section 3.1 to store the subsets of a finite set
as binary strings, was developed in the 1940s by Frank Gray at the AT&T Bell Laborato-
ries. Originally, such codes were used to minimize the effect of errors in the transmission
of digital signals.
If we wish to summarize the importance of the role of set theory in the development of
twentieth-century mathematics, the following quote attributed to the German mathematician
David Hilbert (1862-1943) is worth pondering: “No one shall expel us from the paradise
which Cantor has created for us.”
In Section 3.1 we mentioned the array of numbers known as Pascal’s triangle. We could
have introduced this array in Chapter 1 with the binomial theorem, but we waited until we
had some combinatorial identities that we needed to verify how the triangle is constructed.
The array appears in the work of the Chinese algebraist Chu Shi-kie (1303), but its first
appearance in Europe was not until the sixteenth century, on the title page of a book by
Petrus Apianus (1495-1552). Niccolo Tartaglia (1499-1559) used the triangle in computing
powers of (x + y). Because of his work on the properties and applications of this triangle,
the array has been named in honor of the French mathematician Blaise Pascal (1623-1662).
Although probability theory originated with games of chance and enumeration problems,
we included it here because set theory has evolved as the exact medium needed to state
and solve problems in this important contemporary area of applied mathematics. In the
decade following 1660, probability entered European thought as a way of understanding
stable frequencies in random processes. Ideas, which exemplify this consideration, were put
forth by Blaise Pascal, and these led to the first systematic treatise on probability, written in
1657 by Christian Huygens (1629-1695). In 1812 Pierre-Simon de Laplace (1749-1827)
collected all the ideas developed on probability theory at that time, starting with the
definition in which each individual outcome is equally likely, and published them in his
Analytic Theory of Probability. Among other ideas, this text includes the Central Limit
Theorem, a fundamental result at the heart of hypothesis testing (in statistics). Along
with Pierre-Simon de Laplace, Thomas Bayes (1702-1761) also showed how to determine
probabilities by examining certain empirical data. Bayes’ Theorem honors the name of this
English Presbyterian minister and mathematician. Chebyshev’s Inequality (of Section 3.7)
is named for the Russian mathematician Pafnuty Lvovich Chebyshev (1821-1894), who
may be better remembered for his work in number theory and interest in mechanics. Finally,
the axiomatic approach to probability was first given in 1933 by the Russian mathemati-
cian Andrei Nikolayevich Kolmogorov (1903-1987) in his monograph Grundbegriffe der
Wahrscheinlichkeitsrechnung (Foundations of the Theory of Probability).
More on the history and development of set theory can be found in Chapter 26 of
C. B. Boyer [1]. Formal developments of set theory, including results on infinite sets, can
be found in H. B. Enderton [3], P. R. Halmos [4], J. M. Henle [5], and P. C. Suppes [8]. An
interesting history of the origins of probability and statistical ideas, up to the Newtonian
era, can be found in F. N. David [2]. A more contemporary coverage is given in the text
by V. J. Katz [6]. Chapters 1 and 2 of J. J. Kinney [7] are an excellent source for those
interested in learning more about discrete probability.
Andrei Nikolayevich Kolmogorov (1903-1987) Thomas Bayes (1702-1761)
REFERENCES
1. Boyer, Carl B. History of Mathematics. New York: Wiley, 1968.
2. David, Florence Nightingale. Games, Gods, and Gambling. New York: Hafner, 1962.
3. Enderton, Herbert B. Elements of Set Theory. New York: Academic Press, 1977.
4. Halmos, Paul R. Naive Set Theory. New York: Van Nostrand, 1960.
5. Henle, James M. An Outline of Set Theory. New York: Springer-Verlag, 1986.
6. Katz, Victor J. A History of Mathematics (An Introduction). New York: Harper Collins, 1993.
7. Kinney, John J. Probability: An Introduction with Statistical Applications. New York: Wiley, 1997.
8. Suppes, Patrick C. Axiomatic Set Theory. New York: Van Nostrand, 1960.
a) A-C=B-C>A=B
SUPPLEMENTARY EXERCISES b) (ANC =BNC)A(A-C=B-C)|SA=B
©) (AUC =BUC)A(A—-C=B-C)|S4=B
1. Let A, B, C CU. Prove that (A — B) CC if and only if oo, ; ;
(A—C)CB. 4. a) For positive integers m, n, r, withr < min{m, n}, show
; that
ER -alaeay — CPOOMALOC
2. Give a combinatorial argument to show that for integers
r
1
3. Let A, B, CCU. Prove or disprove (with a counter- eee
example) each of the following:
b) For 7 a positive integer, show that rows of the table for which this is ttue—rows }, 2, and 4, as
indicated by the arrows. For these rows, the columns for B and
O)-E()
n = k
AU B are exactly the same, so this membership table shows
that ACB>AUB=B.
5. a) In how many ways can a teacher divide a group of seven
students into two teams each containing at least one stu- Table 3.7
dent? two students? A|B|AUB
b) Answer part (a) upon replacing seven with a positive
integer n > 4. > | 0 0 0
> 0 1 1
6. Determine whether each of the following statements is true
j 0 1
or false. For each false statement, give a counterexample.
> \ | 1
a) If A and B are infinite sets, then AQ B is infinite.
b) If B is infinite and A C B, then A is infinite. Use membership tables to verify each of the following:
c) If A C B with B finite, then A is finite. a) AC BS>ANB=A
d) If A C B with A finite, then B is finite. b) (AN B=A)A(BUCH=C)JSBAUBUCHC
7, Aset A has 128 subsets of even cardinality. (a) How many ec) COBCAS(ANB)U(BNC)=ANC
subsets of A have odd cardinality? (b) What is | A|?
dMd@AAB=CSBAAC=BandBACH=A
8. LetA = {1, 2, 3,..., 15}. 14. State the dual of each theorem in Exercise 13. (Here you
a) How many subsets of A contain all of the odd integers will want to use the result of Example 3.19 in conjunction with
in A? Theorem 3.5.)
b) How many subsets of A contain exactly three odd 15. a) Determine the number of linear arrangements of m 1’s
integers? and r 0’s with no adjacent 1’s. (State any needed condi-
c) How many eight-element subsets of A contain exactly tion(s) for m, r.)
three odd integers? b) If%U = {1, 2,3,..., 2}, how many sets A C U are such
d) Write a computer program (or develop an algorithm) to that |A| = k with A containing no consecutive integers?
generate a random eight-element subset of A and have it [State any needed condition(s) for n, k.]
print out how many of the eight elements are odd. 16. If the letters in the word BOOLEAN are arranged at ran-
9. Let A, B, C CU. Prove that dom, what is the probability that the two O’s remain together in
the arrangement?
(AN B)UC=AN(B
UC) if and only if C CA.
17. At a high school science fair, 34 students received awards
10. Let U be a given universe with A, B CU, |AN B| =3, for scientific projects. Fourteen awards were given for projects
|A U B| = 8, and | | = 12. in biology, 13 in chemistry, and 21 in physics. If three students
a) How many subsets CCU satisfy ANBCCC received awards in all three subject areas, how many received
AU B? How many of these subsets C contain an even awards for exactly (a) one subject area? (b) two subject areas?
number of elements? 18. Fifty students, each with 75¢, visited the arcade of Example
b) How many subsets DC UW satisfy AUBCDC 3.27. Seventeen of the students played each of the three com-
A U B? How many of these subsets D contain an even puter games, and 37 of them played at least two of them. No
number of elements? student played any other game at the arcade, nor did any student
11, Let% = Rand let the index set / = Q*. Foreachg € Qt, play a given game more than once. Each game costs 25¢ to play,
let A, = [0, 2g] and B, = (0, 3q). Determine and the total proceeds from the student visit were $24.25. How
many of these students preferred to watch and played none of
a) Aq b) Ay A By
the games?
ec) UA,
gél
d) MB,gél
19. In how many ways can 15 laboratory assistants be assigned
to work on one, two, or three different experiments so that each
12. For a universe U and sets A, B CU, prove that
experiment has at least one person spending some time on it?
a) A AB=BAA b)AAA=%U
20. Professor Diane gave her chemistry class a test consisting
ec) AAU=A of three questions. There are 21 students in her class, and ev-
d) A A 4= A, so is the identity for A, as well as for U ery student answered at least one question. Five students did
13. Consider the membership table (Table 3.7). If we are given not answer the first question, seven failed to answer the second
the condition that A € B, then we need consider only those question, and six did not answer the third question. If nine stu-
dents answered all three questions, how many answered exactly the plane to land safely, all three landing gears (the nose and
one question? both wing landing gears) must have at least one good tire. What
21. Let U be a given universe with A, B CU, ANB=4, is the probability that the jet will be able to land safely even on
|A| = 12, and |B| = 10. If seven elements are selected from a hard landing?
AUB, what is the probability the selection contains four 32. Let & be the sample space for an experiment © and let
elements from A and three from B? A, B be events — that is, A, B CY. Prove that Pr(A NM B) >
22. For a finite set A of integers, let o(A) denote the sum of Pr(A)+ Pr(B)— 1. (This result is known as Bonferroni's
the elements of A. Then if Ul is a finite universe taken from Inequality.)
Z*, Dacwyyo (A) denotes the sum of all elements of all sub- 33. The exit door at the end of a hallway is open half of the time.
sets of U. Determine L4cgmqya (A) for On a table by the entrance to this hallway is a box containing 10
a) U = {1, 2, 3} b) U = {1, 2, 3, 4} keys, but only one of these keys opens the exit door at the end
of the hallway. Upon entering the hallway Marlo selects two of
c) UW = {1, 2, 3,4, 5} d) U={1,2,3,...,7} the keys from the box. What is the probability she will be able
e) U = {a), do, a3, ..., a,}, where to leave the hallway via the exit door, without returning to the
S=a+4,+4,+°-++4, box for more keys?
23. a) In chess, the king can move one position in any direc- 34, Dustin tosses a fair coin eight times. Given that his first and
tion. Assuming that the king is moved only in a forward last outcomes are the same, what is the probability he tossed
manner (one position up, to the right, or diagonally north- five heads and three tails?
east), along how many different paths can a king be moved
35. The probability Coach Sears’ basketball team wins any
from the lower-left corner position to the upper-right corner
given game is 0.8, regardless of any prior win or loss. If her
position on the standard 8 X 8 chessboard?
team plays five games, what is the probability it wins more
b) For the paths in part (a), what is the probability that a games than it loses?
path contains (1) exactly two diagonal moves? (ii) exactly
36. Suppose that the number of boxes of cereal packaged each
two diagonal moves that are consecutive? (ili) aneven num-
day at a certain packaging plant is a random variable — call it
ber of diagonal moves?
X — with E(X) = 20,000 boxes and Var(X) = 40,000 boxes’.
24, Let A, BCR, where A = {x|x? — 7x = —12} and B = Use Chebyshev’s Inequality to find a lower bound on the prob-
{x|x? — x = 6}. Determine A U B and AN B. ability that the plant will package between 19,000 and 21,000
25. Let A, BCR, where A = {x|x? —7x < —12} and B= boxes of cereal on a particular day.
{x|x* — x < 6}. Determine A U B and AN B. 37. Find the probability of getting one head (exactly) two times
26. Four torpedoes, whose probabilities of destroying an en- when three fair coins are tossed four times.
emy ship are 0.75, 0.80, 0.85, and 0.90, are fired at such a 38. Devon has a bag containing 22 poker chips — eight red,
vessel. Assuming the torpedoes operate independently, what is eight white, and six blue. Aileen reaches in and withdraws
the probability the enemy ship is destroyed? three of the chips, without replacement. Find the probability
27. Travis tosses a fair coin twice. Then he tosses a biased coin, that Aileen has selected (a) no blue chips; (b) one chip of each
one where the probability of a head is 3/4, four times. What is color; or (c) at least two red chips.
the probability Travis’s six tosses result in five heads and one 39, Let X be a random variable with probability distribution
tail?
28. Let ¥ be the sample space for an experiment ©, with events c(x2 +4), x =0,1,2,3,4
Pr(X =x) =
A, B CY. Prove that ; otherwise,
Pr(A)+ Pr(B)-1
Pr(A|B) = Pr(B) where c is a constant. Determine (a) the value of c;
(b) Pr(X > 1); (c) Pr(X =3|X > 2); (d) E(X); and
29. Let A, B, C be independent events taken from a sample (e) Var(X).
space *. Prove that the events A and B U C are independent. 40. Adozen urns each contain four red marbles and seven green
30. What is the minimum number of times we must toss a fair ones. (All 132 marbles are of the same size.) If a dozen students
coin so that the probability that we get at least two heads is at each select a different urn and then draw (with replacement)
least 0.95? five marbles, what is the probability that at least one student
31. Alarge jet aircraft has two wheels per landing gear for added draws at least one red marble?
safety. The tires are rated so that even with a “hard landing” the 41. Maureen draws five cards from a standard deck: the 6 of di-
probability of any single tire blowing out is only 0.10. (a) What amonds, 7 of diamonds, 8 of diamonds, jack of hearts, and king
is the probability that a landing gear (with two tires) will survive of spades. She discards the jack and king and then draws two
even a hard landing with at least one good tire? (b) In order for cards from the remaining 47. What is the probability Maureen
192 Chapter 3 Set Theory
finishes with (a) a straight flush; (b) a flush (but not a straight 44. A fair die is rolled three times and the random variable X
flush); and (c) a Straight (but not a straight flush)? records the number of different outcomes that result. For exam-
42. Inthe game of pinochie the deck consists of 48 cards — two ple, if two 5’s and one 4 are rolled, then X records two differ-
each of the 9, 10, jack, queen, king, and ace for each of the four ent outcomes, Determine (a) the probability distribution for X,
suits. There are four players and each is dealt 12 cards. What is (b) E(X); and (c) Var(X).
the probability a given player is dealt four kings (one of each 45, When a coin is tossed three times, for the outcome HHT
suit), four queens (one of each suit), and four other cards none we say that two runs have occurred — namely, HH and T. Like-
of which is a king or queen? (Such a hand is referred to as a wise, for the outcome THT we find three runs: T, H, and T.
bare roundhouse.) (The notion of a run was first introduced in Example 1.41.)
43. A grab bag contains one chip with the number 1, two chips Now suppose a biased coin, with Pr(H) = 3/4, is tossed three
each with the number 2, three chips each with the number times and the random variable X counts the number of runs
3,..., and » chips each with the number n, where n € Z*. that result. Determine (a) the probability distribution for X;
All chips are of the same size, those numbered | to m are red, (b) E(X); and (c) ox.
and those numbered m+ 1! to ” are blue, where m € Z* and
m <n. If Casey draws one chip, what is the probability it is the
chip with 1 on it, given that the chip is red?
4 Properties of the Integers: Mathematical Induction
Having known about the integers since our first encounters with arithmetic, in this chapter
we examine a special property exhibited by the subset of positive integers. This property
will enable us to establish certain mathematical formulas and theorems by using a technique
called mathematical induction. This method of proof will play a key role in many of the
results we shall obtain in the later chapters of this text. Furthermore, this chapter will provide
us with an introduction to five sets of numbers that are very important in the study of discrete
mathematics and combinatorics — namely, the triangular numbers, the harmonic numbers,
the Fibonacci numbers, the Lucas numbers, and the Eulerian numbers.
When $x, y \in \mathbf{Z}$, we know that $x + y$, $xy$, $x - y \in \mathbf{Z}$. Thus we say that the set $\mathbf{Z}$ is
closed under (the binary operations of) addition, multiplication, and subtraction. Turning
to division, however, we find, for example, that $2, 3 \in \mathbf{Z}$ but that the rational number $\frac{2}{3}$ is
not a member of $\mathbf{Z}$. So the set $\mathbf{Z}$ of all integers is not closed under the binary operation
of nonzero division. To cope with this situation, we shall introduce a somewhat restricted
form of division for $\mathbf{Z}$ and shall concentrate on special elements of $\mathbf{Z}^+$ called primes. These
primes turn out to be the "building blocks" of the integers, and they provide our first example
of a representation theorem, in this case the Fundamental Theorem of Arithmetic.
4.1 The Well-Ordering Principle: Mathematical Induction
Given any two distinct integers x, y, we know that we must have either x < y or y < x.
However, this is also true if, instead of being integers, x and y are rational numbers or real
numbers. What makes Z special in this situation?
Suppose we try to express the subset $\mathbf{Z}^+$ of $\mathbf{Z}$, using the inequality symbols $>$ and $\ge$.
We find that we can define the set of positive elements of $\mathbf{Z}$ as
\[ \mathbf{Z}^+ = \{x \in \mathbf{Z} \mid x > 0\} = \{x \in \mathbf{Z} \mid x \ge 1\}. \]
When we try to do likewise for the rational and real numbers, however, we find that
\[ \mathbf{Q}^+ = \{x \in \mathbf{Q} \mid x > 0\} \quad \text{and} \quad \mathbf{R}^+ = \{x \in \mathbf{R} \mid x > 0\}, \]
but we cannot represent $\mathbf{Q}^+$ or $\mathbf{R}^+$ using $\ge$ as we did for $\mathbf{Z}^+$.
The set $\mathbf{Z}^+$ is different from the sets $\mathbf{Q}^+$ and $\mathbf{R}^+$ in that every nonempty subset $X$ of
$\mathbf{Z}^+$ contains an integer $a$ such that $a \le x$, for all $x \in X$; that is, $X$ contains a least (or
smallest) element. This is not so for either $\mathbf{Q}^+$ or $\mathbf{R}^+$. The sets themselves do not contain least
elements. There is no smallest positive rational number or smallest positive real number. If
$q$ is a positive rational number, then since $0 < q/2 < q$, we would have the smaller positive
rational number $q/2$.
These observations lead us to the following property of the set $\mathbf{Z}^+ \subset \mathbf{Z}$.
The Well-Ordering Principle: Every nonempty subset of $\mathbf{Z}^+$ contains a smallest
element. (We often express this by saying that $\mathbf{Z}^+$ is well ordered.)
This principle serves to distinguish $\mathbf{Z}^+$ from $\mathbf{Q}^+$ and $\mathbf{R}^+$. But does it lead anywhere that
is mathematically interesting or useful? The answer is a resounding “Yes!” It is the basis
of a proof technique known as mathematical induction. This technique will often help us to
prove a general mathematical statement involving positive integers when certain instances
of that statement suggest a general pattern.
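The contrast between $\mathbf{Z}^+$ and $\mathbf{Q}^+$ can be illustrated with a minimal Python sketch. The function names `least_element` and `smaller_positive_rational` are our own, chosen for illustration only, and a program can of course examine only finite subsets, so this is an illustration of the principle and not a proof of it.

```python
from fractions import Fraction

def least_element(subset):
    """Return the least element of a nonempty finite set of positive integers."""
    items = set(subset)
    if not items or any(not isinstance(x, int) or x < 1 for x in items):
        raise ValueError("expected a nonempty set of positive integers")
    return min(items)

def smaller_positive_rational(q):
    """Given a positive rational q, return q/2, which satisfies 0 < q/2 < q."""
    q = Fraction(q)
    if q <= 0:
        raise ValueError("expected a positive rational")
    return q / 2

print(least_element({42, 7, 19}))   # 7

# No positive rational can be the smallest one: halving always produces a smaller one.
q = Fraction(1, 3)
for _ in range(3):
    q = smaller_positive_rational(q)
print(q)                            # 1/24
```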
We now establish the basis for this induction technique.
THEOREM 4.1 The Principle of Mathematical Induction. Let $S(n)$ denote an open mathematical statement
(or set of such open statements) that involves one or more occurrences of the variable $n$,
which represents a positive integer.
a) If $S(1)$ is true; and
b) If whenever $S(k)$ is true (for some particular, but arbitrarily chosen, $k \in \mathbf{Z}^+$), then
$S(k + 1)$ is true;
then $S(n)$ is true for all $n \in \mathbf{Z}^+$.
Proof: Let $S(n)$ be such an open statement satisfying conditions (a) and (b), and let $F =
\{t \in \mathbf{Z}^+ \mid S(t) \text{ is false}\}$. We wish to prove that $F = \emptyset$, so to obtain a contradiction we assume
that $F \ne \emptyset$. Then by the Well-Ordering Principle, $F$ has a least element $m$. Since $S(1)$
is true, it follows that $m \ne 1$, so $m > 1$, and consequently $m - 1 \in \mathbf{Z}^+$. With $m - 1 \notin F$
(because $m$ is the least element of $F$), we have $S(m - 1)$ true. So by condition (b) it follows
that $S((m - 1) + 1) = S(m)$ is true, contradicting $m \in F$. This contradiction arose from the
assumption that $F \ne \emptyset$. Consequently, $F = \emptyset$.
We have now seen how the Well-Ordering Principle is used in the proof of the Principle of
Mathematical Induction. It is also true that the Principle of Mathematical Induction is useful
if one wants to prove the Well-Ordering Principle. However, we shall not concern ourselves
with that fact right now. In this section our major goal will center on understanding and
using the Principle of Mathematical Induction. (But in the exercises for Section 4.2 we shall
examine how the Principle of Mathematical Induction is used to prove the Well-Ordering
Principle.)
In the statement of Theorem 4.1 the condition in part (a) is referred to as the basis step,
while that in part (b) is called the inductive step.
The choice of 1 in the first condition of Theorem 4.1 is not mandatory. All that is needed
is for the open statement $S(n)$ to be true for some first element $n_0 \in \mathbf{Z}$ so that the induction
process has a starting place. We need the truth of $S(n_0)$ for our basis step. The integer $n_0$
could be 5 just as well as 1. It could even be zero or negative because the set $\mathbf{Z}^+$ in union
with $\{0\}$ or any finite set of negative integers is well ordered. (When we do an induction
proof and start with $n_0 < 0$, we are considering the set of all consecutive negative integers
$\ge n_0$ in union with $\{0\}$ and $\mathbf{Z}^+$.)
Under these circumstances, we may express the Principle of Mathematical Induction,
using quantifiers, as
\[ [S(n_0) \land [\forall k \ge n_0 \; [S(k) \Rightarrow S(k + 1)]]] \Rightarrow \forall n \ge n_0 \; S(n). \]
We may get a somewhat better understanding of why this method of proof is valid by
using our intuition in conjunction with the situation presented in Fig. 4.1.
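Figure 4.1 (not reproduced): (a) the first four dominos of an infinite row, at positions $n_0$, $n_0 + 1$, $n_0 + 2$, $n_0 + 3$; (b) the $k$th domino toppling onto the $(k + 1)$st; (c) the first domino, at $n_0$, being pushed to the right.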
In part (a) of the figure we see the first four of an infinite (ordered) arrangement of
dominos, each standing on end. The spacing between any two consecutive dominos is
always the same, and it is such that if any one domino (say the $k$th) is pushed over to
the right, then it will knock over the next ($(k + 1)$st) domino. This process is suggested
in Fig. 4.1(b). Our intuition leads us to feel that this process will continue, the $(k + 1)$st
domino toppling and knocking over (to the right) the $(k + 2)$nd domino, and so on. Part (c)
of the figure indicates how the truth of $S(n_0)$ provides the push (to the right) to the first
domino (at $n_0$). This provides the basis step and sets the process in motion. The truth of $S(k)$
forcing the truth of $S(k + 1)$ gives us the inductive step and continues the toppling process.
We then infer the fact that $S(n)$ is true for all $n \ge n_0$ as we imagine all the successive
dominos toppling (to the right).
We shall now demonstrate several results that call for the use of Theorem 4.1.
EXAMPLE 4.1
For all $n \in \mathbf{Z}^+$, $\sum_{i=1}^{n} i = 1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2}$.
Proof: For $n = 1$ the open statement
\[ S(n)\colon \sum_{i=1}^{n} i = 1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2} \]
becomes $S(1)\colon \sum_{i=1}^{1} i = 1 = (1)(1 + 1)/2$. So $S(1)$ is true and we have our basis step,
and a starting point from which to begin the induction. Assuming the result true for $n = k$
(for some $k \in \mathbf{Z}^+$), we want to establish our inductive step by showing how the truth of
$S(k)$ "forces" us to accept the truth of $S(k + 1)$. [The assumption of the truth of $S(k)$ is our
induction hypothesis.] To establish the truth of $S(k + 1)$, we need to show that
\[ \sum_{i=1}^{k+1} i = \frac{(k+1)(k+2)}{2}. \]
We proceed as follows.
\[ \sum_{i=1}^{k+1} i = 1 + 2 + \cdots + k + (k+1) = \left(\sum_{i=1}^{k} i\right) + (k+1) = \frac{k(k+1)}{2} + (k+1), \]
for we are assuming the truth of $S(k)$. But
\[ \frac{k(k+1)}{2} + (k+1) = \frac{k(k+1)}{2} + \frac{2(k+1)}{2} = \frac{(k+1)(k+2)}{2}, \]
establishing the inductive step [condition (b)] of the theorem.
Consequently, by the Principle of Mathematical Induction, $S(n)$ is true for all $n \in \mathbf{Z}^+$.
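As a spot check (not a substitute for the proof), the following Python sketch compares the term-by-term sum with the closed form and also tests the algebra used in the inductive step. The names `sum_first_n` and `closed_form` are our own and are not part of the text.

```python
def sum_first_n(n):
    """Add 1 + 2 + ... + n term by term."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

def closed_form(n):
    """Right-hand side of S(n): n(n + 1)/2."""
    return n * (n + 1) // 2

# Basis step S(1), then the passage from k to k + 1 for a range of k.
assert sum_first_n(1) == closed_form(1)
for k in range(1, 200):
    # Algebra of the inductive step: k(k+1)/2 + (k+1) = (k+1)(k+2)/2.
    assert closed_form(k) + (k + 1) == closed_form(k + 1)
    assert sum_first_n(k + 1) == closed_form(k + 1)
```

Checking finitely many cases can only support a conjecture; it is the induction argument that settles the statement for every positive integer.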
Now that we have obtained the summation formula for $\sum_{i=1}^{n} i$ in two ways (see Example 1.40),
we shall digress from our main topic and consider two examples that use this
summation formula.
EXAMPLE 4.2
A wheel of fortune has the numbers from 1 to 36 painted on it in a random manner. Show
that regardless of how the numbers are situated, there are three consecutive (on the wheel)
numbers whose total is 55 or more.
Let $x_1$ be any number on the wheel. Counting clockwise from $x_1$, label the other numbers
$x_2, x_3, \ldots, x_{36}$. For the result to be false, we must have $x_1 + x_2 + x_3 < 55$, $x_2 + x_3 + x_4 < 55$,
$\ldots$, $x_{34} + x_{35} + x_{36} < 55$, $x_{35} + x_{36} + x_1 < 55$, and $x_{36} + x_1 + x_2 < 55$. In these 36
inequalities, each of the terms $x_1, x_2, \ldots, x_{36}$ appears (exactly) three times, so each of the
integers 1, 2, ..., 36 appears (exactly) three times. Adding all 36 inequalities, we find that
$3 \sum_{i=1}^{36} x_i = 3 \sum_{i=1}^{36} i < 36(55) = 1980$. But $\sum_{i=1}^{36} i = (36)(37)/2 = 666$, and this gives
us the contradiction that $1998 = 3(666) < 1980$.
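The averaging idea behind this argument can be tried out on random arrangements. The sketch below is our own illustration; the helper names and the fixed seed are assumptions made only so the run is repeatable. The 36 circular triple sums always average $3(666)/36 = 55.5$, so the largest of them cannot be below 55.

```python
import random

def max_triple_sum(wheel):
    """Largest sum of three consecutive entries on a circular arrangement."""
    n = len(wheel)
    return max(wheel[i] + wheel[(i + 1) % n] + wheel[(i + 2) % n] for i in range(n))

def average_triple_sum(wheel):
    """Average of the n circular triple sums; each entry is counted three times."""
    return 3 * sum(wheel) / len(wheel)

rng = random.Random(2024)            # fixed seed, an arbitrary choice
numbers = list(range(1, 37))
for _ in range(1000):
    rng.shuffle(numbers)
    assert max_triple_sum(numbers) >= 55

print(average_triple_sum(numbers))   # 55.5
```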
EXAMPLE 4.3
Among the 900 three-digit integers (from 100 to 999) those such as 131, 222, 303, 717,
848, and 969, where the integer is the same whether it is read from left to right or from
right to left, are called palindromes. Without actually determining all of these three-digit
palindromes, we would like to determine their sum.
The typical palindrome under study here has the form $aba = 100a + 10b + a =
101a + 10b$, where $1 \le a \le 9$ and $0 \le b \le 9$. With nine choices for $a$ and ten for $b$,
it follows from the rule of product that there are 90 such three-digit palindromes. Their
sum is
\begin{align*}
\sum_{a=1}^{9} \left( \sum_{b=0}^{9} aba \right) &= \sum_{a=1}^{9} \sum_{b=0}^{9} (101a + 10b) \\
&= \sum_{a=1}^{9} \left[ \sum_{b=0}^{9} 101a + \sum_{b=0}^{9} 10b \right] = \sum_{a=1}^{9} \left[ 10(101a) + 10 \sum_{b=1}^{9} b \right] \\
&= \sum_{a=1}^{9} \left[ 1010a + \frac{10(9 \cdot 10)}{2} \right] \\
&= \sum_{a=1}^{9} (1010a + 450) \\
&= 1010 \sum_{a=1}^{9} a + 9(450) \\
&= \frac{1010(9 \cdot 10)}{2} + 4050 = 49{,}500.
\end{align*}
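For a sum this small we can also let a program do what the example deliberately avoids, namely list the palindromes and add them. This is a short sketch with function names of our own choosing, offered only as a check on the arithmetic above.

```python
def three_digit_palindromes():
    """All integers from 100 to 999 that read the same in both directions."""
    return [m for m in range(100, 1000) if str(m) == str(m)[::-1]]

def palindrome_sum_by_formula():
    """Sum of 101a + 10b over 1 <= a <= 9 and 0 <= b <= 9."""
    return sum(101 * a + 10 * b for a in range(1, 10) for b in range(10))

pals = three_digit_palindromes()
print(len(pals), sum(pals), palindrome_sum_by_formula())   # 90 49500 49500
```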
The next summation formula takes us from first powers to squares.
EXAMPLE 4.4
Prove that for each $n \in \mathbf{Z}^+$,
\[ \sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}. \]
Proof: Here we are dealing with the open statement
\[ S(n)\colon \sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}. \]
Basis Step: We start with the statement $S(1)$ and find that
\[ \sum_{i=1}^{1} i^2 = 1^2 = \frac{1(1+1)(2(1)+1)}{6}, \]
so $S(1)$ is true.
Inductive Step: Now we assume the truth of $S(k)$, for some (particular) $k \in \mathbf{Z}^+$; that
is, we assume that
\[ \sum_{i=1}^{k} i^2 = \frac{k(k+1)(2k+1)}{6} \]
is a true statement (when $n$ is replaced by $k$). From this assumption we want to deduce the
truth of
\[ S(k+1)\colon \sum_{i=1}^{k+1} i^2 = \frac{(k+1)((k+1)+1)(2(k+1)+1)}{6} = \frac{(k+1)(k+2)(2k+3)}{6}. \]
Using the induction hypothesis $S(k)$, we find that
\begin{align*}
\sum_{i=1}^{k+1} i^2 &= 1^2 + 2^2 + \cdots + k^2 + (k+1)^2 = \sum_{i=1}^{k} i^2 + (k+1)^2 \\
&= \frac{k(k+1)(2k+1)}{6} + (k+1)^2 \\
&= (k+1) \left[ \frac{k(2k+1)}{6} + (k+1) \right] = (k+1) \left[ \frac{2k^2 + 7k + 6}{6} \right] \\
&= \frac{(k+1)(k+2)(2k+3)}{6},
\end{align*}
and the general result follows by the Principle of Mathematical Induction.
The formulas from Examples 4.1 and 4.4 prove handy in deriving our next result.
EXAMPLE 4.5
Figure 4.2 provides the first four entries of the sequence of triangular numbers. We see
that $t_1 = 1$, $t_2 = 3$, $t_3 = 6$, $t_4 = 10$, and, in general, $t_i = 1 + 2 + \cdots + i = i(i + 1)/2$,
for each $i \in \mathbf{Z}^+$. For a fixed $n \in \mathbf{Z}^+$ we want a formula for the sum of the first $n$ triangular
numbers, that is, $t_1 + t_2 + \cdots + t_n = \sum_{i=1}^{n} t_i$. When $n = 2$ we have $t_1 + t_2 = 4$. For
$n = 3$ the sum is 10. Considering $n$ fixed (but arbitrary) we find that
\begin{align*}
\sum_{i=1}^{n} t_i &= \sum_{i=1}^{n} \frac{i(i+1)}{2} = \frac{1}{2} \sum_{i=1}^{n} (i^2 + i) = \frac{1}{2} \sum_{i=1}^{n} i^2 + \frac{1}{2} \sum_{i=1}^{n} i \\
&= \frac{1}{2} \left[ \frac{n(n+1)(2n+1)}{6} \right] + \frac{1}{2} \left[ \frac{n(n+1)}{2} \right] = n(n+1) \left[ \frac{2n+1}{12} + \frac{1}{4} \right] \\
&= \frac{n(n+1)(n+2)}{6}.
\end{align*}
Consequently, if we wish to know the sum of the first 100 triangular numbers, we have
\[ t_1 + t_2 + \cdots + t_{100} = \frac{100(101)(102)}{6} = 171{,}700. \]
Figure 4.2 (dot arrays not reproduced): $t_1 = 1 = \frac{1 \cdot 2}{2}$; $t_2 = 1 + 2 = 3 = \frac{2 \cdot 3}{2}$; $t_3 = 1 + 2 + 3 = 6 = \frac{3 \cdot 4}{2}$; $t_4 = 1 + 2 + 3 + 4 = 10 = \frac{4 \cdot 5}{2}$.
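The formula for the sum of the first $n$ triangular numbers is easy to test numerically. The following sketch uses function names of our own choosing and checks the closed form against a direct sum, including the case $n = 100$ computed in Example 4.5.

```python
def triangular(i):
    """The i-th triangular number, t_i = i(i + 1)/2."""
    return i * (i + 1) // 2

def sum_triangular(n):
    """t_1 + t_2 + ... + t_n, added term by term."""
    return sum(triangular(i) for i in range(1, n + 1))

def sum_triangular_closed(n):
    """Closed form n(n + 1)(n + 2)/6 derived in Example 4.5."""
    return n * (n + 1) * (n + 2) // 6

print([triangular(i) for i in range(1, 5)])   # [1, 3, 6, 10]
print(sum_triangular(100))                    # 171700
assert all(sum_triangular(n) == sum_triangular_closed(n) for n in range(1, 300))
```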
Before we present any more results, let us note how we started the proofs in Examples 4.1
and 4.4. In both cases we simply replaced the variable n by 1 and verified the truth of some
rather easy equalities. Considering how the inductive step in each of these proofs was
definitely more complicated to establish, we might question the need for bothering with
these basis steps. So let us examine the following example.
EXAMPLE 4.6
If $n \in \mathbf{Z}^+$, establish the validity of the open statement
\[ S(n)\colon \sum_{i=1}^{n} i = 1 + 2 + 3 + \cdots + n = \frac{n^2 + n + 2}{2}. \]
This time we shall go directly to the inductive step. Assuming the truth of the statement
\[ S(k)\colon \sum_{i=1}^{k} i = 1 + 2 + 3 + \cdots + k = \frac{k^2 + k + 2}{2} \]
for some (particular) $k \in \mathbf{Z}^+$, we want to infer the truth of the statement
\[ S(k+1)\colon \sum_{i=1}^{k+1} i = 1 + 2 + 3 + \cdots + k + (k+1) = \frac{(k+1)^2 + (k+1) + 2}{2} = \frac{k^2 + 3k + 4}{2}. \]
As we did previously, we use the induction hypothesis and calculate as follows:
\begin{align*}
\sum_{i=1}^{k+1} i &= 1 + 2 + \cdots + k + (k+1) = \left( \sum_{i=1}^{k} i \right) + (k+1) \\
&= \frac{k^2 + k + 2}{2} + (k+1) \\
&= \frac{k^2 + k + 2}{2} + \frac{2k + 2}{2} = \frac{k^2 + 3k + 4}{2}.
\end{align*}
Hence, for each $k \in \mathbf{Z}^+$, it follows that $S(k) \Rightarrow S(k + 1)$. But before we decide to accept
the statement $\forall n \; S(n)$ as a true statement, let us reconsider Example 4.1. From that example
we learned that $\sum_{i=1}^{n} i = n(n + 1)/2$, for all $n \in \mathbf{Z}^+$. Therefore, we can use these two results
(from Example 4.1 and the one already "established" here) to conclude that for all $n \in \mathbf{Z}^+$,
\[ \frac{n(n+1)}{2} = \sum_{i=1}^{n} i = \frac{n^2 + n + 2}{2}, \]
which implies that $n(n + 1) = n^2 + n + 2$ and $0 = 2$. (Something is wrong somewhere!)
If $n = 1$, then $\sum_{i=1}^{1} i = 1$, but $(n^2 + n + 2)/2 = (1 + 1 + 2)/2 = 2$. So $S(1)$ is not true.
But we may feel that this result just indicates that we have the wrong starting point. Perhaps
$S(n)$ is true for all $n \ge 7$, or all $n \ge 137$. Using the preceding argument, however, we know
that for any starting point $n_0 \in \mathbf{Z}^+$, if $S(n_0)$ were true, then
\[ \frac{n_0^2 + n_0 + 2}{2} = \sum_{i=1}^{n_0} i = 1 + 2 + 3 + \cdots + n_0. \]
From the result in Example 4.1 we have $\sum_{i=1}^{n_0} i = n_0(n_0 + 1)/2$, so it follows once again
that $0 = 2$, and we have no possible starting point.
This example should indicate to the reader the need to establish the basis step, no
matter how easy it may be to verify it.
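The situation in Example 4.6 can be seen numerically. In the sketch below (our own illustration, with hypothetical helper names) all quantities are doubled so that only integers appear: the inductive step checks out for every $k$ tested, yet the claimed value misses the true sum by the same amount every time, so no basis step is ever available.

```python
def claimed(n):
    """Twice the right-hand side of the false statement S(n): n^2 + n + 2."""
    return n * n + n + 2

def actual(n):
    """Twice the true sum 1 + 2 + ... + n, namely n(n + 1)."""
    return n * (n + 1)

# The inductive step goes through: adding 2(k + 1) to claimed(k) gives claimed(k + 1).
step_ok = all(claimed(k) + 2 * (k + 1) == claimed(k + 1) for k in range(1, 1000))

# Yet the doubled claim is off by exactly 2 for every n tested.
gaps = {claimed(n) - actual(n) for n in range(1, 1000)}

print(step_ok, gaps)   # True {2}
```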
Now consider the following pseudocode procedures. The procedure in Fig. 4.3 uses a for
loop to accumulate the sum of the squares. The second procedure (Fig. 4.4) demonstrates
how the result of Example 4.4 can be used in place of such a loop. In both procedures the input
is a positive integer $n$ and the output is $\sum_{i=1}^{n} i^2$. However, whereas the pseudocode within
the for loop of the procedure in Fig. 4.3 entails a total of $n$ additions and $n$ multiplications
(not to mention the $n - 1$ additions for incrementing the counter variable $i$), the procedure
in Fig. 4.4 requires only two additions, three multiplications, and one (integer) division.
And this total number of additions, multiplications, and (integer) divisions is still 6 as the
value of n increases. Consequently, the procedure in Fig. 4.4 is considered more efficient.
(This idea of a more efficient procedure will be examined further in Sections 5.7 and 5.8.)
procedure SumOfSquares1 (n: positive integer)
begin
  sum := 0
  for i := 1 to n do
    sum := sum + i^2
end
Figure 4.3
procedure SumOfSquares2 (n: positive integer)
begin
  sum := n * (n + 1) * (2 * n + 1) / 6
end
Figure 4.4
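For readers who wish to run the two procedures, here is one possible Python rendering. It is a sketch and not part of the original figures: the function names are ours, the functions return the sum instead of leaving it in a variable, and integer division `//` is used for the (integer) division, which is exact here because $n(n + 1)(2n + 1)$ is always divisible by 6.

```python
def sum_of_squares_1(n):
    """Loop version (Fig. 4.3): n additions and n multiplications."""
    total = 0
    for i in range(1, n + 1):
        total = total + i * i
    return total

def sum_of_squares_2(n):
    """Closed-form version (Fig. 4.4): a fixed number of operations for any n."""
    return n * (n + 1) * (2 * n + 1) // 6

print(sum_of_squares_1(10), sum_of_squares_2(10))   # 385 385
assert all(sum_of_squares_1(n) == sum_of_squares_2(n) for n in range(1, 500))
```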
Looking back at our first two applications of mathematical induction (in Examples 4.1
and 4.4), we might wonder whether this principle applies only to the verification of known
summation formulas. The next seven examples show that mathematical induction is a vital
tool in many other circumstances as well.
EXAMPLE 4.7
Let us consider the sums of consecutive odd positive integers.
1) $1 = 1 \; (= 1^2)$
2) $1 + 3 = 4 \; (= 2^2)$
3) $1 + 3 + 5 = 9 \; (= 3^2)$
4) $1 + 3 + 5 + 7 = 16 \; (= 4^2)$
From these first four cases we conjecture the following result: The sum of the first $n$
consecutive odd positive integers is $n^2$; that is, for all $n \in \mathbf{Z}^+$,
\[ S(n)\colon \sum_{i=1}^{n} (2i - 1) = n^2. \]
Now that we have developed what we feel is a true summation formula, we use the
Principle of Mathematical Induction to verify its truth for all $n \ge 1$.
From the preceding calculations, we see that $S(1)$ is true [as are $S(2)$, $S(3)$, and $S(4)$],
and so we have our basis step. For the inductive step we assume the truth of $S(k)$ for some
$k$ ($\ge 1$) and have
\[ \sum_{i=1}^{k} (2i - 1) = k^2. \]
We now deduce the truth of $S(k + 1)\colon \sum_{i=1}^{k+1} (2i - 1) = (k + 1)^2$. Since we have assumed
the truth of $S(k)$, our induction hypothesis, we may now write
\begin{align*}
\sum_{i=1}^{k+1} (2i - 1) &= \sum_{i=1}^{k} (2i - 1) + [2(k+1) - 1] = k^2 + [2(k+1) - 1] \\
&= k^2 + 2k + 1 = (k+1)^2.
\end{align*}
Consequently, the result $S(n)$ is true for all $n \ge 1$, by the Principle of Mathematical
Induction.
Now it is time to investigate some results that are not summation formulas.
EXAMPLE 4.8
In Table 4.1, we have listed in adjacent columns the values of $4n$ and $n^2 - 7$ for the positive
integers $n$, where $1 \le n \le 8$. From the table, we see that $(n^2 - 7) < 4n$ for $n = 1, 2, 3, 4, 5$;
but when $n = 6, 7, 8$, we have $4n < (n^2 - 7)$. These last three observations lead us to
conjecture: For all $n \ge 6$, $4n < (n^2 - 7)$.
Table 4.1
n    4n    n^2 - 7
1     4     -6
2     8     -3
3    12      2
4    16      9
5    20     18
6    24     29
7    28     42
8    32     57
Once again, the Principle of Mathematical Induction is the proof technique we need to
verify our conjecture. Let $S(n)$ denote the open statement: $4n < (n^2 - 7)$. Then Table 4.1
confirms that $S(6)$ is true [as are $S(7)$ and $S(8)$], and we have our basis step. (At last we
have an example wherein the starting point is an integer $n_0 \ne 1$.)
In this example, the induction hypothesis is $S(k)\colon 4k < (k^2 - 7)$, where $k \in \mathbf{Z}^+$ and
$k \ge 6$. In order to establish the inductive step, we need to obtain the truth of $S(k + 1)$ from
that of $S(k)$. That is, from $4k < (k^2 - 7)$ we must conclude that $4(k + 1) < [(k + 1)^2 - 7]$.
Here are the necessary steps:
\[ 4k < (k^2 - 7) \Rightarrow 4k + 4 < (k^2 - 7) + 4 < (k^2 - 7) + (2k + 1) \]
(because for $k \ge 6$, we find $2k + 1 \ge 13 > 4$), and
\[ 4k + 4 < (k^2 - 7) + (2k + 1) \Rightarrow 4(k + 1) < (k^2 + 2k + 1) - 7 = (k + 1)^2 - 7. \]
Therefore, by the Principle of Mathematical Induction, $S(n)$ is true for all $n \ge 6$.
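Table 4.1 and the role of the starting point $n_0 = 6$ can be reproduced with a few lines of Python. This is a sketch under our own naming (`holds`, `table`), and the upper limit of 10000 is an arbitrary choice for the spot check.

```python
def holds(n):
    """The open statement S(n): 4n < n^2 - 7."""
    return 4 * n < n * n - 7

# Rows of Table 4.1: (n, 4n, n^2 - 7).
table = [(n, 4 * n, n * n - 7) for n in range(1, 9)]
for row in table:
    print(*row)

# S(n) fails for n = 1, ..., 5 and holds from n0 = 6 onward (checked up to 10000).
assert [n for n in range(1, 10001) if not holds(n)] == [1, 2, 3, 4, 5]
# The inequality used in the inductive step: 4 < 2k + 1 once k >= 6.
assert all(4 < 2 * k + 1 for k in range(6, 10001))
```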
EXAMPLE 4.9
Among the many interesting sequences of numbers encountered in discrete mathematics
and combinatorics, one finds the harmonic numbers $H_1, H_2, H_3, \ldots$, where
\[ H_1 = 1, \qquad H_2 = 1 + \frac{1}{2}, \qquad H_3 = 1 + \frac{1}{2} + \frac{1}{3}, \qquad \ldots \]
and, in general, $H_n = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}$, for each $n \in \mathbf{Z}^+$.
The following property of the harmonic numbers provides one more opportunity for us
to apply the Principle of Mathematical Induction.
For all $n \in \mathbf{Z}^+$, $\sum_{j=1}^{n} H_j = (n + 1)H_n - n$.
Proof: As we have done in the earlier examples (that is, Examples 4.1, 4.4, and 4.7), we
verify the basis step at $n = 1$ for the open statement $S(n)\colon \sum_{j=1}^{n} H_j = (n + 1)H_n - n$. This
result follows readily from
\[ \sum_{j=1}^{1} H_j = H_1 = 1 = 2 \cdot 1 - 1 = (1 + 1)H_1 - 1. \]
To verify the inductive step, we assume the truth of $S(k)$, that is,
\[ \sum_{j=1}^{k} H_j = (k + 1)H_k - k. \]
This assumption then leads us to the following:
\begin{align*}
\sum_{j=1}^{k+1} H_j &= \sum_{j=1}^{k} H_j + H_{k+1} = [(k + 1)H_k - k] + H_{k+1} \\
&= (k + 1)H_k - k + H_{k+1} \\
&= (k + 1)[H_{k+1} - (1/(k + 1))] - k + H_{k+1} \\
&= (k + 2)H_{k+1} - 1 - k \\
&= (k + 2)H_{k+1} - (k + 1).
\end{align*}
Consequently, we now know from the Principle of Mathematical Induction that $S(n)$ is true
for all positive integers $n$.
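Because the harmonic numbers are rational, the identity can be checked exactly with Python's `fractions.Fraction`, avoiding floating-point rounding. The sketch below uses function names of our own and tests only the first few dozen cases.

```python
from fractions import Fraction

def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, j) for j in range(1, n + 1))

def sum_of_harmonics(n):
    """H_1 + H_2 + ... + H_n, added term by term."""
    return sum(harmonic(j) for j in range(1, n + 1))

for n in range(1, 60):
    assert sum_of_harmonics(n) == (n + 1) * harmonic(n) - n

print(harmonic(3), sum_of_harmonics(3))   # 11/6 13/3
```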
EXAMPLE 4.10
For all $n \ge 0$ let $A_n \subset \mathbf{R}$, where $|A_n| = 2^n$ and the elements of $A_n$ are listed in ascending
order. If $r \in \mathbf{R}$, prove that in order to determine whether $r \in A_n$ (by the procedure developed
below), we must compare $r$ with no more than $n + 1$ elements in $A_n$.
When $n = 0$, $A_0 = \{a\}$ and only one comparison is needed. So the result is true for
$n = 0$ (and we have our basis step). For $n = 1$, $A_1 = \{a_1, a_2\}$ with $a_1 < a_2$. In order to
determine whether $r \in A_1$, at most two comparisons must be made. Hence the result follows
when $n = 1$. Now if $n = 2$, we write $A_2 = \{b_1, b_2, c_1, c_2\} = B_1 \cup C_1$, where $b_1 < b_2 <
c_1 < c_2$, $B_1 = \{b_1, b_2\}$, and $C_1 = \{c_1, c_2\}$. Comparing $r$ with $b_2$, we determine which of
the two possibilities, (i) $r \in B_1$ or (ii) $r \in C_1$, can occur. Since $|B_1| = |C_1| = 2$, either
one of the two possibilities requires at most two more comparisons (from the prior case
where $n = 1$). Consequently, we can determine whether $r \in A_2$ by making no more than
$2 + 1 = n + 1$ comparisons.
We now argue in general. Assume the result true for some $k \ge 0$ and consider the case for
$A_{k+1}$, where $|A_{k+1}| = 2^{k+1}$. In order to establish our inductive step, let $A_{k+1} = B_k \cup C_k$,
where $|B_k| = |C_k| = 2^k$, and the elements of $B_k$, $C_k$ are in ascending order with the largest
element $x$ in $B_k$ smaller than the least element in $C_k$. Let $r \in \mathbf{R}$. To determine whether
$r \in A_{k+1}$, we consider whether $r \in B_k$ or $r \in C_k$.
a) First we compare $r$ and $x$. (One comparison)
b) If $r \le x$, then because $|B_k| = 2^k$, it follows by the induction hypothesis that we can
determine whether $r \in B_k$ by making no more than $k + 1$ additional comparisons.
c) If $r > x$, we do likewise with the elements in $C_k$. We make at most $k + 1$ additional
comparisons to see whether $r \in C_k$.
In any event, at most $(k + 1) + 1$ comparisons are made.
The general result now follows by the Principle of Mathematical Induction.
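The procedure of steps (a) through (c) is a form of binary search. The Python sketch below is one way to realize it, written iteratively where the proof proceeds by induction; the function name `contains` and the test data are our own assumptions. It counts each comparison of $r$ with an element of the list so that the bound $n + 1$ can be observed.

```python
def contains(sorted_items, r):
    """Decide whether r is in sorted_items (length 2**n, ascending, distinct).

    Returns (found, comparisons), following steps (a)-(c): compare r with the
    largest element x of the lower half, then continue in one half only.
    """
    lo, hi = 0, len(sorted_items)       # current block is sorted_items[lo:hi]
    comparisons = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        x = sorted_items[mid - 1]       # largest element of the lower half
        comparisons += 1
        if r <= x:
            hi = mid                    # r can only be in the lower half
        else:
            lo = mid                    # r can only be in the upper half
    comparisons += 1                    # final comparison with the one element left
    return sorted_items[lo] == r, comparisons

n = 5
A = list(range(10, 10 + 3 * 2**n, 3))   # 2**n = 32 distinct numbers, ascending
for r in range(0, 120):
    found, used = contains(A, r)
    assert found == (r in A)
    assert used <= n + 1
```

Each pass through the loop halves the block under consideration, which mirrors the way the inductive step reduces the question for $A_{k+1}$ to the same question for $B_k$ or $C_k$.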
EXAMPLE 4.11
One of our first concerns when we evaluate the quality of a computer program is whether
the program does what it is supposed to do. Just as we cannot prove a theorem by checking
specific cases, so we cannot establish the correctness of a program simply by testing various
sets of data. (Furthermore, doing this would be quite difficult if our program were to become
a part of a larger software package wherein, perhaps, a data set is internally generated.) Since
software development places a great deal of emphasis on structured programming, this has
brought about the need for program verification. Here the programmer or the programming
team must prove that the program being developed is correct regardless of the data set
supplied. The effort invested at this stage considerably reduces the time that must be spent
in debugging the program (or software package). One of the methods that can play a major
role in such program verification is mathematical induction. Let us see how.
The pseudocode program segment shown in Fig. 4.5 is supposed to produce the answer
$x(y^n)$ for real variables $x$, $y$ with $n$ a nonnegative integer. (The values for these three
variables are assigned earlier in the program.) We shall verify the correctness of this program
segment by mathematical induction for the open statement.
$S(n)$: For all $x, y \in \mathbf{R}$, if the program reaches the top of the while loop with $n \in \mathbf{N}$, after
the loop is bypassed (for $n = 0$) or the two loop instructions are executed $n$ ($> 0$) times,
then the value of the real variable answer is $x(y^n)$.
while n ≠ 0 do
begin
  x := x * y
  n := n - 1
end
answer := x
Figure 4.5
The flowchart for this program segment is shown in Fig. 4.6. Referring to it will help us
as we develop our proof.
Figure 4.6 (flowchart not reproduced): Initialize the real variables $x$, $y$ and the nonnegative integer variable $n$. At the top of the while loop, if $n \ne 0$ the instructions x := x * y and n := n - 1 are executed and control returns to the top of the loop; otherwise (the No branch) the assignment answer := x is made, and the program continues with the next executable statement following the assignment statement for the real variable answer.
First consider $S(0)$, the statement for the case where $n = 0$. Here the program reaches the
top of the while loop, but since $n = 0$, it follows the No branch in the flowchart and assigns
the value $x = x(1) = x(y^0)$ to the real variable answer. Consequently, the statement $S(0)$
is true and the basis step of our induction argument is established.
Now we assume the truth of $S(k)$, for some nonnegative integer $k$. This provides us with
the induction hypothesis.
$S(k)$: For all $x, y \in \mathbf{R}$, if the program reaches the top of the while loop with $k \in \mathbf{N}$, after
the loop is bypassed (for $k = 0$) or the two loop instructions are executed $k$ ($> 0$) times,
then the value of the real variable answer is $x(y^k)$.
Continuing with the inductive step of the proof, when dealing with the statement
$S(k + 1)$, we note that because $k + 1 \ge 1$, the program will not simply follow the No
branch and bypass the instructions in the while loop. Those two instructions (in the while
loop) will be executed at least once. When the program reaches the top of the while loop for
the first time, $n = k + 1 > 0$, so the loop instructions are executed and the program returns
to the top of the while loop where now we find that
• The value of $y$ is unchanged.
• The value of $x$ is $x_1 = x(y^1) = xy$.
• The value of $n$ is $(k + 1) - 1 = k$.
But now, by our induction hypothesis (applied to the real numbers $x_1$, $y$), we know that
after the while loop for $x_1$, $y$ and $n = k$ is bypassed (for $k = 0$) or the two loop instructions
are executed $k$ ($> 0$) times, then the value assigned to the real variable answer is
\[ x_1(y^k) = (xy)(y^k) = x(y^{k+1}). \]
So by the Principle of Mathematical Induction, $S(n)$ is true for all $n \ge 0$ and the
correctness of the program segment is established.
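The induction argument can be mirrored in running code by asserting, after each pass through the loop, the fact the proof relies on. The Python sketch below is our own rendering of the segment in Fig. 4.5: the function name, the saved initial values `x0` and `n0`, and the assertion are additions made for illustration. Integer inputs are used in the checks so that every comparison is exact; with floating-point values for $x$ and $y$, rounding could make the equality tests fail even though the segment is correct over the real numbers.

```python
def power_segment(x, y, n):
    """Python rendering of the while loop in Fig. 4.5; returns answer."""
    x0, n0 = x, n                    # initial values, kept only for the check below
    while n != 0:
        x = x * y
        n = n - 1
        # After n0 - n passes through the loop, x holds x0 * y**(n0 - n).
        assert x == x0 * y ** (n0 - n)
    answer = x
    return answer

# Exact integer spot checks of S(n): answer = x * y**n.
for n in range(0, 12):
    for x in (-3, 0, 2, 5):
        for y in (-2, 1, 3):
            assert power_segment(x, y, n) == x * y ** n

print(power_segment(5, 2, 10))       # 5120
```

Such tests, as the example itself points out, cannot establish correctness for all inputs; they only exercise the property that the induction proves.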
EXAMPLE 4.12
Recall (from Examples 1.37 and 3.11) that for a given $n \in \mathbf{Z}^+$, a composition of $n$ is an
ordered sum of positive-integer summands summing to $n$. In Fig. 4.7 we find the
compositions of 1, 2, 3, and 4. We see that
a) 1 has $1 = 2^0 = 2^{1-1}$ composition, 2 has $2 = 2^1 = 2^{2-1}$ compositions, 3 has $4 = 2^2 =
2^{3-1}$ compositions, and 4 has $8 = 2^3 = 2^{4-1}$ compositions; and
b) the eight compositions of 4 arise from the four compositions of 3 in two ways:
(i) Compositions (1') to (4') result by increasing the last summand (in each corresponding
composition of 3) by 1; (ii) Each of compositions (1'') to (4'') is obtained by
appending "+1" to the corresponding composition of 3.
(n = 1)   1
(n = 2)   2
          1 + 1
(n = 3)   (1) 3
          (2) 1 + 2
          (3) 2 + 1
          (4) 1 + 1 + 1
(n = 4)   (1') 4
          (2') 1 + 3
          (3') 2 + 2
          (4') 1 + 1 + 2
          (1'') 3 + 1
          (2'') 1 + 2 + 1
          (3'') 2 + 1 + 1
          (4'') 1 + 1 + 1 + 1
Figure 4.7
The observations in part (a) suggest that for all $n \in \mathbf{Z}^+$, $S(n)$: $n$ has $2^{n-1}$ compositions.
The result [in part (a)] for $n = 1$ provides our basis step, $S(1)$. So now let us assume the
result true for some (fixed) $k \in \mathbf{Z}^+$, namely, $S(k)$: $k$ has $2^{k-1}$ compositions. At this point
consider $S(k + 1)$. One can develop the compositions of $k + 1$ from those of $k$ as in part
(b) above (where $k = 3$). For $k \geq 1$, we find that the compositions of $k + 1$ fall into two
distinct cases:
1) The compositions of $k + 1$, where the last summand is an integer $t > 1$: Here this
last summand $t$ is replaced by $t - 1$, and this type of replacement provides a
correspondence between all of the compositions of $k$ and all those compositions of $k + 1$,
where the last summand exceeds 1.
2) The compositions of $k + 1$, where the last summand is 1: In this case we delete
“+1” from the right side of this type of composition of $k + 1$. Once again we get
a correspondence between all the compositions of $k$ and all those compositions of
$k + 1$, where the last summand is 1.
Therefore, the number of compositions of $k + 1$ is twice the number for $k$. Consequently,
it follows from the induction hypothesis that the number of compositions of
$k + 1$ is $2(2^{k-1}) = 2^k$. The Principle of Mathematical Induction now tells us that for
all $n \in \mathbf{Z}^+$, $S(n)$: $n$ has $2^{n-1}$ compositions (as we learned earlier in Examples 1.37
and 3.11).
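The two cases used in the inductive step translate directly into a procedure for listing compositions. The following Python sketch is only an illustration (the function name `compositions` is our own, not part of the text): it builds the compositions of $n$ from those of $n - 1$ exactly as in cases (1) and (2), and then checks the count $2^{n-1}$ for a few small values of $n$.

```python
def compositions(n):
    """Return all compositions of the positive integer n as tuples."""
    if n == 1:
        return [(1,)]                          # the single composition of 1
    result = []
    for c in compositions(n - 1):
        result.append(c[:-1] + (c[-1] + 1,))   # case 1: increase the last summand by 1
        result.append(c + (1,))                # case 2: append "+1"
    return result


# Check the count 2^(n-1) for small n; this is a spot check, not a proof.
for n in range(1, 9):
    comps = compositions(n)
    assert all(sum(c) == n for c in comps)
    assert len(set(comps)) == len(comps) == 2 ** (n - 1)
```

A finite check of this kind cannot replace the induction argument; it only confirms that the two cases produce distinct compositions for the values tried.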
EXAMPLE 4.13 We learn from the equation 14 = 3 + 3 + 8 that we can express 14 using only 3’s and 8’s
as summands. But what may prove to be surprising is that for all $n \geq 14$,
$S(n)$: $n$ can be written as a sum of 3’s and/or 8’s (with no regard to order).
206 Chapter 4 Properties of the Integers: Mathematical Induction
As we start to verify $S(n)$ for all $n \geq 14$, we realize that the given introductory sentence
shows us that the basis step $S(14)$ is true. For the inductive step we assume the truth of
$S(k)$ for some $k \in \mathbf{Z}^+$, where $k \geq 14$, and then consider what can happen for $S(k + 1)$. If
there is at least one 8 in the sum (of 3’s and/or 8’s) that equals $k$, then we can replace this 8
by three 3’s and obtain $k + 1$ as a sum of 3’s and/or 8’s. But suppose that no 8 appears as a
summand of $k$. Then the only summand used is a 3, and, since $k \geq 14$, we must have at least
five 3’s as summands. And now if we replace five of these 3’s by two 8’s, we obtain the
sum $k + 1$, where the only summands are 3’s and/or 8’s. Consequently, we have shown how
$S(k) \Rightarrow S(k + 1)$ and so the result follows for all $n \geq 14$ by the Principle of Mathematical
Induction.
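The inductive step above is constructive, so it can be run as a procedure. In this Python sketch (the name `next_representation` and the pair-of-counts encoding are our own assumptions), a representation of $k$ is stored as the number of 3’s and the number of 8’s, and one call produces a representation of $k + 1$ by the two replacements described in the proof.

```python
def next_representation(threes, eights):
    """Given k = 3*threes + 8*eights with k >= 14, return counts representing k + 1."""
    if eights >= 1:
        return threes + 3, eights - 1      # replace one 8 by three 3's: -8 + 9 = +1
    if threes >= 5:
        return threes - 5, eights + 2      # replace five 3's by two 8's: -15 + 16 = +1
    raise ValueError("no 8 and fewer than five 3's, so k < 14")


# Start from the basis step 14 = 3 + 3 + 8 and walk upward.
threes, eights = 2, 1
for k in range(14, 200):
    assert 3 * threes + 8 * eights == k
    threes, eights = next_representation(threes, eights)
```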
Now that we have seen several applications of the Principle of Mathematical Induction,
we shall close this section by introducing another form of mathematical induction. This second
form is sometimes referred to as the Alternative Form of the Principle of Mathematical
Induction or the Principle of Strong Mathematical Induction.
Once again we shall consider a statement of the form $\forall n \geq n_0\ S(n)$, where $n_0 \in \mathbf{Z}^+$, and
we shall establish both a basis step and an inductive step. However, this time the basis step
may require proving more than just the first case, where $n = n_0$. And in the inductive step
we shall assume the truth of all the statements $S(n_0), S(n_0 + 1), \ldots, S(k - 1)$, and $S(k)$,
in order to establish the truth of the statement $S(k + 1)$. We formally present this second
Principle of Mathematical Induction in the following theorem.
THEOREM 4.2 The Principle of Mathematical Induction: Alternative Form. Let $S(n)$ denote an open
mathematical statement (or set of such open statements) that involves one or more
occurrences of the variable $n$, which represents a positive integer. Also let $n_0, n_1 \in \mathbf{Z}^+$ with
$n_0 \leq n_1$.
a) If $S(n_0), S(n_0 + 1), S(n_0 + 2), \ldots, S(n_1 - 1)$, and $S(n_1)$ are true; and
b) If whenever $S(n_0), S(n_0 + 1), \ldots, S(k - 1)$, and $S(k)$ are true for some (particular
but arbitrarily chosen) $k \in \mathbf{Z}^+$, where $k \geq n_1$, then the statement $S(k + 1)$ is also true;
then $S(n)$ is true for all $n \geq n_0$.
As in Theorem 4.1, condition (a) is called the basis step and condition (b) is called the
inductive step.
The proof of Theorem 4.2 is similar to that of Theorem 4.1 and will be requested in the
Section Exercises. We shall also learn in the exercises for Section 4.2 that the two forms
of mathematical induction (given in Theorems 4.1 and 4.2) are equivalent, for each can be
shown to be a valid proof technique when we assume the truth of the other.
Before we give any examples where Theorem 4.2 is applied, let us mention, as we did
for Theorem 4.1, that $n_0$ need not actually be a positive integer; it may, in reality, be 0 or
even possibly a negative integer. And now that we have taken care of that point once again,
let us see how we might apply this new proof technique.
Our first example should be familiar. We shall simply apply Theorem 4.2 in order to
obtain the result in Example 4.13 in a second way.
EXAMPLE 4.14 The following calculations indicate that it is possible to write (without regard to order) the
integers 14, 15, 16 using only 3’s and/or 8’s as summands:
14 = 3 + 3 + 8    15 = 3 + 3 + 3 + 3 + 3    16 = 8 + 8
On the basis of these three results, we make the conjecture
For every $n \in \mathbf{Z}^+$ where $n \geq 14$,
$S(n)$: $n$ can be written as a sum of 3’s and/or 8’s.
Proof: It is apparent that the statements $S(14)$, $S(15)$, and $S(16)$ are true, and this
establishes our basis step. (Here $n_0 = 14$ and $n_1 = 16$.)
For the inductive step we assume the truth of the statements
$S(14), S(15), \ldots, S(k - 2), S(k - 1)$, and $S(k)$
for some $k \in \mathbf{Z}^+$, where $k \geq 16$. [The assumption of the truth of these $(k - 14) + 1$
statements constitutes our induction hypothesis.] And now if $n = k + 1$, then $n \geq 17$ and
$k + 1 = (k - 2) + 3$. But since $14 \leq k - 2 \leq k$, from the truth of $S(k - 2)$ we know that $(k - 2)$
can be written as a sum of 3’s and/or 8’s; so $(k + 1) = (k - 2) + 3$ can also be written in
this form. Consequently, $S(n)$ is true for all $n \geq 14$ by the alternative form of the Principle
of Mathematical Induction.
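The strong-induction argument reads naturally as a recursive procedure: three base cases, and every larger $n$ reduced to $n - 3$. The sketch below is one way to write it in Python; the names `BASE` and `summands` are illustrative assumptions, not notation from the text.

```python
# The three basis cases S(14), S(15), S(16).
BASE = {14: [3, 3, 8], 15: [3, 3, 3, 3, 3], 16: [8, 8]}


def summands(n):
    """Return a list of 3's and 8's summing to n, for any integer n >= 14."""
    if n < 14:
        raise ValueError("the result is only claimed for n >= 14")
    if n in BASE:
        return list(BASE[n])
    # Inductive step: n = (n - 3) + 3, and n - 3 >= 14 whenever n >= 17.
    return summands(n - 3) + [3]


for n in range(14, 300):
    parts = summands(n)
    assert sum(parts) == n and set(parts) <= {3, 8}
```

Note that each call uses only the single earlier case $n - 3$, which mirrors the use of $S(k - 2)$ alone in the proof.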
In Example 4.14 we saw how the truth of $S(k + 1)$ was deduced by using the truth of the
one prior result $S(k - 2)$. Our last example presents a situation wherein the truth of more
than one prior result is needed.
EXAMPLE 4.15 Let us consider the integer sequence $a_0, a_1, a_2, a_3, \ldots$, where
$a_0 = 1$, $a_1 = 2$, $a_2 = 3$, and
$a_n = a_{n-1} + a_{n-2} + a_{n-3}$, for all $n \in \mathbf{Z}^+$ where $n \geq 3$.
(Then, for instance, we find that $a_3 = a_2 + a_1 + a_0 = 3 + 2 + 1 = 6$; $a_4 = a_3 + a_2 + a_1 =
6 + 3 + 2 = 11$; and $a_5 = a_4 + a_3 + a_2 = 11 + 6 + 3 = 20$.)
We claim that the entries in this sequence are such that $a_n \leq 3^n$ for all $n \in \mathbf{N}$; that is,
$\forall n \in \mathbf{N}\ S'(n)$, where $S'(n)$ is the open statement: $a_n \leq 3^n$.
For the basis step, we observe that
i) $a_0 = 1 = 3^0 \leq 3^0$;
ii) $a_1 = 2 \leq 3 = 3^1$; and
iii) $a_2 = 3 \leq 9 = 3^2$.
Consequently, we know that $S'(0)$, $S'(1)$, and $S'(2)$ are true statements.
So now we turn our attention to the inductive step where we assume the truth of the
statements $S'(0), S'(1), S'(2), \ldots, S'(k - 1), S'(k)$, for some $k \in \mathbf{Z}^+$ where $k \geq 2$. For
the case where $n = k + 1 \geq 3$ we see that
$a_{k+1} = a_k + a_{k-1} + a_{k-2}$
$\leq 3^k + 3^{k-1} + 3^{k-2} \leq 3^k + 3^k + 3^k = 3(3^k) = 3^{k+1}$,
so $[S'(k - 2) \wedge S'(k - 1) \wedge S'(k)] \Rightarrow S'(k + 1)$.
Therefore it follows from the alternative form of the Principle of Mathematical Induction
that $a_n \leq 3^n$ for all $n \in \mathbf{N}$.
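As a quick numerical companion to this proof, the following Python sketch (with the hypothetical helper name `a_sequence`) generates the first terms of the sequence from its three starting values and checks the bound $a_n \leq 3^n$ for each of them. It confirms the claim only for the terms computed.

```python
def a_sequence(count):
    """Return [a_0, a_1, ..., a_(count-1)] for a_n = a_(n-1) + a_(n-2) + a_(n-3)."""
    terms = [1, 2, 3]                                   # a_0, a_1, a_2
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2] + terms[-3])
    return terms[:count]


terms = a_sequence(30)
assert terms[:6] == [1, 2, 3, 6, 11, 20]                # values computed in the example
assert all(a <= 3 ** n for n, a in enumerate(terms))    # a_n <= 3^n for n = 0, ..., 29
```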
Before we close this section, let us take a second look at the preceding two results. In
both Example 4.14 and Example 4.15 we established the basis step by verifying the truth
of three statements: $S(14)$, $S(15)$, and $S(16)$ in Example 4.14; and $S'(0)$, $S'(1)$, and $S'(2)$
in Example 4.15. However, to obtain the truth of $S(k + 1)$ in Example 4.14, we actually
used only one of the $(k - 14) + 1$ statements in the induction hypothesis, namely, the
statement $S(k - 2)$. For Example 4.15 we used three of the $k + 1$ statements in the induction
hypothesis: in this case, the statements $S'(k - 2)$, $S'(k - 1)$, and $S'(k)$.
EXERCISES 4.1
1. Prove each of the following for all $n \geq 1$ by the Principle of Mathematical Induction.
a) $1^2 + 3^2 + 5^2 + \cdots + (2n - 1)^2 = \frac{n(2n - 1)(2n + 1)}{3}$
b) $1 \cdot 3 + 2 \cdot 4 + 3 \cdot 5 + \cdots + n(n + 2) = \frac{n(n + 1)(2n + 7)}{6}$
c) $\sum_{i=1}^{n} \frac{1}{i(i + 1)} = \frac{n}{n + 1}$
d) $\sum_{i=1}^{n} i^3 = \frac{n^2(n + 1)^2}{4} = \left(\sum_{i=1}^{n} i\right)^2$
2. Establish each of the following for all $n \geq 1$ by the Principle of Mathematical Induction.
a) $\sum_{i=1}^{n} 2^{i-1} = \sum_{i=0}^{n-1} 2^i = 2^n - 1$
b) $\sum_{i=1}^{n} i(2^i) = 2 + (n - 1)2^{n+1}$
c) $\sum_{i=1}^{n} (i)(i!) = (n + 1)! - 1$
3. a) Note how $\sum_{i=1}^{n} i^3 + (n + 1)^3 = \sum_{i=0}^{n} (i + 1)^3 = \sum_{i=0}^{n} (i^3 + 3i^2 + 3i + 1)$. Use this
result to obtain a formula for $\sum_{i=1}^{n} i^2$. (Compare with the formula given in Example 4.4.)
b) Use the idea presented in part (a) to find a formula for $\sum_{i=1}^{n} i^3$ and one for $\sum_{i=1}^{n} i^4$.
[Compare the result for $\sum_{i=1}^{n} i^3$ with the formula in part (d) of Exercise 1 for this section.]
4. A wheel of fortune has the integers from 1 to 25 placed on it in a random manner. Show that
regardless of how the numbers are positioned on the wheel, there are three adjacent numbers
whose sum is at least 39.
5. Consider the following program segment (written in pseudocode):
   for i := 1 to 123 do
      for j := 1 to i do
         print i * j
a) How many times is the print statement of the third line executed?
b) Replace $i$ in the second line by $i^2$, and answer the question in part (a).
6. a) For the four-digit integers (from 1000 to 9999) how many are palindromes and what is
their sum?
b) Write a computer program to check the answer for the sum in part (a).
7. A lumberjack has $4n + 110$ logs in a pile consisting of $n$ layers. Each layer has two more logs
than the layer directly above it. If the top layer has six logs, how many layers are there?
8. Determine the positive integer $n$ for which $\sum_{i=1}^{2n} i = \sum_{i=1}^{n} i^2$.
9. Evaluate each of the following:
a) $\sum_{i=11}^{33} i$    b) $\sum_{i=11}^{33} i^2$
10. Determine $\sum_{i=51}^{100} t_i$, where $t_i$ denotes the $i$th triangular number, for $51 \leq i \leq 100$.
11. a) Derive a formula for $\sum_{i=1}^{n} t_{2i}$, where $t_{2i}$ denotes the $2i$th triangular number for
$1 \leq i \leq n$.
b) Determine $\sum_{i=1}^{100} t_{2i}$.
c) Write a computer program to check the result in part (b).
12. a) Prove that $(\cos\theta + i\sin\theta)^2 = \cos 2\theta + i\sin 2\theta$, where $i \in \mathbf{C}$ and $i^2 = -1$.
b) Using induction, prove that for all $n \in \mathbf{Z}^+$,
$(\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta$.
(This result is known as DeMoivre’s Theorem.)
c) Verify that $1 + i = \sqrt{2}(\cos 45° + i\sin 45°)$, and compute $(1 + i)^{100}$.
13. a) Consider an 8 × 8 chessboard. It contains sixty-four 1 × 1 squares and one 8 × 8 square.
How many 2 × 2
squares does it contain? How many 3 × 3 squares? How many squares in total?
b) Now consider an $n \times n$ chessboard for some fixed $n \in \mathbf{Z}^+$. For $1 \leq k \leq n$, how many
$k \times k$ squares are contained in this chessboard? How many squares in total?
14. Prove that for all $n \in \mathbf{Z}^+$, $n > 3 \Rightarrow 2^n < n!$.
15. Prove that for all $n \in \mathbf{Z}^+$, $n > 4 \Rightarrow n^2 < 2^n$.
16. a) For $n = 3$ let $X_3 = \{1, 2, 3\}$. Now consider the sum
$s_3 = \frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{1 \cdot 2} + \frac{1}{1 \cdot 3} + \frac{1}{2 \cdot 3} + \frac{1}{1 \cdot 2 \cdot 3} = \sum_{\emptyset \neq A \subseteq X_3} \frac{1}{p_A}$,
where $p_A$ denotes the product of all elements in a nonempty subset $A$ of $X_3$. Note that the
sum is taken over all the nonempty subsets of $X_3$. Evaluate this sum.
b) Repeat the calculation in part (a) for $s_2$ (where $n = 2$ and $X_2 = \{1, 2\}$) and $s_4$ (where
$n = 4$ and $X_4 = \{1, 2, 3, 4\}$).
c) Conjecture the general result suggested by the calculations from parts (a) and (b). Prove
your conjecture using the Principle of Mathematical Induction.
17. For $n \in \mathbf{Z}^+$, let $H_n$ denote the $n$th harmonic number (as defined in Example 4.9).
a) For all $n \in \mathbf{N}$ prove that $1 + \left(\frac{n}{2}\right) \leq H_{2^n}$.
b) Prove that for all $n \in \mathbf{Z}^+$,
$\sum_{j=1}^{n} jH_j = \left[\frac{n(n + 1)}{2}\right] H_{n+1} - \left[\frac{n(n + 1)}{4}\right]$.
18. Consider the following four equations:
1) 1 = 1
2) 2 + 3 + 4 = 1 + 8
3) 5 + 6 + 7 + 8 + 9 = 8 + 27
4) 10 + 11 + 12 + 13 + 14 + 15 + 16 = 27 + 64
Conjecture the general formula suggested by these four equations, and prove your conjecture.
19. For $n \in \mathbf{Z}^+$, let $S(n)$ be the open statement
$\sum_{i=1}^{n} i = \frac{\left(n + \frac{1}{2}\right)^2}{2}$.
Show that the truth of $S(k)$ implies the truth of $S(k + 1)$ for all $k \in \mathbf{Z}^+$. Is $S(n)$ true for
all $n \in \mathbf{Z}^+$?
20. Let $S_1$ and $S_2$ be two sets where $|S_1| = m$, $|S_2| = r$, for $m, r \in \mathbf{Z}^+$, and the elements in
each of $S_1$, $S_2$ are in ascending order. It can be shown that the elements in $S_1$ and $S_2$ can be
merged into ascending order by making no more than $m + r - 1$ comparisons. (See Lemma 12.1.)
Use this result to establish the following.
For $n \geq 0$, let $S$ be a set with $|S| = 2^n$. Prove that the number of comparisons needed to place
the elements of $S$ in ascending order is bounded above by $n \cdot 2^n$.
21. During the execution of a certain program segment (written in pseudocode), the user assigns
to the integer variables $x$ and $n$ any (possibly different) positive integers. The segment shown in
Fig. 4.8 immediately follows these assignments. If the program reaches the top of the while loop,
state and prove (by mathematical induction) what the value assigned to answer will be after the
two loop instructions are executed $n$ (> 0) times.
   while n ≠ 0 do
      begin
         x := x * n
         n := n - 1
      end
   answer := x
Figure 4.8
22. In the program segment shown in Fig. 4.9, $x$, $y$, and answer are real variables, and $n$ is an
integer variable. Prior to execution of this while loop, the user supplies real values for $x$ and $y$
and a nonnegative integer value for $n$. Prove (by mathematical induction) that for all
$x, y \in \mathbf{R}$, if the program reaches the top of the while loop with $n \in \mathbf{N}$, after the loop is
bypassed (for $n = 0$) or the two loop instructions are executed $n$ (> 0) times, then the value
assigned to answer is $x + ny$.
   while n ≠ 0 do
      begin
         x := x + y
         n := n - 1
      end
   answer := x
Figure 4.9
23. a) Let $n \in \mathbf{Z}^+$, where $n \neq 1, 3$. Prove that $n$ can be expressed as a sum of 2’s and/or 5’s.
b) For all $n \in \mathbf{Z}^+$ show that if $n \geq 24$, then $n$ can be written as a sum of 5’s and/or 7’s.
24. A sequence of numbers $a_1, a_2, a_3, \ldots$ is defined by
$a_1 = 1$, $a_2 = 2$, $a_n = a_{n-1} + a_{n-2}$, $n \geq 3$.
a) Determine the values of $a_3$, $a_4$, $a_5$, $a_6$, and $a_7$.
b) Prove that for all $n \geq 1$, $a_n < (7/4)^n$.
25. For a fixed $n \in \mathbf{Z}^+$, let $X$ be the random variable where $Pr(X = x) = \frac{1}{n}$,
$x = 1, 2, 3, \ldots, n$. (Here $X$ is called a uniform discrete random variable.) Determine $E(X)$
and $Var(X)$.
26. Let $a_0$ be a fixed constant and, for $n \geq 1$, let $a_n = \sum_{i=0}^{n-1} \binom{n-1}{i} a_i a_{(n-1)-i}$.
a) Show that $a_1 = a_0^2$ and that $a_2 = 2a_0^3$.
b) Determine $a_3$ and $a_4$ in terms of $a_0$.
c) Conjecture a formula for $a_n$ in terms of $a_0$ when $n \geq 0$. Prove your conjecture using the
Principle of Mathematical Induction.
27. Verify Theorem 4.2.
28. a) Of the $2^{5-1} = 2^4 = 16$ compositions of 5, determine how many start with (i) 1; (ii) 2;
(iii) 3; (iv) 4; and (v) 5.
b) Provide a combinatorial proof for the result in part (a) of Exercise 2.
4.2
Recursive Definitions
Let us start this section by considering the integer sequence $b_0, b_1, b_2, b_3, \ldots$, where
$b_n = 2n$ for all $n \in \mathbf{N}$. Here we find that $b_0 = 2 \cdot 0 = 0$, $b_1 = 2 \cdot 1 = 2$, $b_2 = 2 \cdot 2 = 4$, and
$b_3 = 2 \cdot 3 = 6$. If, for instance, we need to determine $b_6$, we simply calculate $b_6 = 2 \cdot 6 =
12$, without the need to calculate the value of $b_n$ for any other $n \in \mathbf{N}$. We can perform
such calculations because we have an explicit formula, namely $b_n = 2n$, that tells us
how $b_n$ is determined from $n$ (alone).
In Example 4.15 of the preceding section, however, we considered the integer sequence
$a_0, a_1, a_2, a_3, \ldots$, where
$a_0 = 1$, $a_1 = 2$, $a_2 = 3$, and
$a_n = a_{n-1} + a_{n-2} + a_{n-3}$, for all $n \in \mathbf{Z}^+$ where $n \geq 3$.
Here we do not have an explicit formula that defines each $a_n$ in terms of $n$ for all $n \in \mathbf{N}$.
If we want the value of $a_6$, for example, we need to know the values of $a_5$, $a_4$, and $a_3$.
And these values (of $a_5$, $a_4$, and $a_3$) require that we also know the values of $a_2$, $a_1$, and $a_0$.
Unlike the rather easy situation where we determined $b_6 = 2 \cdot 6 = 12$, in order to calculate
$a_6$, here we might find ourselves writing
$a_6 = a_5 + a_4 + a_3$
$= (a_4 + a_3 + a_2) + (a_3 + a_2 + a_1) + (a_2 + a_1 + a_0)$
$= [(a_3 + a_2 + a_1) + (a_2 + a_1 + a_0) + a_2] + [(a_2 + a_1 + a_0) + a_2 + a_1] + (a_2 + a_1 + a_0)$
$= [[(a_2 + a_1 + a_0) + a_2 + a_1] + (a_2 + a_1 + a_0) + a_2] + [(a_2 + a_1 + a_0) + a_2 + a_1] + (a_2 + a_1 + a_0)$
$= [[(3 + 2 + 1) + 3 + 2] + (3 + 2 + 1) + 3] + [(3 + 2 + 1) + 3 + 2] + (3 + 2 + 1)$
$= 37$.
Or, in a somewhat easier manner, we could have gone in the opposite direction with these
considerations:
$a_3 = a_2 + a_1 + a_0 = 3 + 2 + 1 = 6$
$a_4 = a_3 + a_2 + a_1 = 6 + 3 + 2 = 11$
$a_5 = a_4 + a_3 + a_2 = 11 + 6 + 3 = 20$
$a_6 = a_5 + a_4 + a_3 = 20 + 11 + 6 = 37$.
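The two directions of calculation just shown correspond to two familiar programming styles. The sketch below is an illustration only (the function names and the call counter are our own devices): `a_top_down` unwinds the recursion from $a_6$ down to the base values, while `a_bottom_up` builds upward from $a_0$, $a_1$, $a_2$.

```python
calls = 0  # counts how many times the top-down function is entered


def a_top_down(n):
    """Compute a_n by unwinding the recursion, as in the long expansion of a_6."""
    global calls
    calls += 1
    if n <= 2:
        return n + 1                      # base: a_0 = 1, a_1 = 2, a_2 = 3
    return a_top_down(n - 1) + a_top_down(n - 2) + a_top_down(n - 3)


def a_bottom_up(n):
    """Compute a_n by working upward from the base values."""
    x, y, z = 1, 2, 3                     # a_0, a_1, a_2
    for _ in range(n):
        x, y, z = y, z, x + y + z         # slide the window one position
    return x


assert a_top_down(6) == a_bottom_up(6) == 37
print("top-down calls used for a_6:", calls)
```

The bottom-up version performs one addition step per index, whereas the top-down version recomputes the same earlier terms many times; the printed call count makes that repetition visible.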
No matter how we arrive at $a_6$, we realize that the two integer sequences, $b_0, b_1, b_2,
b_3, \ldots$ and $a_0, a_1, a_2, a_3, \ldots$, are more than just numerically different. The integers
$b_0, b_1, b_2, b_3, \ldots$ can be very readily listed as 0, 2, 4, 6, ..., and for any $n \in \mathbf{N}$ we have
the explicit formula $b_n = 2n$. On the other hand, we might find it rather difficult (if not
impossible) to determine such an explicit formula for the integers $a_0, a_1, a_2, a_3, \ldots$.
What is happening here for a sequence of integers can also occur for other mathematical
concepts — such as sets and binary operations [as well as functions (in Chapter 5), languages
(in Chapter 6), and relations (in Chapter 7)]. Sometimes it is difficult to define a mathematical
concept in an explicit manner. But, as for the sequence $a_0, a_1, a_2, a_3, \ldots$, we may be able
to define what we need in terms of similar prior results. (We shall examine what we mean by
this in several examples in this section.) When we do so we say that the concept is defined
recursively, using the method, or process, of recursion. In this way we obtain the concept
we are interested in studying by means of a recursive definition. Hence, although we do
not have an explicit formula here for the sequence $a_0, a_1, a_2, a_3, \ldots$, we do have a way of
defining the integers $a_n$, for $n \in \mathbf{N}$, by recursion. The assignments
$a_0 = 1$, $a_1 = 2$, $a_2 = 3$
provide a base for the recursion.
The equation
$a_n = a_{n-1} + a_{n-2} + a_{n-3}$, for $n \in \mathbf{Z}^+$ where $n \geq 3$, (*)
provides the recursive process; it indicates how to obtain new entries in the sequence from
those prior results we already know (or can calculate). [Note: The integers computed from
Eq. (*) may also be computed from the equation $a_{n+3} = a_{n+2} + a_{n+1} + a_n$, for $n \in \mathbf{N}$.]
We now use the concept of the recursive definition to settle something that was mentioned
in three footnotes in Sections 2.1 and 2.3. After studying Section 2.2 we knew (from the
laws of logic) that for any statements $p_1$, $p_2$, and $p_3$, we had
$p_1 \wedge (p_2 \wedge p_3) \iff (p_1 \wedge p_2) \wedge p_3$,
and, consequently, we could write $p_1 \wedge p_2 \wedge p_3$ without any chance of ambiguity. This is
because the truth value for the conjunction of three statements does not depend on the way
parentheses might be introduced to direct the order of forming the conjunctions of pairs
of (given or resultant) statements. But we were concerned about what meaning we should
attach to an expression such as $p_1 \wedge p_2 \wedge p_3 \wedge p_4$. The following example now settles that
issue.
EXAMPLE 4.16 The logical connective $\wedge$ was defined (in Section 2.1) for only two statements at a time.
How, then, does one deal with an expression such as $p_1 \wedge p_2 \wedge p_3 \wedge p_4$, where $p_1$, $p_2$, $p_3$,
and $p_4$ are statements? In order to answer this question we introduce the following recursive
definition, wherein the concept at a certain [$(n + 1)$st] stage is developed from the
comparable concept at an earlier [$n$th] stage.
Given any statements $p_1, p_2, \ldots, p_n, p_{n+1}$, we define
1) the conjunction of $p_1$, $p_2$ by $p_1 \wedge p_2$ (as we did in Section 2.1), and
2) the conjunction of $p_1, p_2, \ldots, p_n, p_{n+1}$, for $n \geq 2$, by
$p_1 \wedge p_2 \wedge \cdots \wedge p_n \wedge p_{n+1} \iff (p_1 \wedge p_2 \wedge \cdots \wedge p_n) \wedge p_{n+1}$.
[The result in (1) establishes the base for the recursion, while the logical equivalence in (2)
is used to provide the recursive process. Note that the statement on the right-hand side of
the logical equivalence in (2) is the conjunction of two statements: $p_{n+1}$ and the previously
determined statement $(p_1 \wedge p_2 \wedge \cdots \wedge p_n)$.]
Therefore, we define the conjunction of $p_1$, $p_2$, $p_3$, $p_4$ by
$p_1 \wedge p_2 \wedge p_3 \wedge p_4 \iff (p_1 \wedge p_2 \wedge p_3) \wedge p_4$.
Then, by the associative law of $\wedge$, we find that
$(p_1 \wedge p_2 \wedge p_3) \wedge p_4 \iff [(p_1 \wedge p_2) \wedge p_3] \wedge p_4$
$\iff (p_1 \wedge p_2) \wedge (p_3 \wedge p_4)$
$\iff p_1 \wedge [p_2 \wedge (p_3 \wedge p_4)]$
$\iff p_1 \wedge [(p_2 \wedge p_3) \wedge p_4]$
$\iff p_1 \wedge (p_2 \wedge p_3 \wedge p_4)$.
These logical equivalences show that the truth value for the conjunction of four statements
is also independent of the way parentheses might be introduced to indicate how to associate
the given statements.
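Using the above definition, we now extend our results to the following “Generalized
Associative Law for $\wedge$.”
Let $n \in \mathbf{Z}^+$ where $n \geq 3$, and let $r \in \mathbf{Z}^+$ with $1 \leq r < n$. Then
$S(n)$: For any statements $p_1, p_2, \ldots, p_r, p_{r+1}, \ldots, p_n$,
$(p_1 \wedge p_2 \wedge \cdots \wedge p_r) \wedge (p_{r+1} \wedge \cdots \wedge p_n) \iff p_1 \wedge p_2 \wedge \cdots \wedge p_r \wedge p_{r+1} \wedge \cdots \wedge p_n$.
Proof: The truth of the statement $S(3)$ follows from the associative law for $\wedge$, and this
establishes the basis step for our inductive proof. For the inductive step we assume that
$S(k)$ is true for some $k \geq 3$ and all $1 \leq r < k$. That is, we assume the truth of
$S(k)$: $(p_1 \wedge p_2 \wedge \cdots \wedge p_r) \wedge (p_{r+1} \wedge \cdots \wedge p_k)$
$\iff p_1 \wedge p_2 \wedge \cdots \wedge p_r \wedge p_{r+1} \wedge \cdots \wedge p_k$.
Then we show that $S(k) \Rightarrow S(k + 1)$. When we consider $k + 1$ statements, then we must
account for all $1 \leq r < k + 1$.
1) If $r = k$, then
$(p_1 \wedge p_2 \wedge \cdots \wedge p_k) \wedge p_{k+1} \iff p_1 \wedge p_2 \wedge \cdots \wedge p_k \wedge p_{k+1}$,
from our recursive definition.
2) For $1 \leq r < k$, we have
$(p_1 \wedge p_2 \wedge \cdots \wedge p_r) \wedge (p_{r+1} \wedge \cdots \wedge p_k \wedge p_{k+1})$
$\iff (p_1 \wedge p_2 \wedge \cdots \wedge p_r) \wedge [(p_{r+1} \wedge \cdots \wedge p_k) \wedge p_{k+1}]$
$\iff [(p_1 \wedge p_2 \wedge \cdots \wedge p_r) \wedge (p_{r+1} \wedge \cdots \wedge p_k)] \wedge p_{k+1}$
$\iff (p_1 \wedge p_2 \wedge \cdots \wedge p_r \wedge p_{r+1} \wedge \cdots \wedge p_k) \wedge p_{k+1}$
$\iff p_1 \wedge p_2 \wedge \cdots \wedge p_r \wedge p_{r+1} \wedge \cdots \wedge p_k \wedge p_{k+1}$.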
So it follows by the Principle of Mathematical Induction (Theorem 4.1) that the open
statement $S(n)$ is true for all $n \in \mathbf{Z}^+$ where $n \geq 3$.
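The recursive definition of the $n$-fold conjunction, and the generalized associative law it supports, can be spot-checked by truth tables. In the Python sketch below the names `conj` and `group` are our own; `conj` follows parts (1) and (2) of the definition, and the loop compares both sides of $S(n)$ for every truth assignment and every split point $r$, for $3 \leq n \leq 6$ only.

```python
from itertools import product


def conj(ps):
    """Conjunction of two or more truth values, following the recursive definition."""
    if len(ps) == 2:
        return ps[0] and ps[1]            # base: p1 ∧ p2
    return conj(ps[:-1]) and ps[-1]       # recursive step: (p1 ∧ ... ∧ pn) ∧ p(n+1)


def group(ps):
    """A parenthesized group; a group holding one statement is that statement itself."""
    return ps[0] if len(ps) == 1 else conj(ps)


# Exhaustive truth-table check of S(n) for n = 3, ..., 6 and every 1 <= r < n.
for n in range(3, 7):
    for ps in product([False, True], repeat=n):
        for r in range(1, n):
            assert (group(ps[:r]) and group(ps[r:])) == conj(ps)
```

Such a check covers finitely many cases; the induction proof is what establishes $S(n)$ for every $n \geq 3$.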
Our next example provides us with a second opportunity to generalize an associative
law — but this time we shall deal with sets instead of statements.
EXAMPLE 4.17 In Definition 3.10 we extended the binary operations of $\cup$ and $\cap$ to an arbitrary (finite or
infinite) number of subsets from a given universe $\mathcal{U}$. However, these definitions do not rely
on the binary nature of the operations involved, and they do not provide a systematic way
of determining the union or intersection of any finite number of sets.
To overcome this difficulty, we consider the sets $A_1, A_2, \ldots, A_n, A_{n+1}$, where $A_i \subseteq \mathcal{U}$
for all $1 \leq i \leq n + 1$, and we define their union recursively as follows:
1) The union of $A_1$, $A_2$ is $A_1 \cup A_2$. (This is the base for our recursive definition.)
2) The union of $A_1, A_2, \ldots, A_n, A_{n+1}$, for $n \geq 2$, is given by
$A_1 \cup A_2 \cup \cdots \cup A_n \cup A_{n+1} = (A_1 \cup A_2 \cup \cdots \cup A_n) \cup A_{n+1}$,
where the set on the right-hand side of the set equality is the union of two sets,
namely, $A_1 \cup A_2 \cup \cdots \cup A_n$ and $A_{n+1}$. (Here we have the recursive process needed
to complete our recursive definition.)
From this definition we obtain the following “Generalized Associative Law for $\cup$.” If
$n, r \in \mathbf{Z}^+$ with $n \geq 3$ and $1 \leq r < n$, then
$S(n)$: $(A_1 \cup A_2 \cup \cdots \cup A_r) \cup (A_{r+1} \cup \cdots \cup A_n)$
$= A_1 \cup A_2 \cup \cdots \cup A_r \cup A_{r+1} \cup \cdots \cup A_n$,
where $A_i \subseteq \mathcal{U}$ for all $1 \leq i \leq n$.
Proof: The truth of $S(n)$ for $n = 3$ follows from the associative law of $\cup$, thereby providing
the basis step needed for this inductive proof. Assuming the truth of $S(k)$ for some $k \in \mathbf{Z}^+$,
where $k \geq 3$ and $1 \leq r < k$, we shall now establish our inductive step by showing that
$S(k) \Rightarrow S(k + 1)$. When dealing with $k + 1$ ($\geq 4$) sets we need to consider all $1 \leq r < k + 1$.
We find that
1) For $r = k$ we have
$(A_1 \cup A_2 \cup \cdots \cup A_k) \cup A_{k+1} = A_1 \cup A_2 \cup \cdots \cup A_k \cup A_{k+1}$.
This follows from the given recursive definition.
2) If $1 \leq r < k$, then
$(A_1 \cup A_2 \cup \cdots \cup A_r) \cup (A_{r+1} \cup \cdots \cup A_k \cup A_{k+1})$
$= (A_1 \cup A_2 \cup \cdots \cup A_r) \cup [(A_{r+1} \cup \cdots \cup A_k) \cup A_{k+1}]$
$= [(A_1 \cup A_2 \cup \cdots \cup A_r) \cup (A_{r+1} \cup \cdots \cup A_k)] \cup A_{k+1}$
$= (A_1 \cup A_2 \cup \cdots \cup A_r \cup A_{r+1} \cup \cdots \cup A_k) \cup A_{k+1}$
$= A_1 \cup A_2 \cup \cdots \cup A_r \cup A_{r+1} \cup \cdots \cup A_k \cup A_{k+1}$.
So it follows by the Principle of Mathematical Induction that $S(n)$ is true for all integers
$n \geq 3$.
Similar to the result in Example 4.17, the intersection of the $n + 1$ sets $A_1, A_2, \ldots, A_n$,
$A_{n+1}$ (each taken from the same universe $\mathcal{U}$) is defined recursively by:
1) The intersection of $A_1$, $A_2$ is $A_1 \cap A_2$.
2) For $n \geq 2$, the intersection of $A_1, A_2, \ldots, A_n, A_{n+1}$ is given by
$A_1 \cap A_2 \cap \cdots \cap A_n \cap A_{n+1} = (A_1 \cap A_2 \cap \cdots \cap A_n) \cap A_{n+1}$,
the intersection of the two sets $A_1 \cap A_2 \cap \cdots \cap A_n$ and $A_{n+1}$.
We find that the recursive definitions for the union and intersection of any finite number of
sets provide the means by which we can extend the DeMorgan Laws of Set Theory. We shall
establish (by using mathematical induction) one of these extensions in the next example
and request a proof of the other extension in the Section Exercises.
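EXAMPLE 4.18 Let $n \in \mathbf{Z}^+$ where $n \geq 2$, and let $A_i \subseteq \mathcal{U}$ for each $1 \leq i \leq n$. Then
$\overline{A_1 \cap A_2 \cap \cdots \cap A_n} = \overline{A_1} \cup \overline{A_2} \cup \cdots \cup \overline{A_n}$.
Proof: The basis step of this proof is given for $n = 2$. It follows from the fact that
$\overline{A_1 \cap A_2} = \overline{A_1} \cup \overline{A_2}$, by the second of DeMorgan’s Laws (listed in the Laws of Set
Theory in Section 3.2).
Assuming the truth of the result for some $k$, where $k \geq 2$, we have
$\overline{A_1 \cap A_2 \cap \cdots \cap A_k} = \overline{A_1} \cup \overline{A_2} \cup \cdots \cup \overline{A_k}$.
And when we consider $k + 1$ ($\geq 3$) sets, the induction hypothesis is used to obtain the third
set equality in the following:
$\overline{A_1 \cap A_2 \cap \cdots \cap A_k \cap A_{k+1}} = \overline{(A_1 \cap A_2 \cap \cdots \cap A_k) \cap A_{k+1}}$
$= \overline{(A_1 \cap A_2 \cap \cdots \cap A_k)} \cup \overline{A_{k+1}} = (\overline{A_1} \cup \overline{A_2} \cup \cdots \cup \overline{A_k}) \cup \overline{A_{k+1}}$
$= \overline{A_1} \cup \overline{A_2} \cup \cdots \cup \overline{A_k} \cup \overline{A_{k+1}}$.
This then establishes the inductive step in our proof and so we obtain this generalized
DeMorgan Law for all $n \geq 2$ by the Principle of Mathematical Induction.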
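The recursive definitions of union and intersection, and the generalized DeMorgan Law just proved, can be tried out on concrete sets. In this Python sketch everything named is an assumption made for illustration: the universe is taken to be $\{0, 1, \ldots, 9\}$, the four sample sets are arbitrary, and `union`, `intersection`, and `complement` are our own helper names.

```python
UNIVERSE = frozenset(range(10))           # a small sample universe, chosen arbitrarily


def union(sets):
    """Union of two or more sets, following the recursive definition."""
    if len(sets) == 2:
        return sets[0] | sets[1]                  # base
    return union(sets[:-1]) | sets[-1]            # (A1 ∪ ... ∪ An) ∪ A(n+1)


def intersection(sets):
    """Intersection of two or more sets, defined in the same recursive manner."""
    if len(sets) == 2:
        return sets[0] & sets[1]
    return intersection(sets[:-1]) & sets[-1]


def complement(a):
    return UNIVERSE - a


sample = [frozenset({0, 1, 2, 3}), frozenset({2, 3, 4, 5}),
          frozenset({3, 5, 7, 9}), frozenset({1, 3, 5, 8})]

# Generalized DeMorgan Law for the first n sample sets, n = 2, 3, 4.
for n in range(2, len(sample) + 1):
    first = sample[:n]
    assert complement(intersection(first)) == union([complement(a) for a in first])
```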
Now that we have seen the two recursive definitions (in Examples 4.16 and 4.17), as
we continue to investigate situations where this type of definition arises, we shall generally
refrain from labeling the base and recursive parts. Likewise, we may not always designate
the basis and inductive steps in a proof by mathematical induction.
As we look back at Examples 4.16 and 4.17, the recursive definitions in these two
examples should seem similar to us. For if we interchange the statement $p_i$ with the set $A_i$,
for all $1 \leq i \leq n + 1$, and if we interchange each occurrence of $\wedge$ with $\cup$ and replace $\iff$
with $=$, then we can obtain the recursive definition in Example 4.17 from the one given in
Example 4.16.
In a similar way one can recursively define the sum and product of $n$ real numbers,
where $n \in \mathbf{Z}^+$ and $n \geq 2$. Then we can obtain (by the Principle of Mathematical Induction)
generalized associative laws for the addition and multiplication of real numbers. (In the
Section Exercises the reader will be requested to do this.) We want to be aware of such
generalized associative laws because we have been using them and will continue to use them.
The reader may be surprised to learn that we have already used the generalized associative
law of addition. In each of Examples 4.1 and 4.4, for instance, the generalized associative
law of addition was used to establish the inductive step (in the proof by mathematical
induction). Furthermore, now that we are more aware of it, the generalized associative law
of addition can be used (usually, in an implicit manner) in recursive definitions — for now
there will be no chance for ambiguity if one wants to add four or more summands. For
example, we could define the sequence of harmonic numbers $H_1, H_2, H_3, \ldots$, by
1) $H_1 = 1$; and
2) For $n \geq 1$, $H_{n+1} = H_n + \left(\frac{1}{n+1}\right)$.
Turning from addition to multiplication, we may use the generalized associative law of
multiplication to provide a recursive definition of $n!$. In this case we write
1) $0! = 1$; and
2) For $n \geq 0$, $(n + 1)! = (n + 1)(n!)$.
(This was suggested in the paragraph following Definition 1.1 in Section 1.2.) Also, the
integer sequence $b_0, b_1, b_2, b_3, \ldots$, given explicitly (at the start of this section) by the
formula $b_n = 2n$, $n \in \mathbf{N}$, can now be defined recursively by
1) $b_0 = 0$; and
2) For $n \geq 0$, $b_{n+1} = b_n + 2$.
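Each of these three recursive definitions carries over almost word for word into a recursive function. The Python sketch below uses our own function names (`harmonic`, `fact`, `b`) and exact rational arithmetic from the standard `fractions` module; each function is written with index $n$ in terms of index $n - 1$, which is the same recursion as above with the index shifted by one.

```python
from fractions import Fraction


def harmonic(n):
    """H_1 = 1 and H_n = H_(n-1) + 1/n for n >= 2 (exact rational arithmetic)."""
    return Fraction(1) if n == 1 else harmonic(n - 1) + Fraction(1, n)


def fact(n):
    """0! = 1 and n! = n * (n-1)! for n >= 1."""
    return 1 if n == 0 else n * fact(n - 1)


def b(n):
    """b_0 = 0 and b_n = b_(n-1) + 2 for n >= 1."""
    return 0 if n == 0 else b(n - 1) + 2


assert harmonic(4) == Fraction(25, 12)            # 1 + 1/2 + 1/3 + 1/4
assert fact(5) == 120
assert all(b(n) == 2 * n for n in range(50))      # agrees with the explicit formula
```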
When we investigate the sequences in our next two examples, we shall once again find
recursive definitions. In addition we shall establish results where the generalized associative
law of addition will be used, although in an implicit manner.
EXAMPLE 4.19 In Section 4.1 we introduced the sequence of rational numbers called the harmonic numbers.
Now we introduce an integer sequence that is prominent in combinatorics and graph theory
(and that we shall study further in Chapters 10, 11, and 12). The Fibonacci numbers may
be defined recursively by
1) $F_0 = 0$, $F_1 = 1$; and
2) $F_n = F_{n-1} + F_{n-2}$, for $n \in \mathbf{Z}^+$ with $n \geq 2$.
Hence, from the recursive part of this definition, it follows that
$F_2 = F_1 + F_0 = 1 + 0 = 1$    $F_4 = F_3 + F_2 = 2 + 1 = 3$
$F_3 = F_2 + F_1 = 1 + 1 = 2$    $F_5 = F_4 + F_3 = 3 + 2 = 5$.
We also find that $F_6 = 8$, $F_7 = 13$, $F_8 = 21$, $F_9 = 34$, $F_{10} = 55$, $F_{11} = 89$, and $F_{12} = 144$.
The recursive definition of the Fibonacci numbers can be used (in conjunction with the
Principle of Mathematical Induction) to establish many of the interesting properties that
these numbers exhibit. We investigate one of these properties now.
Let us consider the following five results that deal with sums of squares of the Fibonacci
numbers.
1) $F_0^2 + F_1^2 = 0^2 + 1^2 = 1 = 1 \times 1$
2) $F_0^2 + F_1^2 + F_2^2 = 0^2 + 1^2 + 1^2 = 2 = 1 \times 2$
3) $F_0^2 + F_1^2 + F_2^2 + F_3^2 = 0^2 + 1^2 + 1^2 + 2^2 = 6 = 2 \times 3$
4) $F_0^2 + F_1^2 + F_2^2 + F_3^2 + F_4^2 = 0^2 + 1^2 + 1^2 + 2^2 + 3^2 = 15 = 3 \times 5$
5) $F_0^2 + F_1^2 + F_2^2 + F_3^2 + F_4^2 + F_5^2 = 0^2 + 1^2 + 1^2 + 2^2 + 3^2 + 5^2 = 40 = 5 \times 8$
From what is suggested in these calculations, we conjecture that
$\forall n \in \mathbf{Z}^+ \quad \sum_{i=0}^{n} F_i^2 = F_n \times F_{n+1}$.
Proof: For $n = 1$, the result in Eq. (1), namely $F_0^2 + F_1^2 = 1 \times 1$, shows us that the
conjecture is true in this first case.
Assuming the truth of the conjecture for some $k \geq 1$, we obtain the induction hypothesis:
$\sum_{i=0}^{k} F_i^2 = F_k \times F_{k+1}$.
Turning now to the case where $n = k + 1$ ($\geq 2$) we find that
$\sum_{i=0}^{k+1} F_i^2 = \sum_{i=0}^{k} F_i^2 + F_{k+1}^2 = (F_k \times F_{k+1}) + F_{k+1}^2 = F_{k+1} \times (F_k + F_{k+1}) = F_{k+1} \times F_{k+2}$.
Hence the truth of the case for $n = k + 1$ follows from the case for $n = k$. So the given
conjecture is true for all $n \in \mathbf{Z}^+$ by the Principle of Mathematical Induction. (The reader
may wish to note that the prior calculation uses the generalized associative law of addition.
Furthermore we employ the recursive definition of the Fibonacci numbers; it allows us to
replace $F_k + F_{k+1}$ by $F_{k+2}$.)
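The identity for the sum of squares is easy to test numerically before (or after) proving it. The following Python sketch, with the hypothetical helper name `fib`, computes Fibonacci numbers from the recursive definition in iterative form and checks $\sum_{i=0}^{n} F_i^2 = F_n \times F_{n+1}$ for $1 \leq n < 50$.

```python
def fib(n):
    """Return F_n, where F_0 = 0, F_1 = 1, and F_n = F_(n-1) + F_(n-2)."""
    a, b = 0, 1                    # (F_0, F_1)
    for _ in range(n):
        a, b = b, a + b            # advance the pair (F_i, F_(i+1)) by one index
    return a


assert [fib(n) for n in range(13)] == [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144]

for n in range(1, 50):
    assert sum(fib(i) ** 2 for i in range(n + 1)) == fib(n) * fib(n + 1)
```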
EXAMPLE 4.20 Closely related to the Fibonacci numbers is the sequence known as the Lucas numbers. This
sequence is defined recursively by
1) $L_0 = 2$, $L_1 = 1$; and
2) $L_n = L_{n-1} + L_{n-2}$, for $n \in \mathbf{Z}^+$ with $n \geq 2$.
The first eight Lucas numbers are given in Table 4.2.
Table 4.2
n    | 0 | 1 | 2 | 3 | 4 | 5  | 6  | 7
L_n  | 2 | 1 | 3 | 4 | 7 | 11 | 18 | 29
Although they are not as prominent as the Fibonacci numbers, the Lucas numbers also
possess many interesting properties. One of the interrelations between the Fibonacci and
Lucas numbers is illustrated in the fact that
$\forall n \in \mathbf{Z}^+ \quad L_n = F_{n-1} + F_{n+1}$.
Proof: Here we need to consider what happens when $n = 1$ and $n = 2$. We find that
$L_1 = 1 = 0 + 1 = F_0 + F_2 = F_{1-1} + F_{1+1}$, and
$L_2 = 3 = 1 + 2 = F_1 + F_3 = F_{2-1} + F_{2+1}$,
so the result is true in these first two cases.
Next we assume that $L_n = F_{n-1} + F_{n+1}$ for the integers $n = 1, 2, 3, \ldots, k - 1, k$,
where $k \geq 2$, and then we consider the Lucas number $L_{k+1}$. It turns out that
$L_{k+1} = L_k + L_{k-1} = (F_{k-1} + F_{k+1}) + (F_{k-2} + F_k)$    (*)
$= (F_{k-1} + F_{k-2}) + (F_{k+1} + F_k) = F_k + F_{k+2} = F_{(k+1)-1} + F_{(k+1)+1}$.
Therefore, it follows from the alternative form of the Principle of Mathematical Induction
that $L_n = F_{n-1} + F_{n+1}$ for all $n \in \mathbf{Z}^+$. [The reader should observe how we used the recursive
definitions for both the Fibonacci numbers and the Lucas numbers in the calculations at
(*).]
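As a numerical companion to this result, the sketch below generates both sequences from their recursive definitions and compares $L_n$ with $F_{n-1} + F_{n+1}$. The helper name `recursive_sequence` is our own; it is simply a convenience for any sequence in which each term is the sum of the two before it.

```python
def recursive_sequence(first, second, count):
    """Return the first `count` terms of x_0 = first, x_1 = second, x_n = x_(n-1) + x_(n-2)."""
    terms = [first, second]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2])
    return terms[:count]


F = recursive_sequence(0, 1, 42)   # Fibonacci numbers F_0, ..., F_41
L = recursive_sequence(2, 1, 41)   # Lucas numbers L_0, ..., L_40

assert L[:8] == [2, 1, 3, 4, 7, 11, 18, 29]       # the entries of Table 4.2
for n in range(1, 41):
    assert L[n] == F[n - 1] + F[n + 1]
```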
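EXAMPLE 4.21 In Section 1.3 we introduced the binomial coefficients $\binom{n}{r}$ for $n, r \in \mathbf{N}$, where $n \geq r \geq 0$.
Corollary 1.1 in that section revealed that $\sum_{r=0}^{n} \binom{n}{r} = \sum_{r=0}^{n} C(n, r) = 2^n$, the total
number of subsets for a set of size $n$. With the help of the result in Example 3.12 we can
now define these binomial coefficients recursively by
$\binom{n+1}{r} = \binom{n}{r} + \binom{n}{r-1}$, $0 \leq r \leq n + 1$,
$\binom{0}{0} = 1$, $\binom{n}{r} = 0$, $r > n$, $\binom{n}{r} = 0$, $r < 0$.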
At this time we present a second set of numbers, each of which is also dependent on two
integers. For $m, k \in \mathbf{N}$, the Eulerian numbers $a_{m,k}$ are defined recursively by
$a_{m,k} = (m - k)a_{m-1,k-1} + (k + 1)a_{m-1,k}$, $0 \leq k \leq m - 1$,    (*)
$a_{0,0} = 1$, $a_{m,k} = 0$, $k \geq m$, $a_{m,k} = 0$, $k < 0$.
(In Exercise 18 of the Section Exercises we shall examine a situation that shows how this
recursive definition may arise.) The values for $a_{m,k}$, where $1 \leq m \leq 5$ and $0 \leq k \leq m - 1$,
are given as follows:
                                      Row Sum
(m = 1)   1                           1 = 1!
(m = 2)   1   1                       2 = 2!
(m = 3)   1   4   1                   6 = 3!
(m = 4)   1   11  11  1               24 = 4!
(m = 5)   1   26  66  26  1           120 = 5!
These results suggest that for a fixed $m \in \mathbf{Z}^+$, $\sum_{k=0}^{m-1} a_{m,k} = m!$, the number of permutations
of $m$ objects taken $m$ at a time. We see that the result is true for $1 \leq m \leq 5$. Assuming the
result true for some fixed $m$ ($\geq 1$), upon using the recursive definition at (*), we find that
$\sum_{k=0}^{m} a_{m+1,k} = \sum_{k=0}^{m} [(m + 1 - k)a_{m,k-1} + (k + 1)a_{m,k}]$
$= [(m + 1)a_{m,-1} + a_{m,0}] + [m a_{m,0} + 2a_{m,1}] + [(m - 1)a_{m,1} + 3a_{m,2}] + \cdots$
$+ [3a_{m,m-3} + (m - 1)a_{m,m-2}] + [2a_{m,m-2} + m a_{m,m-1}]$
$+ [a_{m,m-1} + (m + 1)a_{m,m}]$.
Since $a_{m,-1} = 0 = a_{m,m}$ we can write
$\sum_{k=0}^{m} a_{m+1,k} = [a_{m,0} + m a_{m,0}] + [2a_{m,1} + (m - 1)a_{m,1}] + \cdots$
$+ [(m - 1)a_{m,m-2} + 2a_{m,m-2}] + [m a_{m,m-1} + a_{m,m-1}]$
$= (m + 1) \sum_{k=0}^{m-1} a_{m,k} = (m + 1)m! = (m + 1)!$
Consequently, the result is true for all $m \geq 1$, by the Principle of Mathematical Induction.
(We’ll see the Eulerian numbers again in Section 9.2.)
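The recursive definition at (*) can be evaluated mechanically. The Python sketch below is an illustration under one stated assumption: the boundary condition is read as $a_{m,k} = 0$ for $k \geq m$ (with the single exception $a_{0,0} = 1$), which is what the step $a_{m,m} = 0$ in the proof requires. The function name `eulerian` is our own, and `lru_cache` is used only to avoid recomputing entries.

```python
from functools import lru_cache
from math import factorial


@lru_cache(maxsize=None)
def eulerian(m, k):
    """Eulerian number a_(m,k) computed from the recursive definition (*)."""
    if m == 0 and k == 0:
        return 1                           # a_(0,0) = 1
    if k < 0 or k >= m:
        return 0                           # boundary values (assumed reading: k >= m)
    return (m - k) * eulerian(m - 1, k - 1) + (k + 1) * eulerian(m - 1, k)


# Reproduce the row for m = 5 and check the row sums m! for small m.
assert [eulerian(5, k) for k in range(5)] == [1, 26, 66, 26, 1]
for m in range(1, 12):
    assert sum(eulerian(m, k) for k in range(m)) == factorial(m)
```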
In closing this section we shall introduce the idea of a recursively defined set X. Here we
start with an initial collection of elements that are in X, and this provides the base of the
recursion. Then we provide a rule or list of rules that tell us how to find new elements in
X from other elements already known to be in X. This rule (or list of rules) constitutes the
recursive process. But now (and this part is new) we are also given an implicit restriction —
that is, a statement to the effect that no element can be found in the set X except for those
that were given in the initial collection or those that were formed using the prescribed rule(s)
provided in the recursive process.
We demonstrate the ideas given here in the following example.
EXAMPLE 4.22
Define the set X recursively by
1) 1 ∈ X; and
2) For each a ∈ X, a + 2 ∈ X.
Then we claim that X consists (precisely) of all positive odd integers.
Proof: If we let Y denote the set of all positive odd integers (that is, Y = {2n + 1 | n ∈ N}),
then we want to show that Y = X. This means, as we learned in Section 3.1, that we must
verify both Y ⊆ X and X ⊆ Y.
In order to establish that Y ⊆ X, we must prove that every positive odd integer is in X.
This will be accomplished through the Principle of Mathematical Induction. We start by
considering the open statement
S(n): 2n + 1 ∈ X,
which is defined for the universe N. The basis step (that is, S(0)) is true here because
1 = 2(0) + 1 ∈ X by part (1) of the recursive definition of X. For the inductive step we
assume the truth of S(k) for some k ≥ 0; this tells us 2k + 1 is an element in X. With
2k + 1 ∈ X it then follows by part (2) of the recursive definition of X that (2k + 1) + 2 =
(2k + 2) + 1 = 2(k + 1) + 1 ∈ X, so S(k + 1) is also true. Consequently, S(n) is true (by
the Principle of Mathematical Induction) for all n ∈ N and we have Y ⊆ X.
For the proof of the opposite inclusion (namely, X ⊆ Y) we use the recursive definition
of X. First we consider part (1) of the definition. Since 1 (= 2 · 0 + 1) is a positive
odd integer, we have 1 ∈ Y. To complete the proof, we must verify that any integer in X
that results from part (2) of the recursive definition is also in Y. This is done by showing
that a + 2 ∈ Y whenever the element a in X is also an element in Y. For if a ∈ Y,
then a = 2r + 1, where r ∈ N (this by the definition of a positive odd integer). Thus
a + 2 = (2r + 1) + 2 = (2r + 2) + 1 = 2(r + 1) + 1, where r + 1 ∈ N (actually, Z⁺),
and so a + 2 is a positive odd integer. This places a + 2 in Y and now shows that X ⊆ Y.
From the preceding two inclusions (that is, Y ⊆ X and X ⊆ Y) it follows that X = Y.
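A recursively defined set can be explored by starting from the base elements and applying the rule until nothing new appears. The sketch below is an illustration only: the function name `generate_X` and the cutoff `limit` are assumptions introduced here, since the set X itself is infinite and a program can list only a finite part of it.

```python
def generate_X(limit: int) -> set[int]:
    """Elements of X that do not exceed `limit`, built from the two rules."""
    X = {1}                     # rule (1): 1 is in X
    frontier = [1]              # elements whose successor a + 2 is still unexplored
    while frontier:
        a = frontier.pop()
        b = a + 2               # rule (2): if a is in X, then a + 2 is in X
        if b <= limit and b not in X:
            X.add(b)
            frontier.append(b)
    return X


if __name__ == "__main__":
    odd = {n for n in range(1, 26) if n % 2 == 1}
    # The generated elements agree with the positive odd integers up to the cutoff.
    print(generate_X(25) == odd)
```

The implicit restriction in the definition corresponds to the fact that the code never adds an element except through one of the two rules.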
EXERCISES 4.2
1. The integer sequence $a_1, a_2, a_3, \ldots$, defined explicitly by the formula $a_n = 5n$ for n ∈ Z⁺, can also be defined recursively by
1) $a_1 = 5$; and
2) $a_{n+1} = a_n + 5$, for n ≥ 1.
For the integer sequence $b_1, b_2, b_3, \ldots$, where $b_n = n(n + 2)$ for all n ∈ Z⁺, we can also provide the recursive definition:
1) $b_1 = 3$; and
2) $b_{n+1} = b_n + 2n + 3$, for n ≥ 1.
Give a recursive definition for each of the following integer sequences $c_1, c_2, c_3, \ldots$, where for all n ∈ Z⁺ we have
a) $c_n = 7n$  b) $c_n = 7^n$
c) $c_n = 3n + 7$  d) $c_n = 7$
e) $c_n = n^2$  f) $c_n = 2 - (-1)^n$
2. a) Give a recursive definition for the disjunction of statements $p_1, p_2, \ldots, p_n, p_{n+1}$, n ≥ 1.
b) Show that if n, r ∈ Z⁺, with n ≥ 3 and 1 ≤ r < n, then
$(p_1 \lor p_2 \lor \cdots \lor p_r) \lor (p_{r+1} \lor \cdots \lor p_n) \Leftrightarrow p_1 \lor p_2 \lor \cdots \lor p_r \lor p_{r+1} \lor \cdots \lor p_n$.
3. Use the result of Example 4.16 to prove that if $p, q_1, q_2, \ldots, q_n$ are statements and n ≥ 2, then
$p \lor (q_1 \land q_2 \land \cdots \land q_n) \Leftrightarrow (p \lor q_1) \land (p \lor q_2) \land \cdots \land (p \lor q_n)$.
4. For n ∈ Z⁺, n ≥ 2, prove that for any statements $p_1, p_2, \ldots, p_n$,
a) $\neg(p_1 \lor p_2 \lor \cdots \lor p_n) \Leftrightarrow \neg p_1 \land \neg p_2 \land \cdots \land \neg p_n$.
b) $\neg(p_1 \land p_2 \land \cdots \land p_n) \Leftrightarrow \neg p_1 \lor \neg p_2 \lor \cdots \lor \neg p_n$.
5. a) Give a recursive definition for the intersection of the sets $A_1, A_2, \ldots, A_n, A_{n+1} \subseteq \mathcal{U}$, n ≥ 1.
b) Use the result in part (a) to show that for all n, r ∈ Z⁺ with n ≥ 3 and 1 ≤ r < n,
$(A_1 \cap A_2 \cap \cdots \cap A_r) \cap (A_{r+1} \cap \cdots \cap A_n) = A_1 \cap A_2 \cap \cdots \cap A_r \cap A_{r+1} \cap \cdots \cap A_n$.
6. For n ≥ 2 and any sets $A_1, A_2, \ldots, A_n \subseteq \mathcal{U}$, prove that
$\overline{A_1 \cup A_2 \cup \cdots \cup A_n} = \overline{A_1} \cap \overline{A_2} \cap \cdots \cap \overline{A_n}$.
7. Use the result of Example 4.17 to show that if sets $A, B_1, B_2, \ldots, B_n \subseteq \mathcal{U}$ and n ≥ 2, then
$A \cap (B_1 \cup B_2 \cup \cdots \cup B_n) = (A \cap B_1) \cup (A \cap B_2) \cup \cdots \cup (A \cap B_n)$.
8. a) Develop a recursive definition for the addition of n real numbers $x_1, x_2, \ldots, x_n$, where n ≥ 2.
b) For all real numbers $x_1$, $x_2$, and $x_3$, the associative law of addition states that $x_1 + (x_2 + x_3) = (x_1 + x_2) + x_3$. Prove that if n, r ∈ Z⁺, where n ≥ 3 and 1 ≤ r < n, then
$(x_1 + x_2 + \cdots + x_r) + (x_{r+1} + \cdots + x_n) = x_1 + x_2 + \cdots + x_r + x_{r+1} + \cdots + x_n$.
9. a) Develop a recursive definition for the multiplication of n real numbers $x_1, x_2, \ldots, x_n$, where n ≥ 2.
b) For all real numbers $x_1$, $x_2$, and $x_3$, the associative law of multiplication states that $x_1(x_2 x_3) = (x_1 x_2)x_3$. Prove that if n, r ∈ Z⁺, where n ≥ 3 and 1 ≤ r < n, then
$(x_1 x_2 \cdots x_r)(x_{r+1} \cdots x_n) = x_1 x_2 \cdots x_r x_{r+1} \cdots x_n$.
10. For all x ∈ R,
$|x| = \sqrt{x^2} = \begin{cases} x, & \text{if } x \ge 0 \\ -x, & \text{if } x < 0 \end{cases}$, and $-|x| \le x \le |x|$.
Consequently, $|x + y|^2 = (x + y)^2 = x^2 + 2xy + y^2 \le x^2 + 2|x||y| + y^2 = |x|^2 + 2|x||y| + |y|^2 = (|x| + |y|)^2$, and $|x + y|^2 \le (|x| + |y|)^2 \Rightarrow |x + y| \le |x| + |y|$, for all x, y ∈ R.
Prove that if n ∈ Z⁺, n ≥ 2, and $x_1, x_2, \ldots, x_n \in \mathbf{R}$, then
$|x_1 + x_2 + \cdots + x_n| \le |x_1| + |x_2| + \cdots + |x_n|$.
11. Define the integer sequence $a_0, a_1, a_2, a_3, \ldots$, recursively by
1) $a_0 = 1$, $a_1 = 1$, $a_2 = 1$; and
2) For n ≥ 3, $a_n = a_{n-1} + a_{n-3}$.
Prove that $a_{n+2} \ge (\sqrt{2})^n$ for all n ≥ 0.
12. For n ≥ 0 let $F_n$ denote the nth Fibonacci number. Prove that
$F_0 + F_1 + F_2 + \cdots + F_n = \sum_{i=0}^{n} F_i = F_{n+2} - 1$.
13. Prove that for any positive integer n,
$\sum_{i=1}^{n} \frac{F_{i-1}}{2^i} = 1 - \frac{F_{n+2}}{2^n}$.
14. As in Example 4.20 let $L_0, L_1, L_2, \ldots$ denote the Lucas numbers, where (1) $L_0 = 2$, $L_1 = 1$; and (2) $L_{n+2} = L_{n+1} + L_n$, for n ≥ 0. When n ≥ 1, prove that
$L_1^2 + L_2^2 + L_3^2 + \cdots + L_n^2 = L_n L_{n+1} - 2$.
15. If n ∈ N, prove that $5F_{n+2} = L_{n+4} - L_n$.
16. Give a recursive definition for the set of all
a) positive even integers
b) nonnegative even integers
17. One of the most common uses for the recursive definition of sets is to define the well-formed formulae in various mathematical systems. For example, in the study of logic we can define the well-formed formulae as follows:
1) Each primitive statement p, the tautology $T_0$, and the contradiction $F_0$ are well-formed formulae; and
2) If p, q are well-formed formulae, then so are
i) (¬p)  ii) (p ∨ q)  iii) (p ∧ q)
iv) (p → q)  v) (p ↔ q)
Using this recursive definition, we find that for the primitive statements p, q, r, the compound statement ((p ∧ (¬q)) → (r ∨ $T_0$)) is a well-formed formula. We can derive this well-formed formula as follows:
Steps                              Reasons
1) p, q, r, $T_0$                  Part (1) of the definition
2) (¬q)                            Step (1) and part (2i) of the definition
3) (p ∧ (¬q))                      Steps (1) and (2) and part (2iii) of the definition
4) (r ∨ $T_0$)                     Step (1) and part (2ii) of the definition
5) ((p ∧ (¬q)) → (r ∨ $T_0$))      Steps (3) and (4) and part (2iv) of the definition
For the primitive statements p, q, r, and s, provide derivations showing that each of the following is a well-formed formula.
a) ((p ∨ q) → ($T_0$ ∧ (¬r)))
b) (((¬p) ↔ q) → (r ∧ (s ∨ $F_0$)))
18. Consider the permutations of 1, 2, 3, 4. The permutation 1432, for instance, is said to have one ascent, namely, 14 (since 1 < 4). This same permutation also has two descents, namely, 43 (since 4 > 3) and 32 (since 3 > 2). The permutation 1423, on the other hand, has two ascents, at 14 and 23, and the one descent 42.
a) How many permutations of 1, 2, 3 have k ascents, for k = 0, 1, 2?
b) How many permutations of 1, 2, 3, 4 have k ascents, for k = 0, 1, 2, 3?
c) If a permutation of 1, 2, 3, 4, 5, 6, 7 has four ascents, how many descents does it have?
d) Suppose a permutation of 1, 2, 3, ..., m has k ascents, for 0 ≤ k ≤ m − 1. How many descents does the permutation have?
e) Consider the permutation p = 12436587. This permutation of 1, 2, 3, ..., 8 has four ascents. In how many of the nine locations (at the start, end, or between two numbers) in p can we place 9 so that the result is a permutation of 1, 2, 3, ..., 8, 9 with (i) four ascents; (ii) five ascents?
f) Let $\pi_{m,k}$ denote the number of permutations of 1, 2, 3, ..., m with k ascents. Note how $\pi_{4,2} = 11 = 2(4) + 3(1) = (4 - 2)\pi_{3,1} + (2 + 1)\pi_{3,2}$. How is $\pi_{m,k}$ related to $\pi_{m-1,k-1}$ and $\pi_{m-1,k}$?
19. a) For k ∈ Z⁺ verify that $k^2 = \binom{k}{2} + \binom{k+1}{2}$.
b) Fix n in Z⁺. Since the result in part (a) is true for all k = 1, 2, 3, ..., n, summing the n equations
$1^2 = \binom{1}{2} + \binom{2}{2}$
$2^2 = \binom{2}{2} + \binom{3}{2}$
$\vdots$
$n^2 = \binom{n}{2} + \binom{n+1}{2}$
we have $\sum_{k=1}^{n} k^2 = \sum_{k=1}^{n} \binom{k}{2} + \sum_{k=1}^{n} \binom{k+1}{2} = \binom{n+1}{3} + \binom{n+2}{3}$. [The last equality follows from Exercise 26 for Section 3.1 because $\sum_{k=1}^{n} \binom{k}{2} = \binom{1}{2} + \binom{2}{2} + \binom{3}{2} + \cdots + \binom{n}{2} = 0 + \binom{2}{2} + \binom{3}{2} + \cdots + \binom{n}{2} = \binom{n+1}{2+1} = \binom{n+1}{3}$ and $\sum_{k=1}^{n} \binom{k+1}{2} = \binom{2}{2} + \binom{3}{2} + \binom{4}{2} + \cdots + \binom{n+1}{2} = \binom{n+2}{2+1} = \binom{n+2}{3}$.] Show that
$\binom{n+1}{3} + \binom{n+2}{3} = \frac{n(n+1)(2n+1)}{6}$.
c) For k ∈ Z⁺ verify that $k^3 = \binom{k}{3} + 4\binom{k+1}{3} + \binom{k+2}{3}$.
d) Use part (c) and the results from Exercise 26 for Section 3.1 to show that
$\sum_{k=1}^{n} k^3 = \binom{n+1}{4} + 4\binom{n+2}{4} + \binom{n+3}{4} = \frac{n^2(n+1)^2}{4}$.
e) Find a, b, c, d ∈ Z⁺ so that for any k ∈ Z⁺, $k^4 = a\binom{k}{4} + b\binom{k+1}{4} + c\binom{k+2}{4} + d\binom{k+3}{4}$.
20. a) For n ≥ 2, if $p_1, p_2, p_3, \ldots, p_n, p_{n+1}$ are statements, prove that
$[(p_1 \to p_2) \land (p_2 \to p_3) \land \cdots \land (p_n \to p_{n+1})] \Rightarrow [(p_1 \land p_2 \land p_3 \land \cdots \land p_n) \to p_{n+1}]$.
b) Prove that Theorem 4.2 implies Theorem 4.1.
c) Use Theorem 4.1 to establish the following: If ∅ ≠ S ⊆ Z⁺, so that n ∈ S for some n ∈ Z⁺, then S contains a least element.
d) Show that Theorem 4.1 implies Theorem 4.2.
4.3
The Division Algorithm: Prime Numbers
Although the set Z is not closed under nonzero division, in many instances one integer
(exactly) divides another. For example, 2 divides 6 and 7 divides 21. Here the division is
exact and there is no remainder. Thus 2 dividing 6 implies the existence of a quotient—
namely, 3— such that 6 = 2 - 3. We formalize this idea as follows.
Definition 4.1 If a, b ∈ Z and b ≠ 0, we say that b divides a, and we write b|a, if there is an integer n such
that a = bn. When this occurs we say that b is a divisor of a, or a is a multiple of b.
With this definition we are able to speak of division inside of Z without going to Q.
Furthermore, when ab = 0 for a, b € Z, then either a = 0 or b = 0, and we say that Z has
no proper divisors of 0. This property enables us to cancel as in 2x = 2y ⇒ x = y, for
x, y ∈ Z, because 2x = 2y ⇒ 2(x − y) = 0 ⇒ 2 = 0 or x − y = 0 ⇒ x = y. (Note that at
no time did we mention multiplying both sides of the equation 2x = 2y by 1/2. The number
1/2 is outside the system Z.)
We now summarize some properties of this division operation. Whenever we divide by
an integer a, we assume that a # 0.
THEOREM 4.3 For all a, b, c ∈ Z
a) 1|a and a|0.  b) [(a|b) ∧ (b|a)] ⇒ a = ±b.
c) [(a|b) ∧ (b|c)] ⇒ a|c.  d) a|b ⇒ a|bx for all x ∈ Z.
e) If x = y + z, for some x, y, z ∈ Z, and a divides two of the three integers x, y, and z,
then a divides the remaining integer.
f) [(a|b) ∧ (a|c)] ⇒ a|(bx + cy), for all x, y ∈ Z. (The expression bx + cy is called a
linear combination of b, c.)
g) For 1 ≤ i ≤ n, let $c_i$ ∈ Z. If a divides each $c_i$, then $a|(c_1x_1 + c_2x_2 + \cdots + c_nx_n)$,
where $x_i$ ∈ Z for all 1 ≤ i ≤ n.
Proof: We prove part (f) and leave the remaining parts for the reader.
If a|b and a|c, then b = am and c = an, for some m, n ∈ Z. So bx + cy = (am)x +
(an)y = a(mx + ny) (by the Associative Law of Multiplication and the Distributive Law
of Multiplication over Addition, since the elements in Z satisfy both of these laws). Since
bx + cy = a(mx + ny), with mx + ny ∈ Z, it follows that a|(bx + cy).
We find part (g) of the theorem useful when we consider the following question.
EXAMPLE 4.23
Do there exist integers x, y, z (positive, negative, or zero) so that 6x + 9y + 15z = 107?
Suppose that such integers did exist. Then since 3|6, 3|9, and 3|15, it would follow from
part (g) of Theorem 4.3 that 3 is a divisor of 6x + 9y + 15z and, consequently, 3 is a divisor
of 107, but this is not so. Hence there do not exist such integers x, y, z.
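The divisibility argument can be mirrored numerically. The sketch below is illustrative only: the helper name `has_solution` and the search bound of 20 are assumptions made here, and a bounded search by itself proves nothing about all integers. It is the remainder test at the end that reflects the reasoning of the example.

```python
from itertools import product


def has_solution(target: int, bound: int = 20) -> bool:
    """Search -bound..bound for integers x, y, z with 6x + 9y + 15z == target."""
    rng = range(-bound, bound + 1)
    return any(6 * x + 9 * y + 15 * z == target
               for x, y, z in product(rng, repeat=3))


if __name__ == "__main__":
    print(has_solution(107))   # no solution appears in the searched box
    print(has_solution(108))   # 108 = 6 * 18, so a solution is found
    # Every value 6x + 9y + 15z is a multiple of 3, but 107 leaves remainder 2.
    print(107 % 3)
```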
Several parts of Theorem 4.3 help us in the following example.
EXAMPLE 4.24
Let a, b ∈ Z so that 2a + 3b is a multiple of 17. (For example, we could have a = 7 and
b = 1, and a = 4, b = 3 also works.) Prove that 17 divides 9a + 5b.
Proof: We observe that 17|(2a + 3b) ⇒ 17|(−4)(2a + 3b), by part (d) of Theorem 4.3.
Also, since 17|17, it follows from part (f) of the theorem that 17|(17a + 17b). Hence,
17|[(17a + 17b) + (−4)(2a + 3b)], by part (e) of the theorem. Consequently, as [(17a +
17b) + (−4)(2a + 3b)] = [(17 − 8)a + (17 − 12)b] = 9a + 5b, we have 17|(9a + 5b).
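The identity behind this proof, 9a + 5b = 17(a + b) − 4(2a + 3b), can be spot-checked over a range of integers. In this hedged sketch the function name `pairs_with_17_dividing` and the range bound are illustrative assumptions; the check confirms the claim only for the pairs it visits.

```python
def pairs_with_17_dividing(bound: int = 50) -> list[tuple[int, int]]:
    """All (a, b) with |a|, |b| <= bound for which 17 divides 2a + 3b."""
    found = []
    for a in range(-bound, bound + 1):
        for b in range(-bound, bound + 1):
            if (2 * a + 3 * b) % 17 == 0:
                # The rearrangement used in the proof of Example 4.24.
                assert 9 * a + 5 * b == 17 * (a + b) - 4 * (2 * a + 3 * b)
                # Hence 17 also divides 9a + 5b.
                assert (9 * a + 5 * b) % 17 == 0
                found.append((a, b))
    return found


if __name__ == "__main__":
    pairs = pairs_with_17_dividing()
    print((7, 1) in pairs, (4, 3) in pairs)
```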
Using this binary operation of integer division we find ourselves in the area of mathe-
matics called number theory, which examines the properties of integers and other sets of
numbers. Once considered an area of strictly pure (abstract) mathematics, number theory is
now an essential applicable tool, especially in dealing with computer and Internet security.
But for now, as we continue to examine the set Z⁺ further, we notice that for all n ∈ Z⁺
where n > 1, the integer n has at least two positive divisors, namely, 1 and n itself. Some
integers, such as 2, 3, 5, 7, 11, 13, and 17 have exactly two positive divisors. These inte-
gers are called primes. All other positive integers (greater than 1 and not prime) are called
composite. An immediate connection between prime and composite integers is expressed
in the following lemma.
LEMMA 4.1 If n ∈ Z⁺ and n is composite, then there is a prime p such that p|n.
Proof: If not, let S be the set of all composite integers that have no prime divisor(s). If S ≠ ∅,
then by the Well-Ordering Principle, S has a least element m. But if m is composite, then
$m = m_1 m_2$, where $m_1, m_2$ ∈ Z⁺ with $1 < m_1 < m$ and $1 < m_2 < m$. Since $m_1 \notin S$, $m_1$ is
prime or divisible by a prime, so there exists a prime p such that $p|m_1$. Since $m = m_1 m_2$,
it now follows from part (d) of Theorem 4.3 that p|m, and so S = ∅.
Now why did we call the preceding result a lemma instead of a theorem? After all, it had
to be proved like all other theorems in the book so far. The reason is that although a lemma
is itself a theorem, its major role is to help prove other theorems.
In listing the primes we are inclined to believe that there are infinitely many such num-
bers. We now verify that this is true.
THEOREM 4.4 (Euclid) There are infinitely many primes.
Proof: If not, let $p_1, p_2, \ldots, p_k$ be the finite list of all primes, and let $B = p_1 p_2 \cdots p_k + 1$.
Since $B > p_i$ for all 1 ≤ i ≤ k, B cannot be a prime. Hence B is composite. So by Lemma
4.1 there is a prime $p_j$, where 1 ≤ j ≤ k and $p_j|B$. Since $p_j|B$ and $p_j|p_1 p_2 \cdots p_k$, by
Theorem 4.3(e) it follows that $p_j|1$. This contradiction arises from the assumption that
there are only finitely many primes; the result follows.
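The construction in the proof can be tried on an actual finite list of primes. The sketch below is an illustration under stated assumptions: the list `primes` and the helper `smallest_prime_factor` are chosen here, not taken from the text. For a real (incomplete) list, B may turn out to be prime or composite; in either case it has a prime divisor that is missing from the list.

```python
from math import prod


def smallest_prime_factor(n: int) -> int:
    """Smallest divisor of n greater than 1 (necessarily prime), for n >= 2."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n                      # no divisor found, so n itself is prime


if __name__ == "__main__":
    primes = [2, 3, 5, 7, 11, 13]
    B = prod(primes) + 1          # the integer B from the proof
    p = smallest_prime_factor(B)
    # B leaves remainder 1 on division by each listed prime,
    # so its prime factor p cannot be on the list.
    print(B, p, p in primes)
```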
Yes, this is the same Euclid from the fourth century B.C. whose Elements, written on 13
parchment scrolls, included the first organized coverage of the geometry we studied in high
school. One finds, however, that these 13 books are also concerned with number theory. In
particular, Books VII, VIII, and IX dwell on this topic. The preceding theorem (with proof)
is found in Book IX.
We turn now to the major idea of this section. This result enables us to deal with nonzero
division in Z when that division is not exact.
THEOREM 4.5 The Division Algorithm. If a, b ∈ Z, with b > 0, then there exist unique q, r ∈ Z with
a = qb + r, 0 ≤ r < b.
Proof: If b|a the result follows with r = 0, so consider the case where b ∤ a (that is, b does
not divide a).
Let S = {a − tb | t ∈ Z, a − tb > 0}. If a > 0 and t = 0, then a ∈ S and S ≠ ∅. For a ≤
0, let t = a − 1. Then a − tb = a − (a − 1)b = a(1 − b) + b, with (1 − b) ≤ 0, because
b ≥ 1. So a − tb > 0 and S ≠ ∅. Hence, for any a ∈ Z, S is a nonempty subset of Z⁺. By
the Well-Ordering Principle, S has a least element r, where 0 < r = a − qb, for some
q ∈ Z. If r = b, then a = (q + 1)b and b|a, contradicting b ∤ a. If r > b, then r = b + c, for
some c ∈ Z⁺, and a − qb = r = b + c ⇒ c = a − (q + 1)b ∈ S, contradicting r being the
least element of S. Hence, r < b.
This now establishes a quotient q and remainder r, where 0 ≤ r < b, for the theorem. But
are there other q’s and r’s that also work? If so, let $q_1, q_2, r_1, r_2$ ∈ Z with $a = q_1 b + r_1$, for
$0 \le r_1 < b$, and $a = q_2 b + r_2$, for $0 \le r_2 < b$. Then $q_1 b + r_1 = q_2 b + r_2 \Rightarrow b|q_1 - q_2| =
|r_2 - r_1| < b$, because $0 \le r_1, r_2 < b$. If $q_1 \ne q_2$, we have the contradiction $b|q_1 - q_2| < b$.
Hence $q_1 = q_2$, $r_1 = r_2$, and the quotient and remainder are unique.
As we mentioned in the preceding proof, when a, b ∈ Z with b > 0, then there exists a
unique quotient q and a unique remainder r where a = qb + r, with 0 ≤ r < b. Furthermore,
under these circumstances, the integer b is called the divisor while a is termed the
dividend.
EXAMPLE 4.25
a) When a = 170 and b = 11 in the division algorithm, we find that 170 = 15 · 11 + 5,
where 0 ≤ 5 < 11. So when 170 is divided by 11, the quotient is 15 and the remainder
is 5.
b) If the dividend is 98 and the divisor is 7, then we find that 98 = 14 · 7. So in this case
the quotient is 14 and the remainder is 0, and 7 (exactly) divides 98.
c) For the case of a = −45 and b = 8 we have −45 = (−6)8 + 3, where 0 ≤ 3 < 8.
Consequently, the quotient is −6 and the remainder is 3 when the dividend is −45 and
the divisor is 8.
d) Let a, b ∈ Z⁺.
1) If a = qb for some q ∈ Z⁺, then −a = (−q)b. So, in this case, when −a (< 0) is
divided by b (> 0) the quotient is −q (< 0) and the remainder is 0.
2) If a = qb + r for some q ∈ N and 0 < r < b, then −a = (−q)b − r =
(−q)b − b + b − r = (−q − 1)b + (b − r). For this case, when −a (< 0) is
divided by b (> 0) the quotient is −q − 1 (< 0) and the remainder is b − r,
where 0 < b − r < b.
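For a positive divisor, Python's built-in `divmod` returns a floored quotient and a remainder in the range 0 ≤ r < b, which matches the convention of Theorem 4.5. The sketch below reproduces parts (a) through (c) and checks the rule in part (d); the helper name `negated_division` is an illustrative assumption.

```python
def negated_division(a: int, b: int) -> tuple[int, int]:
    """Quotient and remainder of -a divided by b, derived as in part (d), for a, b > 0."""
    q, r = divmod(a, b)
    if r == 0:
        return -q, 0              # case (1): exact division
    return -q - 1, b - r          # case (2): borrow one more copy of b


if __name__ == "__main__":
    for a, b in [(170, 11), (98, 7), (-45, 8)]:
        q, r = divmod(a, b)
        print(f"{a} = ({q})({b}) + {r}")
    # Part (d) agrees with dividing -a directly.
    print(negated_division(45, 8) == divmod(-45, 8))
```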
Despite the proof of Theorem 4.5 and the results in Example 4.25, we really do not have
any systematic way to calculate the quotient g and remainder r when we divide an integer a
(the dividend) by the positive integer b (the divisor). The proof of Theorem 4.5 guarantees
the existence of such integers g and r, but the proof is not constructive. It does not appear to
tell us how to actually calculate g and r, and it does not mention anything about the ability
to use multiplication tables or perform long division. To remedy this situation we provide
the procedure (written in pseudocode) in Fig. 4.10. Our next example illustrates the idea
presented in part of this procedure.
procedure IntegerDivision (a, b: integers)
begin
  if a = 0 then
    begin
      quotient := 0
      remainder := 0
    end
  else
    begin
      r := abs(a)  {the absolute value of a}
      q := 0
      while r ≥ b do  {subtract b while at least one more b remains}
        begin
          r := r - b
          q := q + 1
        end
      if a > 0 then
        begin
          quotient := q
          remainder := r
        end
      else if r = 0 then  {a < 0 and b divides a exactly}
        begin
          quotient := -q
          remainder := 0
        end
      else  {a < 0 and b does not divide a}
        begin
          quotient := -q - 1
          remainder := b - r
        end
    end
end
Figure 4.10
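The procedure of Fig. 4.10 can be transcribed into a runnable form. The Python version below is a sketch that follows the pseudocode branch by branch; the function name `integer_division` and the `ValueError` for a nonpositive divisor are additions made here, since the figure simply assumes b > 0.

```python
def integer_division(a: int, b: int) -> tuple[int, int]:
    """Return (quotient, remainder) with a = quotient * b + remainder, 0 <= remainder < b."""
    if b <= 0:
        raise ValueError("the divisor b must be positive")
    if a == 0:
        return 0, 0
    r = abs(a)                    # work with the absolute value of a
    q = 0
    while r >= b:                 # repeated subtraction of b
        r -= b
        q += 1
    if a > 0:
        return q, r
    if r == 0:                    # a < 0 and b divides a exactly
        return -q, 0
    return -q - 1, b - r          # a < 0 and b does not divide a


if __name__ == "__main__":
    for a, b in [(37, 8), (-45, 8), (98, 7), (0, 5)]:
        print(a, b, integer_division(a, b))
```

The loop runs |a| / b times, so this is a teaching device rather than an efficient method; for positive b its results agree with Python's built-in `divmod`.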
EXAMPLE 4.26
Just as the multiplication of positive integers may be viewed as repeated addition, so too
can we view (integer) division as repeated subtraction. We see that subtraction does play a
role in the definition of the set S in the proof of Theorem 4.5.
When calculating 4 · 7, for example, we can think in terms of repeated addition and write
2 · 7 = 7 + 7 = 14
3 · 7 = (2 + 1) · 7 = 2 · 7 + 1 · 7 = (7 + 7) + 7 = 14 + 7 = 21
4 · 7 = (3 + 1) · 7 = 3 · 7 + 1 · 7 = ((7 + 7) + 7) + 7 = 21 + 7 = 28.
If, on the other hand, we wish to divide 37 by 8, then we should think of the quotient q as the
number of 8’s contained in 37. When each one of these 8’s is removed (that is, subtracted)
and no other 8 can be removed without giving us a negative result, then the integer that is
left (remaining) is the remainder r. So we can calculate q and r by thinking in terms of
repeated subtraction as follows:
37 − 8 = 29 ≥ 8,
29 − 8 = (37 − 8) − 8 = 37 − 2 · 8 = 21 ≥ 8,
21 − 8 = ((37 − 8) − 8) − 8 = 37 − 3 · 8 = 13 ≥ 8,
13 − 8 = (((37 − 8) − 8) − 8) − 8 = 37 − 4 · 8 = 5 < 8.
The last line shows that four 8’s can be subtracted from 37 before we obtain a nonnegative
result (namely, 5) that is smaller than 8. Therefore, in this example we have q = 4 and
r = 5.
Using the division algorithm, we consider some results on representing integers in bases
other than 10.
EXAMPLE 4.27
Write 6137 in the octal system (base 8). Here we seek nonnegative integers $r_0, r_1, r_2, \ldots, r_k$,
with $0 < r_k < 8$, such that $6137 = (r_k \cdots r_2 r_1 r_0)_8$.
With $6137 = r_0 + r_1 \cdot 8 + r_2 \cdot 8^2 + \cdots + r_k \cdot 8^k = r_0 + 8(r_1 + r_2 \cdot 8 + \cdots + r_k \cdot 8^{k-1})$,
$r_0$ is the remainder obtained in the division algorithm when 6137 is divided by 8.
Consequently, since 6137 = 1 + 8 · 767, we have $r_0 = 1$ and $767 = r_1 + r_2 \cdot 8 + \cdots +
r_k \cdot 8^{k-1} = r_1 + 8(r_2 + r_3 \cdot 8 + \cdots + r_k \cdot 8^{k-2})$. This yields $r_1 = 7$ (the remainder when
767 is divided by 8) and $95 = r_2 + r_3 \cdot 8 + \cdots + r_k \cdot 8^{k-2}$. Continuing in this manner, we
find $r_2 = 7$, $r_3 = 3$, $r_4 = 1$, and $r_i = 0$ for all i ≥ 5, so
$6137 = 1 \cdot 8^4 + 3 \cdot 8^3 + 7 \cdot 8^2 + 7 \cdot 8 + 1 = (13771)_8$.
We can arrange the successive divisions by 8 as follows:
              Remainders
8 | 6137
8 |  767      1 ($r_0$)
8 |   95      7 ($r_1$)
8 |   11      7 ($r_2$)
8 |    1      3 ($r_3$)
       0      1 ($r_4$)
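The successive divisions in the table amount to a short loop: divide, record the remainder, and continue with the quotient. The following Python sketch is illustrative; the function name `to_base` and the restriction to bases 2 through 16 are choices made here.

```python
def to_base(n: int, base: int) -> str:
    """Digits of the nonnegative integer n in the given base (2 to 16)."""
    symbols = "0123456789ABCDEF"
    if not 2 <= base <= 16:
        raise ValueError("base must be between 2 and 16")
    if n < 0:
        raise ValueError("n must be nonnegative")
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, r = divmod(n, base)    # r is the next digit r_0, r_1, r_2, ...
        remainders.append(symbols[r])
    # The remainders appear least significant first, so reverse them.
    return "".join(reversed(remainders))


if __name__ == "__main__":
    print(to_base(6137, 8))       # the octal representation from Example 4.27
```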
EXAMPLE 4.28
In the field of computer science, the binary number system (base 2) is very important.
Here the only symbols that one may use are the bits 0 and 1. In Table 4.3 we have listed the
binary representations of the (base-10) integers from 0 to 15. Here we have included leading
zeros and find that we need four bits because of the leading 1 in the representations for the
integers from 8 to 15. With five bits we can continue up to 31 ($= 32 - 1 = 2^5 - 1$); six bits
are necessary to proceed to 63 ($= 64 - 1 = 2^6 - 1$). In general, if x ∈ Z and $0 \le x < 2^n$,
for n ∈ Z⁺, then we can write x in base 2 by using n bits. Leading zeros appear when
$0 \le x \le 2^{n-1} - 1$, and for $2^{n-1} \le x \le 2^n - 1$ the first (most significant) bit is 1.
Information is generally stored in machines in units of eight bits called bytes, so for
machines with memory cells of one byte we can store in a single cell any one of the binary
Table 4.3
Base 10 Base 2 Base 10 Base 2
0 0000 8 1000
1 0001 9 1001
2 0010 10 1010
3 0011 11 1011
4 0100 12 1100
5 0101 13 1101
6 0110 14 1110
7 0111 15 1111
equivalents of the integers from 0 to $2^8 - 1 = 255$. For a machine with two-byte cells, any
one of the integers from 0 to $2^{16} - 1 = 65{,}535$ can be stored in binary form in each cell. A
machine with four-byte cells would take us up to $2^{32} - 1 = 4{,}294{,}967{,}295$.
When a human deals with long sequences of 0’s and 1’s, the job soon becomes very
tedious and the chance for error increases with the tedium. Consequently, it is common (es-
pecially in the study of machine and assembly languages) to represent such long sequences
of bits in another notation. One such notation is the hexadecimal (base-16) notation. Here
there are 16 symbols, and because we have only 10 symbols in the standard base-10 system,
we introduce the following six additional symbols:
A (Alfa)    C (Charlie)    E (Echo)
B (Bravo)   D (Delta)      F (Foxtrot)
In Table 4.4 the integers from 0 to 15 are given in terms of both the binary and the hexadec-
imal number systems.
Table 4.4
Base 10   Base 2   Base 16      Base 10   Base 2   Base 16
   0       0000       0            8       1000       8
   1       0001       1            9       1001       9
   2       0010       2           10       1010       A
   3       0011       3           11       1011       B
   4       0100       4           12       1100       C
   5       0101       5           13       1101       D
   6       0110       6           14       1110       E
   7       0111       7           15       1111       F
To convert from base 10 to base 16, we follow a procedure like the one outlined in Example
4.27. Here we are interested in the remainders upon successive divisions by 16. Therefore,
if we want to represent the (base-10) integer 13,874,945 in the hexadecimal system, we do
the following calculations:
                        Remainders
16 | 13,874,945
16 |    867,184       1         ($r_0$)
16 |     54,199       0         ($r_1$)
16 |      3,387       7         ($r_2$)
16 |        211      11 (= B)   ($r_3$)
16 |         13       3         ($r_4$)
              0      13 (= D)   ($r_5$)
Consequently, $13{,}874{,}945 = (\text{D3B701})_{16}$.
There is, however, an easier approach for converting between base 2 and base 16. For
example, if we want to convert the binary (one-byte) integer 01001101 to its base-16 coun-
terpart, we break the number into blocks of four bits:
0100   1101
  4      D
We then convert each block of four bits to its base-16 representation (as shown in Table 4.4),
and we have $(01001101)_2 = (\text{4D})_{16}$. If we start with the (two-byte) number $(\text{A13F})_{16}$ and
want to convert in the other direction, we replace each hexadecimal symbol by its (four-bit)
binary equivalent (also as shown in Table 4.4):
  A      1      3      F
1010   0001   0011   1111
This results in $(\text{A13F})_{16} = (1010000100111111)_2$.
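The block-of-four shortcut works because 16 = 2⁴, so each hexadecimal symbol corresponds to exactly four bits. The sketch below shows both directions; the function names `bin_to_hex` and `hex_to_bin` are illustrative, and the bit string is assumed to have a length that is a multiple of four (pad with leading zeros otherwise).

```python
HEX_SYMBOLS = "0123456789ABCDEF"


def bin_to_hex(bits: str) -> str:
    """Convert a bit string whose length is a multiple of 4 into hexadecimal symbols."""
    if len(bits) % 4 != 0:
        raise ValueError("pad the bit string with leading zeros to a multiple of 4")
    blocks = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join(HEX_SYMBOLS[int(block, 2)] for block in blocks)


def hex_to_bin(symbols: str) -> str:
    """Replace each hexadecimal symbol by its four-bit binary equivalent."""
    return "".join(format(int(s, 16), "04b") for s in symbols)


if __name__ == "__main__":
    print(bin_to_hex("01001101"))
    print(hex_to_bin("A13F"))
```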
EXAMPLE 4.29
We need negative integers in order to perform the binary operation of subtraction in terms of
addition [that is, (a − b) = a + (−b)]. When we are dealing with the binary representation
of integers, we can use a popular method that enables us to perform addition, subtraction,
multiplication, and (integer) division: the two’s complement method. The method’s popu-
larity rests on its implementation by only two electronic circuits — one to invert and the
second to add.
In Table 4.5 the integers from —8 to 7 are represented by the four-bit patterns shown.
The nonnegative integers are represented as they were in Tables 4.3 and 4.4. To obtain the
results for —8 <n < —1, first consider the binary representation of |n|, the absolute value
of n. Then do the following:
1) Replace each 0(1) in the binary representation of |n| by 1(0); this result is called the
one’s complement of (the given representation of ) |n|.
2) Add 1 (= 0001 in this case) to the result in step (1). This result is called the two's
complement of n.
For example, to obtain the two’s complement (representation) of —6, we proceed as
follows.
1) Start with the binary representation of 6.                                  0110
2) Interchange the 0’s and 1’s; this result is the one’s complement of 0110.   1001
3) Add 1 to the prior result.                                   1001 + 0001 = 1010
We can also obtain the four-bit patterns for the values −8 ≤ n ≤ −1 by using the four-bit
patterns for the integers from 0 to 7 and complementing (interchanging 0’s and 1’s) these
patterns as shown by four such pairs of patterns in Table 4.5. Note in Table 4.5 that the
four-bit patterns for the nonnegative integers start with 0, whereas 1 is the first bit for the
negative integers in the table.
Table 4.5
Two’s Complement Notation
Value Represented    Four-Bit Pattern
        7               0 1 1 1   ←
        6               0 1 1 0
        5               0 1 0 1   ←
        4               0 1 0 0
        3               0 0 1 1
        2               0 0 1 0
        1               0 0 0 1
        0               0 0 0 0
       −1               1 1 1 1
       −2               1 1 1 0
       −3               1 1 0 1
       −4               1 1 0 0
       −5               1 0 1 1
       −6               1 0 1 0   ←
       −7               1 0 0 1
       −8               1 0 0 0   ←
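The two-step rule (take the one's complement of |n|, then add 1) can be written out with bit operations. This is a minimal sketch, assuming a fixed pattern width; the function name `twos_complement` and its `bits` parameter are introduced here for illustration.

```python
def twos_complement(n: int, bits: int = 4) -> str:
    """Two's complement bit pattern of n using the given number of bits."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not low <= n <= high:
        raise ValueError(f"{n} does not fit in {bits} bits")
    mask = (1 << bits) - 1                # a pattern of `bits` ones
    if n >= 0:
        return format(n, f"0{bits}b")
    ones = ~abs(n) & mask                 # step 1: one's complement of |n|
    return format((ones + 1) & mask, f"0{bits}b")   # step 2: add 1


if __name__ == "__main__":
    for n in range(7, -9, -1):
        print(f"{n:3d}  {twos_complement(n)}")
```

Running the loop prints the sixteen rows of Table 4.5 in the same order.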
EXAMPLE 4.30
How do we perform the subtraction 33 − 15 in base 2, using the two’s complement method
with patterns of eight bits (= one byte)?
We want to determine 33 − 15 = 33 + (−15). We find that $33 = (00100001)_2$, and $15 =
(00001111)_2$. Therefore we represent −15 by
11110000 + 00000001 = 11110001.
The addition of integers represented in two’s complement notation is the same as ordinary
binary addition, except that all results must have the same size bit patterns. This means that
when two integers are added by the two’s complement method, any extra bit that results on
the left of the answer (by a final carry) must be discarded. We illustrate this in the following
calculations.
   33         00100001
 − 15       + 11110001
             100010010
Answer = $(00010010)_2$ = 18
The leftmost (ninth) bit, produced by the final carry, is discarded. The leading bit 0 of the
remaining eight-bit pattern indicates that the answer is nonnegative.
To find 15 − 33 we use $15 = (00001111)_2$ and $33 = (00100001)_2$. Then, to calculate
15 − 33 as 15 + (−33), we represent −33 by 11011110 + 00000001 = 11011111. This
gives us the results
   15         00001111
 − 33       + 11011111
              11101110
The leading bit 1 indicates that the answer is negative.
In order to get the positive form of the answer, we proceed as follows:
                                   11101110
1) Take the one’s complement.      00010001
2) Add 1 to the prior result.      00010010
Since $(00010010)_2 = 18$, the answer is −18.
One problem we have avoided in the two preceding calculations involves the size of the
integers that we can represent by eight-bit patterns. No matter what size patterns we use,
the size of the integers that can be represented is limited. When we exceed this size, an
overflow error results. For example, if we are working with eight-bit patterns and try to add
117 and 88, we obtain
  117         01110101
 + 88       + 01011000
              11001101
The leading bit 1 indicates that the answer is negative.
This result shows how we can detect an overflow error when adding two numbers. Here
an overflow error is indicated: The sum of the eight-bit patterns for two positive integers
has resulted in the eight-bit pattern for a negative integer. Similarly, when the addition of
(the eight-bit patterns of) two negative integers results in the eight-bit pattern of a positive
integer, an overflow error is detected.
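The three calculations of this example, including the overflow test, can be reproduced with a small simulation of eight-bit arithmetic. The sketch below is an illustration, not a description of any particular hardware; the names `BITS`, `encode`, `decode`, and `add` are assumptions introduced here.

```python
BITS = 8
MASK = (1 << BITS) - 1            # eight 1's: keeps only the low eight bits


def encode(n: int) -> int:
    """Eight-bit two's complement pattern of n, as a nonnegative integer."""
    return n & MASK


def decode(pattern: int) -> int:
    """Signed value represented by an eight-bit pattern."""
    return pattern - (1 << BITS) if pattern & (1 << (BITS - 1)) else pattern


def add(x: int, y: int) -> tuple[str, int, bool]:
    """Add x and y in eight bits; return (pattern, signed value, overflow flag)."""
    pattern = (encode(x) + encode(y)) & MASK      # discard any carry out of bit 8
    value = decode(pattern)
    same_sign_inputs = (x >= 0) == (y >= 0)
    sign_changed = (value >= 0) != (x >= 0)
    # Overflow: two operands of the same sign give a result of the other sign.
    return format(pattern, "08b"), value, same_sign_inputs and sign_changed


if __name__ == "__main__":
    print(add(33, -15))
    print(add(15, -33))
    print(add(117, 88))
```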
To see why the procedure in Example 4.30 works in general, let x, y ∈ Z⁺ with x > y.
Let $2^{n-1} \le x < 2^n$. Then the binary representation for x is made up of n bits (with the
leading bit 1). The binary representation for $2^n$ consists of n + 1 bits: a leading bit 1 followed
by n 0’s. The binary representation for $2^n - 1$ consists of n 1’s.
When we subtract y from $2^n - 1$, we have
$(2^n - 1) - y = \underbrace{11 \ldots 1}_{n\ 1\text{'s}} - y$, the one’s complement of y.
Then $(2^n - 1) - y + 1$ gives us the two’s complement of −y, and
$x - y = x + [(2^n - 1) - y + 1] - 2^n$,
where the final term, $-2^n$, results in the removal of the extra bit that arises on the left of the
answer.
We close this section with one final result on composite integers.
EXAMPLE 4.31
If n ∈ Z⁺ and n is composite, then there exists a prime p such that p|n and $p \le \sqrt{n}$.
Proof: Since n is composite, we can write $n = n_1 n_2$, where $1 < n_1 < n$ and $1 < n_2 < n$.
We claim that one of the integers $n_1, n_2$ must be less than or equal to $\sqrt{n}$. If not, then
$n_1 > \sqrt{n}$ and $n_2 > \sqrt{n}$ give us the contradiction $n = n_1 n_2 > (\sqrt{n})(\sqrt{n}) = n$. Without loss
of generality, we shall assume that $n_1 \le \sqrt{n}$. If $n_1$ is prime, the result follows. If $n_1$ is not
prime, then by Lemma 4.1 there exists a prime $p < n_1$ where $p|n_1$. So p|n and $p \le \sqrt{n}$.
c)a=0, b=42 d) a = 434, b=3)
EXERCISES 4.3
13. Ifn EN, prove that 3|(7” — 4”).
1. Verify the remaining parts of Theorem 4.3. 14, Write each of the following (base-10) integers in base 2,
2. Let a,b,c,d€Z*. Prove that (a) [(a]b) A (cld)] => base 4, and base 8.
ac\|bd; (b) a|b => ac|bc; and (c) ac|bc = alb. a) 137 b) 6243 c) 12,345
3. If p, g are primes, prove that p|q if and only if p = q. 15. Write each of the following (base-10) integers in base 2 and
4. Ifa, b, c€ Z* and albc, does it follow that a|b or alc? base 16,
5. For all integers a, b, and c, prove that ifa / bc, thena J b a) 22 b) 527 c) 1234 d) 6923
anda jc. 16. Convert each of the following hexadecimal numbers to base
2 and base 10.
a) A7 b) 4C2 c) 1C2B d) A2DFE
6. Let n ∈ Z⁺ where n ≥ 2. Prove that if \(a_1, a_2, \ldots, a_n, b_1, b_2, \ldots, b_n \in \mathbf{Z}^+\)
and \(a_i \mid b_i\) for all 1 ≤ i ≤ n, then \((a_1 a_2 \cdots a_n) \mid (b_1 b_2 \cdots b_n)\).
7. a) Find three positive integers a, b, c such that 31|(5a + 7b + 11c).
b) If a, b, c ∈ Z and 31|(5a + 7b + 11c), prove that 31|(21a + 17b + 9c).
8. A grocery store runs a weekly contest to promote sales. Each customer who purchases more
than $20 worth of groceries receives a game card with 12 numbers on it; if any of these numbers
sum to exactly 500, then that customer receives a $500 shopping spree (at the grocery store).
After purchasing $22.83 worth of groceries at this store, Eleanor receives her game card
on which are printed the following 12 numbers: 144, 336, 30, 66, 138, 162, 318, 54, 84, 288,
126, and 456. Has Eleanor won a $500 shopping spree?
9. Let a, b ∈ Z⁺. If b|a and b|(a + 2), prove that b = 1 or b = 2.
10. If n ∈ Z⁺, and n is odd, prove that \(8 \mid (n^2 - 1)\).
11. If a, b ∈ Z⁺, and both are odd, prove that \(2 \mid (a^2 + b^2)\) but \(4 \nmid (a^2 + b^2)\).
12. Determine the quotient q and remainder r for each of the following, where a is the
dividend and b is the divisor.
a) a = 23, b = 7 b) a = −115, b = 12
17. Convert each of the following binary numbers to base 10 and base 16.
a) 11001110 b) 00110001
c) 11110000 d) 01010111
18. For what base do we find that 251 + 445 = 1026?
19. Find all n ∈ Z⁺ where n divides 5n + 18.
20. Write each of the following integers in two's complement representation. Here the
results are eight-bit patterns.
a) 15 b) −15 c) 100
d) −65 e) 127 f) −128
21. If a machine stores integers by the two's complement method, what are the largest and
smallest integers that it can store if it uses bit patterns of (a) 4 bits? (b) 8 bits?
(c) 16 bits? (d) 32 bits? (e) \(2^n\) bits, n ∈ Z⁺?
22. In each of the following problems, we are using four-bit patterns for the two's
complement representations of the integers from −8 to 7. Solve each problem (if possible),
and then convert the results to base 10 to check your answers. Watch for any overflow errors.
a)   0101      b)   1101
   + 0001         + 1110
4.4 The Greatest Common Divisor: The Euclidean Algorithm 231
c)   0111      d)   1101
   + 1000         + 1010
23. If a, x, y ∈ Z, and a ≠ 0, prove that ax = ay ⇒ x = y.
24. Write a computer program (or develop an algorithm) to convert a positive integer in
base 10 to base b, where 2 ≤ b ≤ 9.
25. The Division Algorithm can be generalized as follows: For a, b ∈ Z, b ≠ 0, there exist
unique q, r ∈ Z with a = qb + r, 0 ≤ r < |b|. Using Theorem 4.5, verify this generalized
form of the algorithm for b < 0.
26. Write a computer program (or develop an algorithm) to convert a positive integer in
base 10 to base 16.
27. For n ∈ Z⁺, write a computer program (or develop an algorithm) that lists all positive
divisors of n.
28. Define the set X ⊆ Z⁺ recursively as follows:
1) 3 ∈ X; and
2) If a, b ∈ X, then a + b ∈ X.
Prove that X = {3k | k ∈ Z⁺}, the set of all positive integers divisible by 3.
29. Let n ∈ Z⁺ with \(n = r_k \cdot 10^k + \cdots + r_2 \cdot 10^2 + r_1 \cdot 10 + r_0\)
(the base-10 representation of n). Prove that
a) 2|n if and only if \(2 \mid r_0\)
b) 4|n if and only if \(4 \mid (r_1 \cdot 10 + r_0)\)
c) 8|n if and only if \(8 \mid (r_2 \cdot 10^2 + r_1 \cdot 10 + r_0)\)
State a general theorem suggested by these results.
4.4
The Greatest Common Divisor:
The Euclidean Algorithm
Continuing with the division operation developed in Section 4.3, we turn our attention to
the divisors of a pair of integers.
Definition 4.2 For a, b ∈ Z, a positive integer c is said to be a common divisor of a and b if c|a and c|b.
EXAMPLE 4.32 The common divisors of 42 and 70 are 1, 2, 7, and 14, and 14 is the greatest of the common
divisors.
Definition 4.3 Let a, b ∈ Z, where either a ≠ 0 or b ≠ 0. Then c ∈ Z⁺ is called a greatest common divisor
of a, b if
a) c|a and c|b (that is, c is a common divisor of a, b), and
b) for any common divisor d of a and b, we have d|c.
The result in Example 4.32 satisfies these conditions. That is, 14 divides both 42 and 70,
and any common divisor of 42 and 70 — namely, 1, 2, 7, and 14— divides 14. However, this
example deals with two small integers. What would we do with two integers each having
20 digits? We consider the following questions.
1) Given a, b ∈ Z, where at least one of a, b is not 0, does a greatest common divisor
of a and b always exist? If so, how does one find such an integer?
2) How many greatest common divisors can a pair of integers have?
In dealing with these questions, we concentrate on a, b ∈ Z⁺.
THEOREM 4.6 For all a, b ∈ Z⁺, there exists a unique c ∈ Z⁺ that is the greatest common divisor of a, b.
Proof: Given a, b ∈ Z⁺, let S = {as + bt | s, t ∈ Z, as + bt > 0}. Since S ≠ ∅, by the
Well-Ordering Principle S has a least element c. We claim that c is a greatest common divisor of
a, b.
232 Chapter 4 Properties of the Integers: Mathematical Induction
Since c ∈ S, c = ax + by, for some x, y ∈ Z. Consequently, if d ∈ Z and d|a and d|b,
then by Theorem 4.3(f) d|(ax + by), so d|c.
If c ∤ a, we can use the division algorithm to write a = qc + r, with q, r ∈ Z⁺ and 0 <
r < c. Then r = a − qc = a − q(ax + by) = (1 − qx)a + (−qy)b, so r ∈ S, contradicting
the choice of c as the least element of S. Consequently, c|a, and by a similar argument, c|b.
Hence all a, b ∈ Z⁺ have a greatest common divisor. If \(c_1\), \(c_2\) both satisfy the two
conditions of Definition 4.3, then with \(c_1\) as a greatest common divisor, and \(c_2\) as a common
divisor, it follows that \(c_2 \mid c_1\). Reversing roles, we find that \(c_1 \mid c_2\), and so we conclude from
Theorem 4.3(b) that \(c_1 = c_2\) because \(c_1, c_2 \in \mathbf{Z}^+\).
We now know that for all a, b ∈ Z⁺, the greatest common divisor of a, b exists, and it
is unique. This number will be denoted by gcd(a, b). Here gcd(a, b) = gcd(b, a); and for
each a ∈ Z, if a ≠ 0, then gcd(a, 0) = |a|. Also when a, b ∈ Z⁺, we have gcd(−a, b) =
gcd(a, −b) = gcd(−a, −b) = gcd(a, b). Finally, gcd(0, 0) is not defined and is of no
interest to us.
From Theorem 4.6 we see that not only does gcd(a, b) exist but that gcd(a, b) is also
the smallest positive integer we can write as a linear combination of a and b. However,
we must realize that if a, b, c ∈ Z⁺ and c = ax + by for some x, y ∈ Z, then we do not
necessarily know that c is gcd(a, b), unless we somehow also know that c is the smallest
positive integer that can be written as such a linear combination of a and b.
Finally, integers a and b are called relatively prime when gcd(a, b) = 1, that is, when
there exist x, y ∈ Z with ax + by = 1.
EXAMPLE 4.33 Since gcd(42, 70) = 14, we can find x, y ∈ Z with 42x + 70y = 14, or 3x + 5y = 1. By
inspection, x = 2, y = −1 is a solution; 3(2) + 5(−1) = 1. But for k ∈ Z, 1 = 3(2 − 5k) +
5(−1 + 3k), so 14 = 42(2 − 5k) + 70(−1 + 3k), and the solutions for x, y are not unique.
In general, if gcd(a, b) = d, then gcd((a/d), (b/d)) = 1. (Verify this!) If \((a/d)x_0 +
(b/d)y_0 = 1\), then \(1 = (a/d)(x_0 - (b/d)k) + (b/d)(y_0 + (a/d)k)\), for each k ∈ Z. So \(d =
a(x_0 - (b/d)k) + b(y_0 + (a/d)k)\), yielding infinitely many solutions to ax + by = d.
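The family of solutions just described can be checked numerically. The following Python sketch is an illustration only; the function name `solution_family` and its parameters are hypothetical, and it assumes that one particular solution of ax + by = d, with d = gcd(a, b), is already known.

```python
def solution_family(a, b, d, x0, y0, ks):
    """Yield pairs (x, y) with a*x + b*y == d, one for each k in ks.

    Assumes d = gcd(a, b) and that (x0, y0) is one known solution.
    """
    for k in ks:
        yield x0 - (b // d) * k, y0 + (a // d) * k


# Values from Example 4.33: 42(2) + 70(-1) = 14.
pairs = list(solution_family(42, 70, 14, 2, -1, range(-2, 3)))
for x, y in pairs:
    assert 42 * x + 70 * y == 14
```

Each value of k gives a different pair (x, y), which is why the solutions are not unique.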
The preceding example and the prior observations work well enough when a, b are
fairly small. But how does one find gcd(a, b) for some arbitrary a, b ∈ Z⁺? If a|b, then
gcd(a, b) = a; and if b|a, then gcd(a, b) = b. Otherwise, we turn to the following result,
which we owe to Euclid.
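THEOREM 4.7 Euclidean Algorithm. Let a, b ∈ Z⁺. Set \(r_0 = a\) and \(r_1 = b\) and apply the division algorithm
n times as follows:
\begin{align*}
r_0 &= q_1 r_1 + r_2, & 0 &< r_2 < r_1 \\
r_1 &= q_2 r_2 + r_3, & 0 &< r_3 < r_2 \\
r_2 &= q_3 r_3 + r_4, & 0 &< r_4 < r_3 \\
&\;\;\vdots \\
r_i &= q_{i+1} r_{i+1} + r_{i+2}, & 0 &< r_{i+2} < r_{i+1} \\
&\;\;\vdots \\
r_{n-2} &= q_{n-1} r_{n-1} + r_n, & 0 &< r_n < r_{n-1} \\
r_{n-1} &= q_n r_n.
\end{align*}
Then \(r_n\), the last nonzero remainder, equals gcd(a, b).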
Proof: To verify that \(r_n = \gcd(a, b)\), we establish the two conditions of Definition 4.3.
Start with the first division process listed (where \(r_0 = a\) and \(r_1 = b\)). If \(c \mid r_0\) and \(c \mid r_1\),
then as \(r_0 = q_1 r_1 + r_2\), it follows that \(c \mid r_2\). Next \([(c \mid r_1) \wedge (c \mid r_2)] \Rightarrow c \mid r_3\), because \(r_1 =
q_2 r_2 + r_3\). Continuing down through the division processes, we get to where \(c \mid r_{n-2}\) and
\(c \mid r_{n-1}\). From the next-to-last equation, we conclude that \(c \mid r_n\), and this verifies condition
(b) of Definition 4.3.
To establish condition (a) we go in reverse order. From the last equation, \(r_n \mid r_{n-1}\), and
so \(r_n \mid r_{n-2}\), because \(r_{n-2} = q_{n-1} r_{n-1} + r_n\). Continuing up through the equations, we get to
where \(r_n \mid r_4\) and \(r_n \mid r_3\), so \(r_n \mid r_2\). Then \([(r_n \mid r_3) \wedge (r_n \mid r_2)] \Rightarrow r_n \mid r_1\) (that is, \(r_n \mid b\)), and finally
\([(r_n \mid r_2) \wedge (r_n \mid r_1)] \Rightarrow r_n \mid r_0\) (that is, \(r_n \mid a\)). Hence \(r_n = \gcd(a, b)\).
We have now used the word algorithm in describing the statements set forth in Theorems
4.5 and 4.7. This term will recur frequently throughout other chapters of this text, so it may
be a good idea to consider just what it connotes.
First and foremost, an algorithm is a list of precise instructions designed to solve a
particular type of problem — not just one special case. In general, we expect all of our
algorithms to receive input and provide the needed result(s) as output. Also, an algorithm
should provide the same result whenever we repeat the value(s) for the input. This happens
when the list of instructions is such that each intermediate result that comes about from the
execution of each instruction is unique, depending on only the (initial) input and on any
results that may have been derived at any preceding instructions. In order to accomplish
this any possible vagueness must be eliminated from the algorithm; the instructions must
be described in a simple yet unambiguous manner, a manner that can be executed by a
machine. Finally, our algorithms cannot go on indefinitely. They must terminate after the
execution of a finite number of instructions.
In Theorem 4.7 we are confronted with the determination of the greatest common divisor
of any two positive integers. Hence this algorithm receives the two positive integers a, b
as its input and generates their greatest common divisor as the output.
The use of the word algorithm in Theorem 4.5 is based on tradition. As stated, it does not
provide the precise instructions we need to determine the output we want. (We mentioned
this fact prior to Example 4.26.) To eliminate this shortcoming of Theorem 4.5, however,
we set forth the instructions in the pseudocode procedure of Fig. 4.10.
We now apply the Euclidean algorithm in the following five examples.
EXAMPLE 4.34 Find the greatest common divisor of 250 and 111, and express the result as a linear
combination of these integers.
250 = 2(111) + 28,   0 < 28 < 111
111 = 3(28) + 27,    0 < 27 < 28
28 = 1(27) + 1,      0 < 1 < 27
27 = 27(1) + 0.
So 1 is the last nonzero remainder. Therefore gcd(250, 111) = 1, and 250 and 111 are
relatively prime. Working backward from the third equation, we have 1 = 28 − 1(27) =
28 − 1[111 − 3(28)] = (−1)(111) + 4(28) = (−1)(111) + 4[250 − 2(111)] = 4(250) −
9(111) = 250(4) + 111(−9), a linear combination of 250 and 111.
This expression of 1 as a linear combination of 250 and 111 is not unique, for 1 =
250[4 − 111k] + 111[−9 + 250k], for any k ∈ Z.
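The back-substitution used in this example can be folded into the forward pass of the Euclidean algorithm. The Python sketch below shows one common way to do this (often called the extended Euclidean algorithm). It is not a procedure given in this text; the name `extended_gcd` and its variables are illustrative, and positive inputs are assumed.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and a*x + b*y == g.

    At every step, old_r == a*old_x + b*old_y and r == a*x + b*y,
    so the coefficients are carried along with the remainders.
    """
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


g, x, y = extended_gcd(250, 111)
assert (g, x, y) == (1, 4, -9)      # 250(4) + 111(-9) = 1
```

For a = 250 and b = 111 the quotients produced are 2, 3, 1, and 27, the same as in the four divisions listed above.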
We also have
gcd(−250, 111) = gcd(250, −111) = gcd(−250, −111) = gcd(250, 111) = 1.
Our next example is somewhat more general, as it concerns the greatest common divisor
for an infinite number of pairs of integers.
EXAMPLE 4.35 For any n ∈ Z⁺, prove that the integers 8n + 3 and 5n + 2 are relatively prime.
When n = 1 we find that gcd(8n + 3, 5n + 2) = gcd(11, 7) = 1.
For n ≥ 2 we have 8n + 3 > 5n + 2, and as in the previous example, we may write
8n + 3 = 1(5n + 2) + (3n + 1),   0 < 3n + 1 < 5n + 2
5n + 2 = 1(3n + 1) + (2n + 1),   0 < 2n + 1 < 3n + 1
3n + 1 = 1(2n + 1) + n,          0 < n < 2n + 1
2n + 1 = 2(n) + 1,               0 < 1 < n
n = n(1) + 0.
Consequently, the last nonzero remainder is 1, so gcd(8n + 3, 5n + 2) = 1 for all n ≥ 1.
But we could also have arrived at this conclusion if we had noticed that
(8n + 3)(−5) + (5n + 2)(8) = −15 + 16 = 1.
And since 1 is expressed as a linear combination of 8n + 3 and 5n + 2, and no smaller
positive integer can have this property, it follows that the greatest common divisor of
8n + 3 and 5n + 2 is 1, for any positive integer n.
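A numerical check is no substitute for the proof, but it is a quick way to catch an arithmetic slip. The short Python sketch below tests both the gcd and the linear combination for the first several values of n; the function name `check_example_4_35` is hypothetical, and `math.gcd` is the standard-library routine.

```python
from math import gcd


def check_example_4_35(limit):
    """Return True if gcd(8n + 3, 5n + 2) == 1 for every 1 <= n <= limit."""
    for n in range(1, limit + 1):
        a, b = 8 * n + 3, 5 * n + 2
        # The linear combination (8n + 3)(-5) + (5n + 2)(8) should equal 1.
        if gcd(a, b) != 1 or a * (-5) + b * 8 != 1:
            return False
    return True


assert check_example_4_35(1000)
```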
EXAMPLE 4.36 At this point we shall use the Euclidean algorithm to develop a procedure (in pseudocode)
that will find gcd(a, b) for all a, b ∈ Z⁺. The procedure in Fig. 4.11 employs the binary
operation mod, where for x, y ∈ Z⁺, x mod y = the remainder after x is divided by y. For
example, 7 mod 3 is 1, and 18 mod 5 is 3. (We shall deal with “the arithmetic of remainders”
in more detail in Chapter 14.)
procedure gcd(a, b: positive integers)
begin
  r := a mod b
  d := b
  while r > 0 do
    begin
      c := d
      d := r
      r := c mod d
    end
end {gcd(a, b) is d, the last nonzero remainder}
Figure 4.11
Meanwhile, if we call this procedure for a = 168 and b = 456, the procedure first
assigns r the value 168 mod 456 = 168 and d the value 456. Since r > 0 the code in the
while loop is executed (for the first time) and we obtain the following: c = 456, d = 168,
r = 456 mod 168 = 120. We then find that the code in the while loop is executed three
more times with the following results:
(2nd pass): c = 168, d = 120, r = 168 mod 120 = 48
(3rd pass): c = 120, d = 48, r = 120 mod 48 = 24
(4th pass): c = 48, d = 24, r = 48 mod 24 = 0.
Since r is now 0, the procedure tells us that gcd(a, b) = gcd(168, 456) = 24, the final
value of d (the last nonzero remainder).
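The procedure of Fig. 4.11 carries over almost line for line to a programming language. The Python sketch below is one possible translation and is not part of the figure; the name `gcd_trace` and the list that records each pass through the loop are illustrative additions.

```python
def gcd_trace(a, b):
    """Return (d, passes) for positive integers a, b.

    d is gcd(a, b); passes lists the values (c, d, r) after each
    execution of the loop body, mirroring the pseudocode variables.
    """
    r = a % b
    d = b
    passes = []
    while r > 0:
        c = d
        d = r
        r = c % d
        passes.append((c, d, r))
    return d, passes


d, passes = gcd_trace(168, 456)
assert d == 24
assert passes == [(456, 168, 120), (168, 120, 48), (120, 48, 24), (48, 24, 0)]
```

The four recorded triples agree with the four passes traced above.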
EXAMPLE 4.37 Griffin has two unmarked containers. One container holds 17 ounces and the other holds
55 ounces. Explain how Griffin can use his two containers to measure exactly one ounce.
From the Euclidean algorithm we find that
55 = 3(17) + 4,   0 < 4 < 17
17 = 4(4) + 1,    0 < 1 < 4.
Therefore 1 = 17 − 4(4) = 17 − 4[55 − 3(17)] = 13(17) − 4(55). Consequently, Griffin
must fill his smaller (17-ounce) container 13 times and empty the contents (for the first 12
times) into the larger container. (Griffin empties the larger container whenever it is full.)
Before he fills the smaller container for the thirteenth time, Griffin has 12(17) − 3(55) =
204 − 165 = 39 ounces of water in the larger (55-ounce) container. After he fills the smaller
container for the thirteenth time, he will empty 16 (= 55 − 39) ounces from this container,
filling the larger container. Exactly one ounce will be left in the smaller container.
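Griffin's pouring scheme can be simulated step by step. The Python sketch below is an illustration under the assumptions stated in the example (the larger container is emptied as soon as it is full, and any water left in the smaller one is then poured in). The function name `measure_one_ounce` and the bound on the number of fills are hypothetical choices.

```python
def measure_one_ounce(small=17, large=55):
    """Simulate the scheme of Example 4.37.

    Returns (fills, empties): how often the small container is filled and
    how often the large one is emptied before exactly one ounce remains in
    the small container, or None if this does not happen within
    small * large fills.
    """
    in_large = 0
    empties = 0
    for fills in range(1, small * large + 1):
        in_small = small                          # fill the small container
        pour = min(in_small, large - in_large)    # pour until large is full
        in_large += pour
        in_small -= pour
        if in_small == 1:                         # one ounce is left over
            return fills, empties
        if in_large == large:                     # large is full: empty it,
            in_large = in_small                   # then pour in the remainder
            empties += 1
    return None


assert measure_one_ounce() == (13, 3)
```

The result (13, 3) matches the 13 fills of the smaller container and the 3 emptyings of the larger one that appear in 12(17) − 3(55) = 39.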
EXAMPLE 4.38 Assisting students in programming classes, Brian finds that on the average he can help a
student debug a Java program in six minutes, but it takes 10 minutes to debug a program
written in C++. If he works continuously for 104 minutes and doesn’t waste any time, how
many programs can he debug in each language?
Here we seek integers x, y ≥ 0, where 6x + 10y = 104, or 3x + 5y = 52. As
gcd(3, 5) = 1, we can write 1 = 3(2) + 5(−1), so 52 = 3(104) + 5(−52) = 3(104 − 5k)
+ 5(−52 + 3k), k ∈ Z. In order to obtain 0 ≤ x = 104 − 5k and 0 ≤ y = −52 + 3k, we
must have (52/3) ≤ k ≤ (104/5). So k = 18, 19, 20 and there are three possible solutions:
a) (k = 18): x = 14, y = 2   b) (k = 19): x = 9, y = 5
c) (k = 20): x = 4, y = 8
The equation in Example 4.38 is an example of a Diophantine equation: a linear equa-
tion requiring integer solutions. This type of equation was first investigated by the Greek
algebraist Diophantus, who lived in the third century A.D.
Having solved one such equation, we seek to discover when a Diophantine equation has
a solution. The proof is left to the reader.
THEOREM 4.8 If a, b, c ∈ Z⁺, the Diophantine equation ax + by = c has an integer solution \(x = x_0\), \(y =
y_0\) if and only if gcd(a, b) divides c.
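The criterion of Theorem 4.8 is easy to test by machine, and for small coefficients the nonnegative solutions can simply be listed. The following Python sketch is a brute-force illustration, not the parametric method of Example 4.38; the function names `has_integer_solution` and `nonnegative_solutions` are hypothetical.

```python
from math import gcd


def has_integer_solution(a, b, c):
    """Theorem 4.8: ax + by = c is solvable in integers iff gcd(a, b) | c."""
    return c % gcd(a, b) == 0


def nonnegative_solutions(a, b, c):
    """List all (x, y) with x, y >= 0 and a*x + b*y == c, for a, b, c > 0."""
    if not has_integer_solution(a, b, c):
        return []
    return [(x, (c - a * x) // b)
            for x in range(c // a + 1)
            if (c - a * x) % b == 0]


# Example 4.38: 6x + 10y = 104.
assert nonnegative_solutions(6, 10, 104) == [(4, 8), (9, 5), (14, 2)]
```

The three pairs found agree with the solutions for k = 20, 19, and 18 in Example 4.38.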
We close this section with a concept that is related to the greatest common divisor.
Definition 4.4 For a, b, c ∈ Z⁺, c is called a common multiple of a, b if c is a multiple of both a and
b. Furthermore, c is the least common multiple of a, b if it is the smallest of all positive
integers that are common multiples of a, b. We denote c by lcm(a, b).
If a, b ∈ Z⁺, then the product ab is a common multiple of both a and b. Consequently,
the set of all (positive) common multiples of a, b is nonempty. So it follows from the
Well-Ordering Principle that lcm(a, b) does exist.
EXAMPLE 4.39 a) Since 12 = 3 · 4 and no other smaller positive integer is a multiple of both 3 and 4, we
have lcm(3, 4) = 12 = lcm(4, 3). However, lcm(6, 15) ≠ 90, for although 90 is a
multiple of both 6 and 15, there is a smaller multiple, namely, 30. And since no other
common multiple of 6 and 15 is smaller than 30, it follows that lcm(6, 15) = 30.
b) For all n ∈ Z⁺, we find that lcm(1, n) = lcm(n, 1) = n.
c) When a, n ∈ Z⁺, we have lcm(a, na) = na. [This statement is a generalization of part
(b). The earlier statement follows from this one when a = 1.]
d) If a, m, n ∈ Z⁺ with m ≤ n, then \(\mathrm{lcm}(a^m, a^n) = a^n\). [And \(\gcd(a^m, a^n) = a^m\).]
THEOREM 4.9 Let a, b, c ∈ Z⁺, with c = lcm(a, b). If d is a common multiple of a and b, then c|d.
Proof: If not, then because of the division algorithm we can write d = qc + r, where
0 < r < c. Since c = lcm(a, b), it follows that c = ma for some m ∈ Z⁺. Also, d = na for
some n ∈ Z⁺, because d is a multiple of a. Consequently, na = qma + r ⇒ (n − qm)a =
r > 0, and r is a multiple of a. In a similar way r is seen to be a multiple of b, so r is a
common multiple of a, b. But with 0 < r < c, we contradict the claim that c is the least
common multiple of a, b. Hence c|d.
Our last result for this section ties together the concepts of the greatest common divisor
and the least common multiple. Furthermore, it provides us with a way to calculate lcm(a, b)
for all a, b ∈ Z⁺. The proof of this result is left to the reader.
THEOREM 4.10 For all a, b ∈ Z⁺, ab = lcm(a, b) · gcd(a, b).
EXAMPLE 4.40 By virtue of Theorem 4.10 we have the following:
a) For all a, b ∈ Z⁺, if a, b are relatively prime, then lcm(a, b) = ab.
b) The computations in Example 4.36 establish the fact that gcd(168, 456) = 24. As a
result we find that
lcm(168, 456) = (168)(456)/24 = 3,192.
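Theorem 4.10 gives an immediate way to compute a least common multiple once the greatest common divisor is known. A minimal Python sketch follows; the function name `lcm` is chosen here for illustration, and `math.gcd` is the standard-library routine.

```python
from math import gcd


def lcm(a, b):
    """Least common multiple of positive integers, via ab = lcm(a, b) * gcd(a, b)."""
    return a * b // gcd(a, b)


assert lcm(168, 456) == 3192    # Example 4.40(b)
assert lcm(6, 15) == 30         # Example 4.39(a)
```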
EXERCISES 4.4
1. For each of the following pairs a, b ∈ Z⁺, determine gcd(a, b) and express it as a linear
combination of a, b.
a) 231, 1820 b) 1369, 2597 c) 2689, 4001
2. For a, b ∈ Z⁺ and s, t ∈ Z, what can we say about gcd(a, b) if
a) as + bt = 2? b) as + bt = 3?
c) as + bt = 4? d) as + bt = 6?
4.5 The Fundamental Theorem of Arithmetic 237
3. For a, b ∈ Z⁺ and d = gcd(a, b), prove that gcd(a/d, b/d) = 1.
4. For a, b, n ∈ Z⁺, prove that gcd(na, nb) = n gcd(a, b).
5. Let a, b, c ∈ Z⁺ with c = gcd(a, b). Prove that \(c^2\) divides ab.
6. Let n ∈ Z⁺.
a) Prove that gcd(n, n + 2) = 1 or 2.
b) What possible values can gcd(n, n + 3) have? What about gcd(n, n + 4)?
c) If k ∈ Z⁺, what can we say about gcd(n, n + k)?
7. For a, b, c, d ∈ Z⁺, prove that if d = a + bc, then gcd(b, d) = gcd(a, b).
8. Let a, b, c ∈ Z⁺ with gcd(a, b) = 1. If a|c and b|c, prove that ab|c. Does the result
hold if gcd(a, b) ≠ 1?
9. Let a, b ∈ Z, where at least one of a, b is nonzero.
a) Using quantifiers, restate the definition for c = gcd(a, b), where c ∈ Z⁺.
b) Use the result in part (a) in order to decide when c ≠ gcd(a, b) for some c ∈ Z⁺.
10. If a, b are relatively prime and a > b, prove that gcd(a − b, a + b) = 1 or 2.
11. Let a, b, c ∈ Z⁺ with gcd(a, b) = 1. If a|bc, prove that a|c.
12. Let a, b ∈ Z⁺ where a > b. Prove that gcd(a, b) = gcd(a − b, b).
13. Prove that for any n ∈ Z⁺, gcd(5n + 3, 7n + 4) = 1.
14. An executive buys $2490 worth of presents for the children of her employees. For each
girl she gets an art kit costing $33; each boy receives a set of tools costing $29. How many
presents of each type did she buy?
15. After a weekend at the Mohegan Sun Casino, Gary finds that he has won $1020 in $20 and
$50 chips. If he has more $50 chips than $20 chips, how many chips of each denomination
could he possibly have?
16. Let a, b ∈ Z⁺. Prove that there exist c, d ∈ Z⁺ such that cd = a and gcd(c, d) = b if
and only if \(b^2 \mid a\).
17. Determine those values of c ∈ Z⁺, 10 < c < 20, for which the Diophantine equation
84x + 990y = c has no solution. Determine the solutions for the remaining values of c.
18. Verify Theorems 4.8 and 4.10.
19. If a, b ∈ Z⁺ with a = 630, gcd(a, b) = 105, and lcm(a, b) = 242,550, what is b?
20. For each pair a, b in Exercise 1, find lcm(a, b).
21. For each n ∈ Z⁺, what are gcd(n, n + 1) and lcm(n, n + 1)?
22. Prove that lcm(na, nb) = n lcm(a, b) for all n, a, b ∈ Z⁺.
4.5
The Fundamental Theorem of Arithmetic
In this section we extend Lemma 4.1 and show that for each n ∈ Z⁺, n > 1, either n is
prime or n can be written as a product of primes, where the representation is unique up to
order. This result, known as the Fundamental Theorem of Arithmetic, can be found in an
equivalent form in Book IX of Euclid’s Elements.
The following two lemmas will help us accomplish our goal.
LEMMA 4.2 If a, b ∈ Z⁺ and p is prime, then p|ab ⇒ p|a or p|b.
Proof: If p|a, then we are finished. If not, then because p is prime, it follows that gcd(p, a) =
1, and so there exist integers x, y with px + ay = 1. Then b = p(bx) + (ab)y, where p|p
and p|ab. So it follows from parts (d) and (e) of Theorem 4.3 that p|b.
LEMMA 4.3 Let \(a_i \in \mathbf{Z}^+\) for all 1 ≤ i ≤ n. If p is prime and \(p \mid a_1 a_2 \cdots a_n\), then \(p \mid a_i\) for some 1 ≤ i ≤ n.
Proof: We leave the proof of this result to the reader.
Using Lemma 4.2 we now have another opportunity to establish a result by the method
of proof by contradiction.
EXAMPLE 4.41 We want to show that \(\sqrt{2}\) is irrational.
If not, we can write \(\sqrt{2} = a/b\), where a, b ∈ Z⁺ and gcd(a, b) = 1. Then \(\sqrt{2} = a/b \Rightarrow
2 = a^2/b^2 \Rightarrow 2b^2 = a^2 \Rightarrow 2 \mid a^2 \Rightarrow 2 \mid a\). (Why?) Also, 2|a ⇒ a = 2c for some c ∈ Z⁺, so
\(2b^2 = a^2 = (2c)^2 = 4c^2\) and \(b^2 = 2c^2\). But then \(2 \mid b^2 \Rightarrow 2 \mid b\). Since 2 divides both a and
b, it follows that gcd(a, b) ≥ 2, but this contradicts the earlier claim that gcd(a, b) =
1. [Note: The preceding proof for the irrationality of \(\sqrt{2}\) was known to Aristotle (384–
322 B.C.) and is similar to that given in Book X of Euclid’s Elements.]
Before we turn to the main result for this section, let us point out that the integer 2 in
the preceding example is not that special. The reader will be asked to show in the Section
Exercises that in fact \(\sqrt{p}\) is irrational for every prime p. Now that we have mentioned this
fact, it is time to present the Fundamental Theorem of Arithmetic.
THEOREM 4.11 Every integer n > 1 can be written as a product of primes uniquely, up to the order of the
primes. (Here a single prime is considered a product of one factor.)
Proof: The proof consists of two parts: The first part covers the existence of a prime
factorization, and the second part deals with its uniqueness.
If the first part is not true, let m > 1 be the smallest integer not expressible as a product
of primes. Since m is not a prime, we are able to write \(m = m_1 m_2\), where \(1 < m_1 < m\),
\(1 < m_2 < m\). But then \(m_1\), \(m_2\) can be written as products of primes, because they are less
than m. Consequently, with \(m = m_1 m_2\) we can obtain a prime factorization of m.
In order to establish the uniqueness of a prime factorization, we shall use the alternative
form of the Principle of Mathematical Induction (Theorem 4.2). For the integer 2, we have
a unique prime factorization, and assuming uniqueness of representation for 3, 4, 5, ...,
n − 1, we suppose that \(n = p_1^{s_1} p_2^{s_2} \cdots p_k^{s_k} = q_1^{t_1} q_2^{t_2} \cdots q_r^{t_r}\), where each \(p_i\), 1 ≤ i ≤ k, and
each \(q_j\), 1 ≤ j ≤ r, is a prime. Also \(p_1 < p_2 < \cdots < p_k\), and \(q_1 < q_2 < \cdots < q_r\), and
\(s_i > 0\) for all 1 ≤ i ≤ k, \(t_j > 0\) for all 1 ≤ j ≤ r.
Since \(p_1 \mid n\), we have \(p_1 \mid q_1^{t_1} q_2^{t_2} \cdots q_r^{t_r}\). By Lemma 4.3, \(p_1 \mid q_j\) for some 1 ≤ j ≤ r. Because
\(p_1\) and \(q_j\) are primes, we have \(p_1 = q_j\). In fact j = 1, for otherwise \(q_1 \mid n \Rightarrow q_1 = p_e\) for
some 1 < e ≤ k and \(p_1 < p_e = q_1 < q_j = p_1\). With \(p_1 = q_1\), we find that \(n_1 = n/p_1 =
p_1^{s_1 - 1} p_2^{s_2} \cdots p_k^{s_k} = q_1^{t_1 - 1} q_2^{t_2} \cdots q_r^{t_r}\). Since \(n_1 < n\), by the induction hypothesis it follows
that k = r, \(p_i = q_i\) for 1 ≤ i ≤ k, \(s_1 - 1 = t_1 - 1\) (so \(s_1 = t_1\)), and \(s_i = t_i\) for
2 ≤ i ≤ k.
Hence the prime factorization of n is unique.
This result is now used in the following five examples.
EXAMPLE 4.42 For the integer 980,220 we can determine the prime factorization as follows:
\[980{,}220 = 2^1(490{,}110) = 2^2(245{,}055) = 2^2 3^1(81{,}685) = 2^2 3^1 5^1(16{,}337)
= 2^2 3^1 5^1 17^1(961) = 2^2 \cdot 3 \cdot 5 \cdot 17 \cdot 31^2\]
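The repeated divisions in this example suggest a simple (if slow) method for factoring small integers: divide out each candidate divisor as often as possible. The Python sketch below uses trial division; it is one illustrative approach among many, and the name `prime_factorization` is hypothetical.

```python
def prime_factorization(n):
    """Return {prime: exponent} for an integer n > 1, using trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:               # divide out p as often as possible
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:                           # whatever remains is itself prime
        factors[n] = factors.get(n, 0) + 1
    return factors


assert prime_factorization(980220) == {2: 2, 3: 1, 5: 1, 17: 1, 31: 2}
```

A composite candidate p never divides n at the point where it is tried, because its prime factors have already been divided out.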
EXAMPLE 4.43 Suppose that n ∈ Z⁺ and that
(*) 10 · 9 · 8 · 7 · 6 · 5 · 4 · 3 · 2 · n = 21 · 20 · 19 · 18 · 17 · 16 · 15 · 14.
Since 17 is a prime factor of the integer on the right-hand side of Eq. (*) it must also
be a factor for the left-hand side (by the uniqueness part of the Fundamental Theorem of
Arithmetic). But 17 does not divide any of the factors 10, 9, 8, ..., 3 or 2, so it follows that
17|n. (A similar argument shows us that 19|n.)
EXAMPLE 4.44 For n ∈ Z⁺, we want to count the number of positive divisors of n. For example, the number
2 has two positive divisors: 1 and itself. Likewise, 1 and 3 are the only positive divisors of
3. In the case of 4, we find the three positive divisors 1, 2, and 4.
To determine the result for each n ∈ Z⁺, n > 1, we use Theorem 4.11 and write \(n =
p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}\), where for each 1 ≤ i ≤ k, \(p_i\) is a prime and \(e_i > 0\). If m|n, then \(m =
p_1^{f_1} p_2^{f_2} \cdots p_k^{f_k}\), where \(0 \le f_i \le e_i\) for all 1 ≤ i ≤ k. So by the rule of product, the
number of positive divisors of n is
\[(e_1 + 1)(e_2 + 1) \cdots (e_k + 1).\]
For example, since \(29{,}338{,}848{,}000 = 2^8 3^5 5^3 7^3 11\), we find that 29,338,848,000 has
(8 + 1)(5 + 1)(3 + 1)(3 + 1)(1 + 1) = (9)(6)(4)(4)(2) = 1728 positive divisors.
Should we want to know how many of these 1728 divisors are multiples of \(360 = 2^3 3^2 5\),
then we must realize that we want to count the integers of the form \(2^{t_1} 3^{t_2} 5^{t_3} 7^{t_4} 11^{t_5}\) where
\[3 \le t_1 \le 8, \quad 2 \le t_2 \le 5, \quad 1 \le t_3 \le 3, \quad 0 \le t_4 \le 3, \quad \text{and} \quad 0 \le t_5 \le 1.\]
Consequently, the number of positive divisors of 29,338,848,000 that are divisible by
360 is
[(8 − 3) + 1][(5 − 2) + 1][(3 − 1) + 1][(3 − 0) + 1][(1 − 0) + 1]
= (6)(4)(3)(4)(2) = 576.
To determine how many of the 1728 positive divisors of 29,338,848,000 are perfect
squares, we need to consider all divisors of the form \(2^{s_1} 3^{s_2} 5^{s_3} 7^{s_4} 11^{s_5}\), where each of \(s_1, s_2, s_3,
s_4, s_5\) is an even nonnegative integer. Consequently, here we have
5 choices for \(s_1\), namely, 0, 2, 4, 6, 8;
3 choices for \(s_2\), namely, 0, 2, 4;
2 choices for each of \(s_3\), \(s_4\), namely, 0, 2; and
1 choice for \(s_5\), namely, 0.
It then follows that the number of positive divisors of 29,338,848,000 that are perfect
squares is (5)(3)(2)(2)(1) = 60.
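The three counts in this example depend only on the exponents in the prime factorization, so they are easy to reproduce by machine. In the Python sketch below, a factorization is stored as a dictionary from primes to exponents; this representation and the function names are illustrative assumptions, and `count_multiples_of` further assumes that the given base really divides n.

```python
from math import prod

# 29,338,848,000 = 2^8 * 3^5 * 5^3 * 7^3 * 11
exponents = {2: 8, 3: 5, 5: 3, 7: 3, 11: 1}


def count_divisors(exps):
    """Number of positive divisors: the product of (e_i + 1)."""
    return prod(e + 1 for e in exps.values())


def count_multiples_of(exps, base):
    """Number of divisors that are multiples of prod(p ** base[p])."""
    return prod(e - base.get(p, 0) + 1 for p, e in exps.items())


def count_square_divisors(exps):
    """Number of divisors that are perfect squares (all exponents even)."""
    return prod(e // 2 + 1 for e in exps.values())


assert count_divisors(exponents) == 1728
assert count_multiples_of(exponents, {2: 3, 3: 2, 5: 1}) == 576   # multiples of 360
assert count_square_divisors(exponents) == 60
```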
For our next example we shall need the multiplicative counterpart of the Sigma-notation
(for addition) that we first observed in Section 1.3. Here we use the capital Greek letter Π
for the Pi-notation.
We can use the Pi-notation to express the product \(x_1 x_2 x_3 x_4 x_5 x_6\), for example, as \(\prod_{i=1}^{6} x_i\).
In general, one can express the product of the n − m + 1 terms \(x_m, x_{m+1}, x_{m+2}, \ldots, x_n\),
where m, n ∈ Z and m ≤ n, as \(\prod_{i=m}^{n} x_i\). As with the Sigma-notation the letter i is called
the index of the product, and here this index accounts for all n − m + 1 integers starting
with the lower limit m and continuing on up to (and including) the upper limit n.
This notation is demonstrated in the following:
1) \(\prod_{i=3}^{7} x_i = x_3 x_4 x_5 x_6 x_7 = \prod_{j=3}^{7} x_j\), since there is nothing special about the letter i;
2) \(\prod_{i=3}^{6} i = 3 \cdot 4 \cdot 5 \cdot 6 = 6!/2!\);
3) \(\prod_{i=m}^{n} i = m(m + 1)(m + 2) \cdots (n - 1)(n) = n!/(m - 1)!\), for all m, n ∈ Z⁺ with
m ≤ n; and
4) \(\prod_{i=7}^{11} x_i = x_7 x_8 x_9 x_{10} x_{11} = \prod_{j=0}^{4} x_{7+j} = \prod_{j=0}^{4} x_{11-j}\).
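Readers who program may recognize the Pi-notation as a loop that multiplies. The Python sketch below checks the factorial identities in items (2) and (3) and the change of index in item (4), using the standard-library function `math.prod`; the helper name `product_of_integers` and the sample values assigned to x₇, ..., x₁₁ are arbitrary illustrations.

```python
from math import prod, factorial


def product_of_integers(m, n):
    """Return m(m + 1)...(n) for positive integers m <= n."""
    return prod(range(m, n + 1))


# Items (2) and (3): the product equals n!/(m - 1)!.
assert product_of_integers(3, 6) == factorial(6) // factorial(2) == 360
assert product_of_integers(4, 9) == factorial(9) // factorial(3)

# Item (4): re-indexing does not change the product.
x = {i: i * i + 1 for i in range(7, 12)}          # arbitrary sample values
direct = prod(x[i] for i in range(7, 12))
shifted_up = prod(x[7 + j] for j in range(0, 5))
shifted_down = prod(x[11 - j] for j in range(0, 5))
assert direct == shifted_up == shifted_down
```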
EXAMPLE 4.45 If m, n ∈ Z⁺, let \(m = p_1^{e_1} p_2^{e_2} \cdots p_t^{e_t}\) and \(n = p_1^{f_1} p_2^{f_2} \cdots p_t^{f_t}\), with each \(p_i\) prime and \(0 \le e_i\)
and \(0 \le f_i\) for all 1 ≤ i ≤ t. Then if \(a_i = \min\{e_i, f_i\}\), the minimum (or smaller) of \(e_i\) and
\(f_i\), and \(b_i = \max\{e_i, f_i\}\), the maximum (or larger) of \(e_i\) and \(f_i\), for all 1 ≤ i ≤ t, we have
\[\gcd(m, n) = p_1^{a_1} p_2^{a_2} \cdots p_t^{a_t} = \prod_{i=1}^{t} p_i^{a_i} \quad \text{and} \quad \mathrm{lcm}(m, n) = p_1^{b_1} p_2^{b_2} \cdots p_t^{b_t} = \prod_{i=1}^{t} p_i^{b_i}.\]
For example, let \(m = 491{,}891{,}400 = 2^3 3^3 5^2 7^2 11^1 13^2\) and let \(n = 1{,}138{,}845{,}708 =
2^2 3^2 7^1 11^2 13^3 17^1\). Then with \(p_1 = 2\), \(p_2 = 3\), \(p_3 = 5\), \(p_4 = 7\), \(p_5 = 11\), \(p_6 = 13\), and
\(p_7 = 17\), we find \(a_1 = 2\), \(a_2 = 2\), \(a_3 = 0\) (the exponent of 5 in the prime factorization of n
must be 0, because 5 does not appear in the prime factorization), \(a_4 = 1\), \(a_5 = 1\), \(a_6 = 2\),
and \(a_7 = 0\). So
\[\gcd(m, n) = 2^2 3^2 5^0 7^1 11^1 13^2 17^0 = 468{,}468.\]
We also have
\[\mathrm{lcm}(m, n) = 2^3 3^3 5^2 7^2 11^2 13^3 17^1 = 1{,}195{,}787{,}993{,}400.\]
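The minimum and maximum rule of this example translates directly into code. In the Python sketch below, each integer is represented by a dictionary from primes to exponents, with a missing prime treated as exponent 0; the representation and the function names are illustrative assumptions, and the final assertions compare the results with the standard-library `math.gcd`.

```python
from math import gcd, prod

m_exps = {2: 3, 3: 3, 5: 2, 7: 2, 11: 1, 13: 2}      # m = 491,891,400
n_exps = {2: 2, 3: 2, 7: 1, 11: 2, 13: 3, 17: 1}     # n = 1,138,845,708


def from_exponents(exps):
    """Rebuild the integer from its prime factorization."""
    return prod(p ** e for p, e in exps.items())


def gcd_lcm_from_exponents(e1, e2):
    """Return (gcd, lcm) using min and max of corresponding exponents."""
    primes = set(e1) | set(e2)
    g = prod(p ** min(e1.get(p, 0), e2.get(p, 0)) for p in primes)
    l = prod(p ** max(e1.get(p, 0), e2.get(p, 0)) for p in primes)
    return g, l


m, n = from_exponents(m_exps), from_exponents(n_exps)
g, l = gcd_lcm_from_exponents(m_exps, n_exps)
assert (m, n) == (491891400, 1138845708)
assert (g, l) == (468468, 1195787993400)
assert g == gcd(m, n) and g * l == m * n             # agrees with Theorem 4.10
```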
Our final result for this section ties together the Fundamental Theorem of Arithmetic
with the fact that any two consecutive integers are relatively prime (as observed in Exercise
21 for Section 4.4).
EXAMPLE 4.46 Here we seek an answer to the following question. Can we find three consecutive
positive integers whose product is a perfect square, that is, do there exist m, n ∈ Z⁺ with
\(m(m + 1)(m + 2) = n^2\)?
Suppose that such positive integers m, n do exist. We recall that gcd(m, m + 1) = 1 =
gcd(m + 1, m + 2), so for any prime p, if p|(m + 1), then p ∤ m and p ∤ (m + 2).
Furthermore, if p|(m + 1), it follows that \(p \mid n^2\). And since \(n^2\) is a perfect square, by the Fundamental
Theorem of Arithmetic, we find that the exponents on p in the prime factorizations of both
m + 1 and \(n^2\) must be the same even integer. This is true for each prime divisor of m + 1,
so m + 1 is a perfect square. With \(n^2\) and m + 1 both being perfect squares, we conclude
that the product m(m + 2) is also a perfect square. However, the product m(m + 2) is such
that \(m^2 < m^2 + 2m = m(m + 2) < m^2 + 2m + 1 = (m + 1)^2\). Consequently, we find that
m(m + 2) is wedged between two consecutive perfect squares, and is not equal to either
of them. So m(m + 2) cannot be a perfect square, and there are no three consecutive positive
integers whose product is a perfect square.
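The argument above settles the question for every m; a computer search can only confirm it over a finite range, but it is a useful sanity check. The Python sketch below uses `math.isqrt` from the standard library; the helper name `is_perfect_square` and the search bound 10,000 are arbitrary choices made for illustration.

```python
from math import isqrt


def is_perfect_square(k):
    """Return True if the nonnegative integer k is a perfect square."""
    root = isqrt(k)
    return root * root == k


# No product of three consecutive positive integers is a square (m <= 10,000).
assert not any(is_perfect_square(m * (m + 1) * (m + 2)) for m in range(1, 10001))

# The key step: m(m + 2) lies strictly between m^2 and (m + 1)^2.
assert all(m * m < m * (m + 2) < (m + 1) ** 2 for m in range(1, 10001))
```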
EXERCISES 4.5
1. Write each of the following integers as a product of primes
\(p_1^{n_1} p_2^{n_2} \cdots p_k^{n_k}\), where \(0 < n_i\) for all 1 ≤ i ≤ k and \(p_1 < p_2 < \cdots < p_k\).
a) 148,500 b) 7,114,800 c) 7,882,875
2. Determine the greatest common divisor and the least common multiple for each pair of
integers in the preceding exercise.
3. Let t ∈ Z⁺ and \(p_1, p_2, p_3, \ldots, p_t\) be distinct primes. If m ∈ Z⁺ has prime
factorization \(p_1^{e_1} p_2^{e_2} \cdots p_t^{e_t}\), what is the prime factorization of (a) \(m^2\)? (b) \(m^3\)?
4. Verify Lemma 4.3.
5. Prove that \(\sqrt{p}\) is irrational for any prime p.
6. The change machine at Cheryll’s laundromat contains n quarters, 2n nickels, and 4n dimes,
where n ∈ Z⁺. Find all values of n so that these coins total k dollars, where k ∈ Z.
7. Find the number of positive divisors for each integer in Exercise 1.
8. a) How many positive divisors are there for
n= 29375879119 13°37?
b) For the divisors in part (a), how many are
i) divisible by 2°3457 117377?
ii) divisible by 1,166,400,000?
iii) perfect squares?
iv) perfect squares that are divisible by 273457117?
v) perfect cubes?
vi) perfect cubes that are multiples of 2193°5?7° 117132377?
vii) perfect squares and perfect cubes?
9. Let m, n ∈ Z⁺ with mn = 24345°7!11°13'. If lcm(m, n) = 273°577!11713!, what is gcd(m, n)?
10. Extend the results in Example 4.45 and find the greatest common divisor and least common
multiple for the three integers in Exercise 1.
11. How many positive integers n divide 100137n + 248396544?
12. Let a ∈ Z⁺. Find the smallest value of a for which 2a is a perfect square and 3a is a
perfect cube.
13. a) Let a ∈ Z⁺. Prove or disprove: (i) If \(10 \mid a^2\), then 10|a; and (ii) If \(4 \mid a^2\), then 4|a.
b) Generalize the true result(s) in part (a).
14. Let a, b, c ∈ {0, 1, 2, ..., 9} with at least one of a, b, c nonzero. Prove that the
six-digit integer abcabc is divisible by at least three distinct primes.
15. Determine the smallest perfect square that is divisible by 7!
16. For all n ∈ Z⁺, prove that n is a perfect square if and only if n has an odd number of
positive divisors.
17. Find the smallest positive integer n for which the product 1260 × n is a perfect cube.
18. Two hundred coins numbered 1 to 200 are put in a row across the top of a cafeteria table.
Two hundred students are assigned numbers (from 1 to 200) and are asked to turn over
certain coins. The student assigned number 1 is supposed to turn over all the coins. The
student assigned number 2 is supposed to turn over every other coin, starting with the second
coin. In general, the student assigned the number n, for each 1 ≤ n ≤ 200, is supposed to
turn over every nth coin, starting with the nth coin.
a) How many times will the 200th coin be turned over?
b) Will any other coin(s) be turned over as many times as the 200th coin?
c) Will any coin be turned over more times than the 200th coin?
19. How many different products can one obtain by multiplying any two (distinct) integers in
the set
a) {4, 8, 16, 32}? b) {4, 8, 16, 32, 64}?
c) {4, 8, 9, 16, 27, 32, 64, 81, 243}?
d) {4, 8, 9, 16, 25, 27, 32, 64, 81, 125, 243, 625, 729, 3125}?
OP PP PP. PPG. grr 3 r’, rh,
where p, q, and r are distinct primes?
20. Write a computer program (or develop an algorithm) to find the prime factorization of an
integer n > 1.
21. In triangle ABC the length of side BC is 293. If the length of side AB is a perfect
square, the length of side AC is a power of 2, and the length of side AC is twice the length
of side AB, determine the perimeter of the triangle.
22. Express each of the following in simplest form.
a) \(\prod_{i=1}^{10} (-1)^i\)
b) \(\prod_{i=1}^{2n+1} (-1)^i\), where n ∈ Z⁺
c) 8 G+ 1)G +2) I G — DW)
d) 2n I] Spray Where eZ
23. a) Let n = 88,200. In how many ways can one factor n as ab where 1 < a < n, 1 < b < n,
and gcd(a, b) = 1. (Note: Here order is not relevant. So, for example, a = 8, b = 11,025,
and a = 11,025, b = 8 result in the same unordered factorization.)
b) Answer part (a) for n = 970,200.
c) Generalize the results in parts (a) and (b).
24. Use the Pi-notation to write each of the following.
a) \((1^2 + 1)(2^2 + 2)(3^2 + 3)(4^2 + 4)(5^2 + 5)\)
b) \((1 + x)(1 + x^2)(1 + x^3)(1 + x^4)(1 + x^5)\)
c) \((1 + x)(1 + x^3)(1 + x^5)(1 + x^7)(1 + x^9)(1 + x^{11})\)
25. Prove that if n ∈ Z⁺ and n ≥ 2, then
\[\prod_{i=2}^{n} \left(1 - \frac{1}{i^2}\right) = \frac{n + 1}{2n}.\]
26. When does a positive integer n have exactly
a) two positive divisors? b) three positive divisors?
c) four positive divisors? d) five positive divisors?
27. Let n ∈ Z⁺. We say that n is a perfect integer if 2n equals the sum of all the positive
divisors of n. For example, since 2(6) = 12 = 1 + 2 + 3 + 6, it follows that 6 is a perfect
integer.
a) Verify that 28 and 496 are perfect integers.
b) If m ∈ Z⁺ and \(2^m - 1\) is prime, prove that \(2^{m-1}(2^m - 1)\) is a perfect integer. [You may
find the result from part (a) of Exercise 2 for Section 4.1 useful here.]
4.6
Summary and Historical Review
According to the Prussian mathematician Leopold Kronecker (1823-1891), “God made the
integers, all the rest is the work of man... . All results of the profoundest mathematical
investigation must ultimately be expressible in the simple form of properties of the integers.”
In the spirit of this quotation, we find in this chapter how the handiwork of the Almighty
has been further developed by men and women over the last 24 centuries.
Starting in the fourth century B.C. we find in Euclid’s Elements not only the geometry of
our high school experience but also the fundamental ideas of number theory. Propositions
1 and 2 of Euclid’s Book VII include an example of an algorithm to determine the greatest
common divisor of two positive integers by using an efficient technique to solve, in a finite
number of steps, a specific type of problem.
The term algorithm, like its predecessor algorism, was unknown to Euclid. In fact, this
term did not enter the vocabulary of most people until the late 1950s when the computer
revolution began to make its impact on society. The word comes from the name of the
famous Islamic mathematician, astronomer, and textbook writer Abu Ja’far Mohammed
ibn Musa al-Khowarizmi (c. 780-850). The last part of his name, al-Khowarizmi, which is
translated as “a man from the town of Khowarizm,” gave rise to the term algorism. The word
algebra comes from al-jabr, which is contained in the title of al-Khowarizmi’s textbook
Kitab al-jabr w’al muquabala. Translated into Latin during the thirteenth century, this book
had a profound impact on the mathematics developed during the European Renaissance.
Euclid (c. 400 B.C.) Al-Khowarizmi (c. 780-850)
As mentioned in Section 4.4, our use of the word algorithm connotes a precise step-by-
step method for solving a problem in a finite number of steps. The first person credited with
developing the concept of a computer algorithm was Augusta Ada Byron (1815-1852),
the Countess of Lovelace. The only child of the famous poet Lord Byron and Annabella
Millbanke, Augusta Ada was raised by a mother who encouraged her intellectual talents.
Trained in mathematics by the likes of Augustus DeMorgan (1806-1871), she continued
her studies by assisting the gifted English mathematician Charles Babbage (1792-1871) in
the development of his design for an early computing machine, the “Analytical Engine.”
The most complete accounts of this machine are found in her writings, wherein one finds
a great deal of literary talent along with the essence of the modern computer algorithm.
Further details on the work of Charles Babbage and Augusta Ada Byron Lovelace can be
found in Chapter 2 of the work by S. Augarten [1].
Augusta Ada Byron, Countess of Lovelace (1815-1852)
In the century following Euclid, we find some number theory in the work of Eratosthenes.
However, it was not until five centuries later that the first major new accomplishments in the
field were made by Diophantus of Alexandria. In his work Arithmetica, his integer solutions
of linear (and higher-order) equations stood as a mathematical beacon in number theory
until the French mathematician Pierre de Fermat (1601-1665) came on the scene.
The problem we stated in Theorem 4.8 was investigated by Diophantus and further
analyzed during the seventh century by Hindu mathematicians, but it was not actually
solved completely until the 1860s, by Henry John Stephen Smith (1826-1883).
For more on some of these mathematicians and others who have worked in the theory
of numbers, consult L. Dickson [4]. Chapter 5 in I. Niven, H. S. Zuckerman, and H. L.
Montgomery [10] deals with the solutions of Diophantine equations and their applications.
In the work Formulario Matematico, published in 1889, Giuseppe Peano (1858-1932)
formulated the set of nonnegative integers on the basis of three undefined terms: zero,
number, and successor. His formulation is as follows:
a) Zero is a number.
b) For each number n, its successor is a number.
c) No number has zero as its successor.
d) If two numbers m, n have the same successor, then m = n.
e) If T is a set of numbers where 0 ∈ T, and where the successor of n is in T whenever
n is in T, then T is the set of all numbers.
In these postulates the notion of order (successor) and the technique called mathematical
induction are seen to be intimately related to the idea of number (that is, nonnegative
integer). Peano attributed the formulation to Richard Dedekind (1831-1916), who was the
first to develop these ideas; nonetheless, these postulates are generally known as “Peano’s
postulates.”
The first European to apply the Principle of Mathematical Induction in proofs was the
Venetian scientist Francisco Maurocylus (1491-1575). His book, Arithmeticorum Libri
Duo (published in 1575), contains a proof, by mathematical induction, that the sum of
the first n positive odd integers is n^2. In the next century, Pierre de Fermat made further
improvements on the technique in his work involving “the method of infinite descent.”
Blaise Pascal (c. 1653), in proving such combinatorial results as C(n, k)/C(n, k + 1) =
(k + 1)/(n − k), 0 ≤ k ≤ n − 1, used induction and referred to the technique as the work
of Maurocylus. The actual term mathematical induction was not used, however, until the
nineteenth century when it appeared in the work of Augustus DeMorgan (1806-1871). In
1838 he described the process with great care and gave it the name mathematical induction.
(An interesting survey on this topic is found in the article by W. H. Bussey [2].)
The text by B. K. Youse [13] illustrates many varied applications of the Principle of
Mathematical Induction in algebra, geometry, and trigonometry. For more on the relevance
of this method of proof to the problems of programming and the development of algorithms,
the text by M. Wand [12] (especially Chapter 2) provides ample background and examples.
More on the theory of numbers can be found in the texts by G. H. Hardy and E. M.
Wright [5], W. J. LeVeque [7, 8], and I. Niven, H. S. Zuckerman, and H. L. Montgomery
[10]. At a level comparable to that of this chapter, Chapter 3 of V. H. Larney [6] provides an
enjoyable introduction to this material. The text by K. H. Rosen [11] integrates applications
in cryptography and computer science in its development of the subject. The journal article
by M. J. Collison [3] examines the history of the Fundamental Theorem of Arithmetic. The
articles in [9] recount some interesting developments in number theory.
REFERENCES
1. Augarten, Stan. BIT by BIT, An Illustrated History of Computers. New York: Ticknor & Fields,
1984.
2. Bussey, W. H. “Origins of Mathematical Induction.” American Mathematical Monthly 24
(1917): pp. 199-207.
3. Collison, Mary Joan. “The Unique Factorization Theorem: From Euclid to Gauss.” Mathe-
matics Magazine 53 (1980): pp. 96-100.
4. Dickson, L. History of the Theory of Numbers. Washington, D.C.: Carnegie Institution of
Washington, 1919. Reprinted by Chelsea, in New York, in 1950.
5. Hardy, Godfrey Harold, and Wright, Edward Maitland. An Introduction to the Theory of Num-
bers, 5th ed. Oxford: Oxford University Press, 1979.
6. Larney, Violet Hachmeister. Abstract Algebra: A First Course. Boston: Prindle, Weber &
Schmidt, 1975.
7. LeVeque, William J. Elementary Theory of Numbers. Reading, Mass.: Addison-Wesley, 1962.
8. LeVeque, William J. Topics in Number Theory, Vols. I and II. Reading, Mass.: Addison-Wesley,
1956.
9. LeVeque, William J., ed. Studies in Number Theory. MAA Studies in Mathematics, Vol. 6.
Englewood Cliffs, N.J.: Prentice-Hall, 1969. Published by the Mathematical Association of
America.
10. Niven, Ivan, Zuckerman, Herbert S., and Montgomery, Hugh L. An Introduction to the Theory
of Numbers, 5th ed. New York: Wiley, 1991.
11. Rosen, Kenneth H. Elementary Number Theory, 4th ed. Reading, Mass.: Addison-Wesley,
2000.
12. Wand, Mitchell. Induction, Recursion, and Programming. New York: Elsevier North Holland,
1980.
13. Youse, Bevan K. Mathematical Induction. Englewood Cliffs, N.J.: Prentice-Hall, 1964.
SUPPLEMENTARY EXERCISES
1. Let a, d be fixed integers. Determine a summation formula for a + (a + d) + (a + 2d) + ··· + (a + (n − 1)d), for n ∈ Z⁺. Verify your result by mathematical induction.
2. In the following pseudocode program segment the variables n and sum are integer variables. Following the execution of this program segment, which value of n is printed?
n := 3
sum := 0
while sum < 10,000 do
begin
    n := n + 7
    sum := sum + n
end
print n
3. Consider the following five equations.
1) 1 = 1
2) 1 − 4 = −(1 + 2)
3) 1 − 4 + 9 = 1 + 2 + 3
4) 1 − 4 + 9 − 16 = −(1 + 2 + 3 + 4)
5) 1 − 4 + 9 − 16 + 25 = 1 + 2 + 3 + 4 + 5
Conjecture the general formula suggested by these five equations, and prove your conjecture.
4. For n ∈ Z⁺, prove each of the following by mathematical induction:
a) 5|(n^5 − n) b) 6|(n^3 + 5n)
5. For all n ∈ Z⁺, let S(n) be the open statement: n^2 + n + 41 is prime.
a) Verify that S(n) is true for all 1 ≤ n ≤ 9.
b) Does the truth of S(k) imply that of S(k + 1) for all k ∈ Z⁺?
6. For n ∈ Z⁺ define the sum s_n by the formula
\[ s_n = \frac{1}{2!} + \frac{2}{3!} + \frac{3}{4!} + \cdots + \frac{n-1}{n!} + \frac{n}{(n+1)!}. \]
a) Verify that s_1 = 1/2, s_2 = 5/6, and s_3 = 23/24.
b) Compute s_4, s_5, and s_6.
c) On the basis of your results in parts (a) and (b), conjecture a formula for the sum of the terms in s_n.
d) Verify your conjecture in part (c) for all n ∈ Z⁺ by the Principle of Mathematical Induction.
7. For all n ∈ Z, n ≥ 0, prove that
a) 2^{2n+1} + 1 is divisible by 3.
b) n^3 + (n + 1)^3 + (n + 2)^3 is divisible by 9.
8. Let n ∈ Z⁺ where n is odd and n is not divisible by 5. Prove that there is a power of n whose units digit is 1.
9. Find the digits x, y, z where (xyz)_9 = (zyx)_6.
10. If n ∈ Z⁺, how many possible values are there for gcd(n, n + 3000)?
11. If n ∈ Z⁺ and n ≥ 2, prove that 2^n < C(2n, n) < 4^n.
12. If n ∈ Z⁺, prove that 57 divides 7^{n+2} + 8^{2n+1}.
13. For all n ∈ Z⁺, show that if n ≥ 64, then n can be written as a sum of 5’s and/or 17’s.
14. Determine all a, b ∈ Z such that a/7 + b/12 = 1/84.
15. Given r ∈ Z⁺, write r = r_0 + r_1 · 10 + r_2 · 10^2 + ··· + r_n · 10^n, where 0 ≤ r_i ≤ 9 for 0 ≤ i ≤ n − 1, and 0 < r_n ≤ 9.
a) Prove that 9|r if and only if 9|(r_n + r_{n−1} + ··· + r_2 + r_1 + r_0).
b) Prove that 3|r if and only if 3|(r_n + r_{n−1} + ··· + r_2 + r_1 + r_0).
c) If t = 137486x225, where x is a single digit, determine the value(s) of x such that 3|t. Which values of x make t divisible by 9?
16. Frances spends $6.20 on candy for prizes in a contest. If a 10-ounce box of this candy costs $.50 and a 3-ounce box costs $.20, how many boxes of each size did she purchase?
17. a) How many positive integers can we express as a product of nine primes (repetitions allowed and order not relevant) where the primes may be chosen from {2, 3, 5, 7, 11}?
b) How many of the positive integers in part (a) have at least one occurrence of each of the five primes?
18. Find the product of all (positive) divisors of (a) 1000; (b) 5000; (c) 7000; (d) 9000; (e) p^m q^n, where p, q are distinct primes and m, n ∈ Z⁺; and (f) p^m q^n r^k, where p, q, r are distinct primes and m, n, k ∈ Z⁺.
19. a) Ten students enter a locker room that contains 10 lockers. The first student opens all the lockers. The second student changes the status (from closed to open, or vice versa) of every other locker, starting with the second locker. The third student then changes the status of every third locker, starting at the third locker. In general, for 1 < k ≤ 10, the kth student changes the status of every kth locker, starting with the kth locker. After the tenth student has gone through the lockers, which lockers are left open?
b) Answer part (a) if 10 is replaced by n ∈ Z⁺, n ≥ 2.
20. Let A = {a_1, a_2, a_3, a_4, a_5} ⊆ Z⁺. Prove that A contains a nonempty subset S where the sum of the elements in S is a multiple of 5. (Here it is possible to have a sum consisting of only one summand.)
21. Consider the set {1, 2, 3}. Here we may write {1, 2, 3} = {1, 2} ∪ {3}, where 1 + 2 = 3. For the set {1, 2, 3, 4} we find that {1, 2, 3, 4} = {1, 4} ∪ {2, 3}, where 1 + 4 = 2 + 3.
However, things change when we examine the set {1, 2, 3, 4, 5}. In this case, if C ⊆ {1, 2, 3, 4, 5} and we let s_C denote the sum of the elements in C, then we find that there is no way to write {1, 2, 3, 4, 5} = A ∪ B, with A ∩ B = ∅ and s_A = s_B.
a) For which n ∈ Z⁺, n ≥ 3, can we write {1, 2, 3, ..., n} = A ∪ B, with A ∩ B = ∅ and s_A = s_B? (As above, s_A and s_B denote the sums of the elements in A and B, respectively.)
b) Let n ∈ Z⁺ with n ≥ 3. If we can write {1, 2, 3, ..., n} = A ∪ B with A ∩ B = ∅ and s_A = s_B, describe how such sets A and B can be determined.
22. Determine those integers n for which … and … are also integers.
23. Let a, b ∈ Z⁺.
a) Prove that if a^2|b^2 then a|b.
b) Is it true that if a^2|b^3 then a|b?
24. Let n be a fixed positive integer that satisfies the property: For all a, b ∈ Z⁺, if n|ab then n|a or n|b. Prove that n = 1 or n is prime.
25. Suppose that a, b, k ∈ Z⁺ and that k is not a power of 2.
a) Prove that if a^k + b^k ≠ 2, then a^k + b^k is composite.
b) If n ∈ Z⁺ and n is not a power of 2, prove that if 2^n + 1 is prime, then n is prime.
For the next three exercises, recall that H_n, F_n, and L_n denote the nth harmonic, Fibonacci, and Lucas numbers, respectively.
26. Prove that for all n ∈ N, H_{2^n} ≤ 1 + n.
27. Prove that F_n ≤ (5/3)^n for all n ∈ N.
28. For n ∈ N, prove that
\[ L_0 + L_1 + L_2 + \cdots + L_n = \sum_{i=0}^{n} L_i = L_{n+2} - 1. \]
29. a) For the five-digit integers (from 10000 to 99999) how many are palindromes and what is their sum?
b) Write a computer program to check the answer for the sum in part (a).
30. Let a, b be odd with a > b. Prove that gcd(a, b) = gcd((a − b)/2, b).
31. Let n ∈ Z⁺ with u the units digit of n. Prove that 7|n if and only if 7|((n − u)/10 − 2u).
32. Let m, n ∈ Z⁺ with 19m + 90 + 8n = 1998. Determine m, n so that (a) n is minimal; (b) m is minimal.
33. Catrina selects three integers from {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} and then forms the six possible three-digit integers (leading zero allowed) they determine. For instance, for the selection 1, 3, and 7, she would form the integers 137, 173, 317, 371, 713, and 731. Prove that no matter which three integers she initially selects, it is not possible for all six of the resulting three-digit integers to be prime.
34. Consider the three-row and four-column table shown in Fig. 4.12. Show that it is possible to place eight of the nine integers 2, 3, 4, 7, 10, 11, 12, 13, 15 in the remaining eight cells of the table so that the average of the integers in each row is the same integer and the average of the integers in each column is the same integer. Specify which of the nine integers given cannot be used and show how the other eight integers are placed in the table.
Figure 4.12
35. Allen writes the consecutive integers 1, 2, 3, ..., n on a blackboard. Then Barbara erases one of these integers. If the average of the remaining integers is 35 7/17, what is n and what integer was erased?
36. Leslie selects a random integer between 1 and 100 (inclusive). Find the probability her selection is divisible by (a) 2 or 3; (b) 2, 3, or 5.
37. Let m = … and n = …, where p_1, p_2, p_3, p_4, p_5 are distinct primes and each exponent is in Z⁺. How many common divisors are there for m, n?
Relations and
Functions
In this chapter we extend the set theory of Chapter 3 to include the concepts of relation
and function. Algebra, trigonometry, and calculus all involve functions. Here, however,
we shall study functions from a set-theoretic approach that includes finite functions, and
we shall introduce some new counting ideas in the study. Furthermore, we shall examine
the concept of function complexity and its role in the study of the analysis of algorithms.
We take a path along which we shall find the answers to the following (closely related)
six problems:
1) The Defense Department has seven different contracts that deal with a high-security
project. Four companies can manufacture the distinct parts called for in each contract,
and in order to maximize the security of the overall project, it is best to have all four
companies working on some part. In how many ways can the contracts be awarded
so that every company is involved?
2) How many seven-symbol quaternary (0, 1, 2, 3) sequences have at least one occur-
rence of each of the symbols 0, 1, 2, and 3?
3) An m × n zero-one matrix is a matrix A with m rows and n columns, such that in
row i, for all 1 ≤ i ≤ m, and column j, for all 1 ≤ j ≤ n, the entry a_ij that appears is
either 0 or 1. How many 7 × 4 zero-one matrices have exactly one 1 in each row and
at least one 1 in each column? (The zero-one matrix is a data structure that arises in
computer science. We shall learn more about it in later chapters.)
4) Seven (unrelated) people enter the lobby of a building which has four additional
floors, and they all get on an elevator. What is the probability that the elevator must
stop at every floor in order to let passengers off?
5) For positive integers m, n with m < n, prove that
\[ \sum_{k=0}^{n} (-1)^k \binom{n}{n-k} (n-k)^m = 0. \]
6) For every positive integer n, verify that
\[ n! = \sum_{k=0}^{n} (-1)^k \binom{n}{n-k} (n-k)^n. \]
Do you recognize the connection among the first four problems? The first three are the
same problem in different settings. However, it is not obvious that the last two problems
are related or that there is a connection between them and the first four. These identities,
however, will be established using the same counting technique that we develop to solve
the first four problems.
248 Chapter 5 Relations and Functions
5.1
Cartesian Products and Relations
We start with an idea that was introduced earlier in Definition 3.11. However, we repeat the
definition now in order to make the presentation here independent of this prior encounter.
Definition 5.1 For sets A, B the Cartesian product, or cross product, of A and B is denoted by A × B and
equals {(a, b) | a ∈ A, b ∈ B}.
We say that the elements of A × B are ordered pairs. For (a, b), (c, d) ∈ A × B, we
have (a, b) = (c, d) if and only if a = c and b = d.
If A, B are finite, it follows from the rule of product that |A × B| = |A| · |B|. Although
we generally will not have A × B = B × A, we will have |A × B| = |B × A|.
Here A ⊆ U_1 and B ⊆ U_2, and we may find that the universes are different (that is,
U_1 ≠ U_2). Also, even if A, B ⊆ U, it is not necessary that A × B ⊆ U, so unlike the cases
for union and intersection, here P(U) is not necessarily closed under this binary operation.
We can extend the definition of the Cartesian product, or cross product, to more than two
sets. Let n ∈ Z⁺, n ≥ 3. For sets A_1, A_2, ..., A_n, the (n-fold) product of A_1, A_2, ..., A_n
is denoted by A_1 × A_2 × ··· × A_n and equals {(a_1, a_2, ..., a_n) | a_i ∈ A_i, 1 ≤ i ≤ n}.† The
elements of A_1 × A_2 × ··· × A_n are called ordered n-tuples, although we generally use the
term triple in place of 3-tuple. As with ordered pairs, if (a_1, a_2, ..., a_n), (b_1, b_2, ..., b_n) ∈
A_1 × A_2 × ··· × A_n, then (a_1, a_2, ..., a_n) = (b_1, b_2, ..., b_n) if and only if a_i = b_i for
all 1 ≤ i ≤ n.
EXAMPLE 5.1
Let A = {2, 3, 4}, B = {4, 5}. Then
a) A × B = {(2, 4), (2, 5), (3, 4), (3, 5), (4, 4), (4, 5)}.
b) B × A = {(4, 2), (4, 3), (4, 4), (5, 2), (5, 3), (5, 4)}.
c) B^2 = B × B = {(4, 4), (4, 5), (5, 4), (5, 5)}.
d) B^3 = B × B × B = {(a, b, c) | a, b, c ∈ B}; for instance, (4, 5, 5) ∈ B^3.
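The products in Example 5.1 can be checked mechanically. The following is a minimal sketch in Python, assuming we model each set as a Python set and each ordered pair or triple as a tuple; the variable names AxB, BxA, and B3 are ours and do not come from the text.

```python
from itertools import product

A = {2, 3, 4}
B = {4, 5}

# Cartesian products as sets of tuples (ordered pairs and ordered triples).
AxB = set(product(A, B))
BxA = set(product(B, A))
B3 = set(product(B, repeat=3))

print(sorted(AxB))
print(AxB == BxA)                                # the two products differ as sets
print(len(AxB) == len(BxA) == len(A) * len(B))   # but they have the same size
print((4, 5, 5) in B3, len(B3))
```

Because tuples are ordered, (2, 4) and (4, 2) are different elements, which is why A × B and B × A differ even though their sizes agree.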
EXAMPLE 5.2
The set R × R = {(x, y) | x, y ∈ R} is recognized as the real plane of coordinate geometry
and two-dimensional calculus. The subset R⁺ × R⁺ is the interior of the first quadrant
of this plane. Likewise R^3 represents Euclidean three-space, where the three-dimensional
interior of any sphere (of positive radius), two-dimensional planes, and one-dimensional
lines are subsets of importance.
† When dealing with the Cartesian product of three or more sets, we must be careful about the lack
of associativity. In the case of three sets, for example, there is a difference between any two of the sets
A_1 × A_2 × A_3, (A_1 × A_2) × A_3, and A_1 × (A_2 × A_3) because their respective elements are ordered triples
(a_1, a_2, a_3), and the distinct ordered pairs ((a_1, a_2), a_3) and (a_1, (a_2, a_3)). Although such differences are
important in certain instances, we shall not concentrate on them here and shall always use the nonparenthesized form
A_1 × A_2 × A_3. This will also be our convention when dealing with the Cartesian product of four or more sets.
EXAMPLE 5.3
Once again let A = {2, 3, 4} and B = {4, 5}, as in Example 5.1, and let C = {x, y}. The
construction of the Cartesian product A × B can be represented pictorially with the aid of
a tree diagram, as in part (a) of Fig. 5.1. This diagram proceeds from left to right. From
the left-most endpoint, three branches originate, one for each of the elements of A. Then
from each point, labeled 2, 3, 4, two branches emanate, one for each of the elements 4,
5 of B. The six ordered pairs at the right endpoints constitute the elements (ordered pairs)
of A × B. Part (b) of the figure provides a tree diagram to demonstrate the construction of
B × A. Finally, the tree diagram in Fig. 5.1(c) shows us how to envision the construction
of A × B × C, and demonstrates that |A × B × C| = 12 = 3 × 2 × 2 = |A||B||C|.
Figure 5.1: tree diagrams for (a) A × B, (b) B × A, and (c) A × B × C
In addition to their tie-in with Cartesian products, tree diagrams also arise in other
situations.
EXAMPLE 5.4
At the Wimbledon Tennis Championships, women play at most three sets in a match. The
winner is the first to win two sets. If we let N and E denote the two players, the tree diagram in
Fig. 5.2 indicates the six ways in which this match can be won. For example, the starred line
segment (edge) indicates that player E won the first set. The double-starred edge indicates
that player N has won the match by winning the first and third sets.
Figure 5.2: tree diagram with levels First set, Second set, and Third set (when needed)
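The branches of the tree in Example 5.4 can also be generated by a short recursive search. The sketch below is only an illustration, assuming we record a match as the string of set winners; the function name match_outcomes and its parameters are hypothetical and do not appear in the text.

```python
def match_outcomes(players=("N", "E"), sets_to_win=2):
    """Return every sequence of set winners that decides a match."""
    finished = []

    def extend(history):
        # A branch of the tree ends as soon as one player has enough sets.
        for p in players:
            if history.count(p) == sets_to_win:
                finished.append("".join(history))
                return
        # Otherwise the tree branches once for each possible winner of the next set.
        for p in players:
            extend(history + [p])

    extend([])
    return finished

outcomes = match_outcomes()
print(len(outcomes), outcomes)
```

Each string corresponds to one path from the left-most point of the tree to a right endpoint; for instance, "NEN" is the path in which N wins the first and third sets.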
Tree diagrams are examples of a general structure called a tree. Trees and graphs are
important structures that arise in computer science and optimization theory. These will be
investigated in later chapters.
For the cross product of two sets, we find the subsets of this structure of great interest.
Definition 5.2 For sets A, B, any subset of A × B is called a (binary) relation from A to B. Any subset
of A × A is called a (binary) relation on A.
Since we will primarily deal with binary relations, for us the word “relation” will mean
binary relation, unless something otherwise is specified.
EXAMPLE 5.5
With A, B as in Example 5.1, the following are some of the relations from A to B.
a) ∅ b) {(2, 4)}
c) {(2, 4), (2, 5)} d) {(2, 4), (3, 4), (4, 4)}
e) {(2, 4), (3, 4), (4, 5)} f) A × B
Since |A × B| = 6, it follows from Definition 5.2 that there are 2^6 possible relations
from A to B (for there are 2^6 possible subsets of A × B).
For finite sets A, B with |A| = m and |B| = n, there are 2^{mn} relations from A to B,
including the empty relation as well as the relation A × B itself.
There are also 2^{nm} (= 2^{mn}) relations from B to A, one of which is also ∅ and another
of which is B × A. The reason we get the same number of relations from B to A as we
have from A to B is that any relation ℛ_1 from B to A can be obtained from a unique
relation ℛ_2 from A to B by simply reversing the components of each ordered pair in
ℛ_2 (and vice versa).
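For small sets the count 2^{mn} can be confirmed by brute force. The following sketch assumes the sets A, B of Example 5.1 and represents a relation as a frozenset of ordered pairs; the helper name relations is ours.

```python
from itertools import chain, combinations, product

def relations(A, B):
    """Return an iterator over every relation from A to B (every subset of A x B)."""
    pairs = list(product(A, B))
    return chain.from_iterable(
        combinations(pairs, size) for size in range(len(pairs) + 1)
    )

A = [2, 3, 4]
B = [4, 5]
all_relations = [frozenset(r) for r in relations(A, B)]
print(len(all_relations), 2 ** (len(A) * len(B)))

# Reversing each ordered pair turns a relation from A to B into one from B to A,
# and distinct relations stay distinct, so the two counts agree.
reversed_relations = {frozenset((b, a) for (a, b) in r) for r in all_relations}
print(len(reversed_relations))
```

Enumeration of this kind is practical only for very small sets, since the number of relations doubles with every additional ordered pair in A × B.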
EXAMPLE 5.6
For B = {1, 2}, let A = P(B) = {∅, {1}, {2}, {1, 2}}. The following is an example of a
relation on A: ℛ = {(∅, ∅), (∅, {1}), (∅, {2}), (∅, {1, 2}), ({1}, {1}), ({1}, {1, 2}),
({2}, {2}), ({2}, {1, 2}), ({1, 2}, {1, 2})}.
We can say that the relation ℛ is the subset relation
where (C, D) ∈ ℛ if and only if C, D ⊆ B and C ⊆ D.
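The nine ordered pairs of Example 5.6 can be generated rather than listed by hand. This is a small sketch, assuming frozenset is used so that subsets of B can themselves be elements of a set; the helper power_set is hypothetical.

```python
from itertools import combinations

def power_set(S):
    """All subsets of S, each as a frozenset."""
    items = sorted(S)
    return [frozenset(c)
            for size in range(len(items) + 1)
            for c in combinations(items, size)]

B = {1, 2}
A = power_set(B)

# (C, D) is in the relation exactly when C is a subset of D.
subset_relation = {(C, D) for C in A for D in A if C <= D}
print(len(A), len(subset_relation))
```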
EXAMPLE 5.7
With A = Z⁺, we may define a relation ℛ on set A as {(x, y) | x ≤ y}. This is the familiar
“is less than or equal to” relation for the set of positive integers. It can be represented
graphically as the set of points, with positive integer components, located on or above the
line y = x in the Euclidean plane, as partially shown in Fig. 5.3. Here we cannot list the
entire relation as we did in Example 5.6, but we note, for example, that (7, 7), (7, 11) ∈ ℛ,
but (8, 2) ∉ ℛ. The fact that (7, 11) ∈ ℛ can also be denoted by 7 ℛ 11; (8, 2) ∉ ℛ becomes
8 ℛ̸ 2. Here 7 ℛ 11 and 8 ℛ̸ 2 are examples of the infix notation for a relation.
Figure 5.3 (axes x and y, each marked 1 to 4)
Our last example helps us to review the idea of a recursively defined set.
EXAMPLE 5.8
Let ℛ be the subset of N × N where ℛ = {(m, n) | n = 7m}. Consequently, among the
ordered pairs in ℛ one finds (0, 0), (1, 7), (11, 77), and (15, 105). This relation ℛ on N
can also be given recursively by
1) (0, 0) ∈ ℛ; and
2) If (s, t) ∈ ℛ, then (s + 1, t + 7) ∈ ℛ.
We use the recursive definition to show that the ordered pair (3, 21) (from N × N) is in
ℛ. Our derivation is as follows: From part (1) of the recursive definition we start with
(0, 0) ∈ ℛ. Then part (2) of the definition gives us
i) (0, 0) ∈ ℛ ⇒ (0 + 1, 0 + 7) = (1, 7) ∈ ℛ;
ii) (1, 7) ∈ ℛ ⇒ (1 + 1, 7 + 7) = (2, 14) ∈ ℛ; and
iii) (2, 14) ∈ ℛ ⇒ (2 + 1, 14 + 7) = (3, 21) ∈ ℛ.
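The derivation in Example 5.8 follows a fixed pattern: start from the base pair and apply rule (2) until the desired first component is reached. A minimal sketch of that process, with a function name (derive) of our own choosing, is the following.

```python
def derive(target_s):
    """Apply rule (2) repeatedly, starting from the base pair (0, 0)."""
    s, t = 0, 0                  # rule (1): (0, 0) is in the relation
    steps = [(s, t)]
    while s < target_s:
        s, t = s + 1, t + 7      # rule (2): (s, t) yields (s + 1, t + 7)
        steps.append((s, t))
    return steps

steps = derive(3)
print(steps)
```

Every pair produced this way satisfies n = 7m, which is the closed form given at the start of the example.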
We close this section with these final observations.
1) For any set A, A × ∅ = ∅. (If A × ∅ ≠ ∅, let (a, b) ∈ A × ∅. Then a ∈ A and b ∈ ∅.
Impossible!) Likewise, ∅ × A = ∅.
2) The Cartesian product and the binary operations of union and intersection are inter-
related in the following theorem.
THEOREM 5.1 For any sets A, B, C ⊆ U:
a) A × (B ∩ C) = (A × B) ∩ (A × C)
b) A × (B ∪ C) = (A × B) ∪ (A × C)
c) (A ∩ B) × C = (A × C) ∩ (B × C)
d) (A ∪ B) × C = (A × C) ∪ (B × C)
Proof: We prove part (a) and leave the other parts for the reader. We use the same concept of
set equality (as in Definition 3.2 of Section 3.1) even though the elements here are ordered
pairs. For all a, b ∈ U, (a, b) ∈ A × (B ∩ C) ⇔ a ∈ A and b ∈ B ∩ C ⇔ a ∈ A and
b ∈ B, C ⇔ a ∈ A, b ∈ B and a ∈ A, b ∈ C ⇔ (a, b) ∈ A × B and (a, b) ∈ A × C ⇔
(a, b) ∈ (A × B) ∩ (A × C).
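A check on particular sets is no substitute for the proof, but it is a useful sanity test of all four parts of Theorem 5.1. The sketch below uses three small sets chosen arbitrarily for illustration; the helper cross is ours.

```python
from itertools import product

def cross(X, Y):
    """The Cartesian product X x Y as a set of ordered pairs."""
    return set(product(X, Y))

A, B, C = {1, 2, 3}, {2, 3, 4}, {3, 4, 5}

checks = {
    "a": cross(A, B & C) == cross(A, B) & cross(A, C),
    "b": cross(A, B | C) == cross(A, B) | cross(A, C),
    "c": cross(A & B, C) == cross(A, C) & cross(B, C),
    "d": cross(A | B, C) == cross(A, C) | cross(B, C),
}
print(checks)
```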
EXERCISES 5.1
1. If A = {1, 2, 3, 4}, B = {2, 5}, and C = {3, 4, 7}, determine A × B; B × A; A ∪ (B × C); (A ∪ B) × C; (A × C) ∪ (B × C).
2. If A = {1, 2, 3}, and B = {2, 4, 5}, give examples of (a) three nonempty relations from A to B; (b) three nonempty relations on A.
3. For A, B as in Exercise 2, determine the following: (a) |A × B|; (b) the number of relations from A to B; (c) the number of relations on A; (d) the number of relations from A to B that contain (1, 2) and (1, 5); (e) the number of relations from A to B that contain exactly five ordered pairs; and (f) the number of relations on A that contain at least seven elements.
4. For which sets A, B is it true that A × B = B × A?
5. Let A, B, C, D be nonempty sets.
a) Prove that A × B ⊆ C × D if and only if A ⊆ C and B ⊆ D.
b) What happens to the result in part (a) if any of the sets A, B, C, D is empty?
6. The men’s final at Wimbledon is won by the first player to win three sets of the five-set match. Let C and M denote the players. Draw a tree diagram to show all the ways in which the match can be decided.
7. a) If A = {1, 2, 3, 4, 5} and B = {w, x, y, z}, how many elements are there in P(A × B)?
b) Generalize the result in part (a).
8. Logic chips are taken from a container, tested individually, and labeled defective or good. The testing process is continued until either two defective chips are found or five chips are tested in total. Using a tree diagram, exhibit a sample space for this process.
9. Complete the proof of Theorem 5.1.
10. A rumor is spread as follows. The originator calls two people. Each of these people phones three friends, each of whom in turn calls five associates. If no one receives more than one call, and no one calls the originator, how many people now know the rumor? How many phone calls were made?
11. For A, B, C ⊆ U, prove that A × (B − C) = (A × B) − (A × C).
12. Let A, B be sets with |B| = 3. If there are 4096 relations from A to B, what is |A|?
13. Let ℛ ⊆ N × N where (m, n) ∈ ℛ if (and only if) n = 5m + 2. (a) Give a recursive definition for ℛ. (b) Use the recursive definition from part (a) to show that (4, 22) ∈ ℛ.
14. a) Give a recursive definition for the relation ℛ ⊆ Z⁺ × Z⁺ where (m, n) ∈ ℛ if (and only if) m ≥ n.
b) From the definition in part (a) verify that (5, 2) and (4, 4) are in ℛ.
5.2
Functions: Plain and One-to-One
In this section we concentrate on a special kind of relation called a function. One finds
functions in many different settings throughout mathematics and computer science. As for
general relations, they will reappear in Chapter 7, where we shall examine them much more
thoroughly.
Definition 5.3 For nonempty sets A, B, a function, or mapping, f from A to B, denoted f: A → B, is a
relation from A to B in which every element of A appears exactly once as the first
component of an ordered pair in the relation.
We often write f(a) = b when (a, b) is an ordered pair in the function f. For (a, b) € f,
b is called the image of a under f, whereas a is a preimage of b. In addition, the definition
suggests that f is a method for associating with each a € A the unique element f(a) =
b € B. Consequently, (a, b), (a, c) € f implies b = c.
EXAMPLE 5.9
For A = {1, 2, 3} and B = {w, x, y, z}, f = {(1, w), (2, x), (3, x)} is a function, and
consequently a relation, from A to B. ℛ_1 = {(1, w), (2, x)} and ℛ_2 = {(1, w), (2, w), (2, x),
(3, z)} are relations, but not functions, from A to B. (Why?)
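Definition 5.3 translates directly into a test that can be applied to any finite relation. The following sketch assumes a relation is given as a set of ordered pairs; the name is_function is hypothetical and the strings "w", "x", "y", "z" stand in for the elements of B.

```python
def is_function(relation, A, B):
    """True when `relation` is a function from A to B in the sense of Definition 5.3."""
    # Every ordered pair must come from A x B.
    if any(a not in A or b not in B for (a, b) in relation):
        return False
    first_components = [a for (a, _) in relation]
    # Each element of A must occur exactly once as a first component.
    return all(first_components.count(a) == 1 for a in A)

A = {1, 2, 3}
B = {"w", "x", "y", "z"}
f = {(1, "w"), (2, "x"), (3, "x")}
R1 = {(1, "w"), (2, "x")}
R2 = {(1, "w"), (2, "w"), (2, "x"), (3, "z")}

print(is_function(f, A, B), is_function(R1, A, B), is_function(R2, A, B))
```

Running the test on the two relations of Example 5.9 shows which part of the definition each one violates: one count is 0 and the other is 2.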
Definition 5.4 For the function f: A → B, A is called the domain of f and B the codomain of f. The
subset of B consisting of those elements that appear as second components in the ordered
pairs of f is called the range of f and is also denoted by f(A) because it is the set of
images (of the elements of A) under f.
In Example 5.9, the domain of f = {1, 2, 3}, the codomain of f = {w, x, y, z}, and the
range of f = f(A) = {w, x}.
A pictorial representation of these ideas appears in Fig. 5.4. This diagram suggests that a
may be regarded as an input that is transformed by f into the corresponding output, f(a).
In this context, a C++ compiler can be thought of as a function that transforms a source
program (the input) into its corresponding object program (the output).
Figure 5.4 (the sets A and B)
Many interesting functions arise in computer science.
EXAMPLE 5.10
a) A common function encountered is the greatest integer function, or floor function.
This function f: R → Z is given by
f(x) = ⌊x⌋ = the greatest integer less than or equal to x.
Consequently, f(x) = x, if x ∈ Z; and, when x ∈ R − Z, f(x) is the integer to the
immediate left of x on the real number line.
For this function we find that
1) ⌊3.8⌋ = 3, ⌊3⌋ = 3, ⌊−3.8⌋ = −4, ⌊−3⌋ = −3;
2) ⌊7.1 + 8.2⌋ = ⌊15.3⌋ = 15 = 7 + 8 = ⌊7.1⌋ + ⌊8.2⌋; and
3) ⌊7.7 + 8.4⌋ = ⌊16.1⌋ = 16 ≠ 15 = 7 + 8 = ⌊7.7⌋ + ⌊8.4⌋.
b) A second function, one related to the floor function in part (a), is the ceiling function.
This function g: R → Z is defined by
g(x) = ⌈x⌉ = the least integer greater than or equal to x.
So g(x) = x when x ∈ Z, but when x ∈ R − Z, then g(x) is the integer to the immediate
right of x on the real number line. In dealing with the ceiling function one finds that
1) ⌈3⌉ = 3, ⌈3.01⌉ = ⌈3.7⌉ = 4 = ⌈4⌉, ⌈−3⌉ = −3, ⌈−3.01⌉ = ⌈−3.7⌉ = −3;
2) ⌈3.6 + 4.5⌉ = ⌈8.1⌉ = 9 = 4 + 5 = ⌈3.6⌉ + ⌈4.5⌉; and
3) ⌈3.3 + 4.2⌉ = ⌈7.5⌉ = 8 ≠ 9 = 4 + 5 = ⌈3.3⌉ + ⌈4.2⌉.
c) The function trunc (for truncation) is another integer-valued function defined on R.
This function deletes the fractional part of a real number. For example, trunc(3.78)
= 3, trunc(5) = 5, trunc(−7.22) = −7. Note that trunc(3.78) = ⌊3.78⌋ = 3 while
trunc(−3.78) = ⌈−3.78⌉ = −3.
d) In storing a matrix in a one-dimensional array, many computer languages use the row major implementation. Here, if $A = (a_{ij})_{m \times n}$ is an $m \times n$ matrix, the first row of $A$ is stored in locations $1, 2, 3, \ldots, n$ of the array if we start with $a_{11}$ in location 1. The entry $a_{21}$ is then found in position $n + 1$, while entry $a_{34}$ occupies position $2n + 4$ in the array. In order to determine the location of an entry $a_{ij}$ from $A$, where $1 \le i \le m$, $1 \le j \le n$, one defines the access function $f$ from the entries of $A$ to the positions $1, 2, 3, \ldots, mn$ of the array. A formula for the access function here is $f(a_{ij}) = (i - 1)n + j$.
Entry:     $a_{11}$  $a_{12}$  ...  $a_{1n}$ | $a_{21}$  $a_{22}$  ...  $a_{2n}$ | $a_{31}$  ...  $a_{ij}$  ...  $a_{mn}$
Position:  1  2  ...  n | n + 1  n + 2  ...  2n | 2n + 1  ...  (i − 1)n + j  ...  (m − 1)n + n (= mn)
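The functions in Example 5.10 are easy to experiment with. The following Python sketch is an illustration only and is not part of the original text; the helper name row_major_position is our own. It checks several of the floor, ceiling, and trunc values above with the standard math module and implements the access function $f(a_{ij}) = (i - 1)n + j$ using the 1-based indices of part (d).

```python
import math

# Parts (a)-(c): floor, ceiling, and truncation from the standard library.
assert math.floor(3.8) == 3 and math.floor(-3.8) == -4
assert math.ceil(3.01) == 4 and math.ceil(-3.7) == -3
assert math.trunc(3.78) == 3 and math.trunc(-7.22) == -7
# trunc agrees with floor for positive x and with ceiling for negative x.
assert math.trunc(3.78) == math.floor(3.78)
assert math.trunc(-3.78) == math.ceil(-3.78)


def row_major_position(i, j, n):
    """Position of entry a_ij (1-based i and j) of a matrix with n columns
    stored row by row, with a_11 in position 1."""
    return (i - 1) * n + j


n = 10  # number of columns, chosen only for illustration
print(row_major_position(2, 1, n))  # position n + 1
print(row_major_position(3, 4, n))  # position 2n + 4
```

Languages that index arrays from 0 would use i*n + j with 0-based i and j instead; the formula above follows the text's convention of starting at location 1.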
EXAMPLE 5.11
We may use the floor and ceiling functions in parts (a) and (b), respectively, of Example 5.10 to restate some of the ideas we examined in Chapter 4.
a) When studying the division algorithm, we learned that for all $a, b \in \mathbf{Z}$, where $b > 0$, it was possible to find unique $q, r \in \mathbf{Z}$ with $a = qb + r$ and $0 \le r < b$. Now we may add that $q = \lfloor a/b \rfloor$ and $r = a - \lfloor a/b \rfloor b$.
b) In Example 4.44 we found that the positive integer
$29{,}338{,}848{,}000 = 2^8 3^5 5^3 7^3 11$
has
$60 = (5)(3)(2)(2)(1) = \lceil (8+1)/2 \rceil \lceil (5+1)/2 \rceil \lceil (3+1)/2 \rceil \lceil (3+1)/2 \rceil \lceil (1+1)/2 \rceil$
positive divisors that are perfect squares. In general, if $n \in \mathbf{Z}^+$ with $n > 1$, we know that we can write
$n = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}$,
where $k \in \mathbf{Z}^+$, $p_i$ is prime for all $1 \le i \le k$, $p_i \ne p_j$ for all $1 \le i < j \le k$, and $e_i \in \mathbf{Z}^+$ for all $1 \le i \le k$. This is due to the Fundamental Theorem of Arithmetic. Then if $r \in \mathbf{Z}^+$, we find that the number of positive divisors of $n$ that are perfect $r$th powers is
$\prod_{i=1}^{k} \left\lceil \frac{e_i + 1}{r} \right\rceil$.
When $r = 1$ we get $\prod_{i=1}^{k} \lceil e_i + 1 \rceil = \prod_{i=1}^{k} (e_i + 1)$, which is the number of positive divisors of $n$.
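Both parts of Example 5.11 can be checked numerically. The sketch below is an illustration under our own naming (count_rth_power_divisors and brute_force_count are hypothetical helpers, not from the text). It evaluates the ceiling product for the integer of Example 4.44 and compares it with a direct enumeration of the divisors by their exponent vectors; it also computes the quotient and remainder of the division algorithm with floor division.

```python
from itertools import product
from math import prod

# Prime factorization 2^8 * 3^5 * 5^3 * 7^3 * 11 taken from the example.
FACTORS = {2: 8, 3: 5, 5: 3, 7: 3, 11: 1}


def count_rth_power_divisors(factors, r):
    """Product of ceil((e_i + 1) / r) over the exponents e_i.
    For integers, ceil((e + 1) / r) equals (e + r) // r."""
    return prod((e + r) // r for e in factors.values())


def brute_force_count(factors, r):
    """Enumerate every divisor by its exponent vector and keep those whose
    exponents are all multiples of r (exactly the perfect rth powers)."""
    ranges = [range(e + 1) for e in factors.values()]
    return sum(all(x % r == 0 for x in exps) for exps in product(*ranges))


# Division algorithm for b > 0: q = floor(a / b), r = a - floor(a / b) * b.
a, b = -17, 5
q = a // b            # Python's // is floor division, also for negative a
rem = a - q * b
print(q, rem)         # quotient and remainder with 0 <= rem < b

print(count_rth_power_divisors(FACTORS, 2), brute_force_count(FACTORS, 2))
```

The negative dividend is chosen deliberately: truncating the quotient toward zero would give a negative remainder, whereas the floor keeps the remainder in the range required by the division algorithm.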
EXAMPLE 5.12
In Sections 4.1 and 4.2 we were introduced to the concept of a sequence in conjunction with our study of recursive definitions. We should now realize that a sequence of real numbers $r_1, r_2, r_3, \ldots$ can be thought of as a function $f: \mathbf{Z}^+ \to \mathbf{R}$ where $f(n) = r_n$, for all $n \in \mathbf{Z}^+$. Likewise, an integer sequence $a_0, a_1, a_2, \ldots$ can be defined by means of a function $g: \mathbf{N} \to \mathbf{Z}$ where $g(n) = a_n$, for all $n \in \mathbf{N}$.
In Example 5.9 there are $2^{12} = 4096$ relations from $A$ to $B$. We have examined one function among these relations, and now we wish to count the total number of functions from $A$ to $B$.
For the general case, let $A$, $B$ be nonempty sets with $|A| = m$, $|B| = n$. Consequently, if $A = \{a_1, a_2, a_3, \ldots, a_m\}$ and $B = \{b_1, b_2, b_3, \ldots, b_n\}$, then a typical function $f: A \to B$ can be described by $\{(a_1, x_1), (a_2, x_2), (a_3, x_3), \ldots, (a_m, x_m)\}$. We can select any of the $n$ elements of $B$ for $x_1$ and then do the same for $x_2$. (We can select any element of $B$ for $x_2$, so that the same element of $B$ may be selected for both $x_1$ and $x_2$.) We continue this selection process until one of the $n$ elements of $B$ is finally selected for $x_m$. In this way, using the rule of product, there are $n^m = |B|^{|A|}$ functions from $A$ to $B$.
Therefore, for $A$, $B$ in Example 5.9, there are $4^3 = |B|^{|A|} = 64$ functions from $A$ to $B$, and $3^4 = |A|^{|B|} = 81$ functions from $B$ to $A$. In general, we do not expect $|A|^{|B|}$ to equal $|B|^{|A|}$. Unlike the situation for relations, we cannot always obtain a function from $B$ to $A$ by simply interchanging the components in the ordered pairs of a function from $A$ to $B$ (or vice versa).
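The rule-of-product count can be confirmed by listing the functions themselves. In the sketch below (an illustration, not part of the original text) a function is stored as a Python dict, and all_functions is a helper name of our own. Only the sizes 3 and 4 of the two sets are taken from the discussion above; the element names are placeholders.

```python
from itertools import product


def all_functions(domain, codomain):
    """Yield every function from domain to codomain as a dict.
    One image is chosen independently for each element of the domain."""
    domain = list(domain)
    for images in product(codomain, repeat=len(domain)):
        yield dict(zip(domain, images))


A = [1, 2, 3]                 # |A| = 3 (element names assumed)
B = ["w", "x", "y", "z"]      # |B| = 4 (element names assumed)

count_A_to_B = sum(1 for _ in all_functions(A, B))
count_B_to_A = sum(1 for _ in all_functions(B, A))
print(count_A_to_B, len(B) ** len(A))   # |B|^|A|
print(count_B_to_A, len(A) ** len(B))   # |A|^|B|
```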
Now that we have the concept of a function as a special type of relation, we turn our
attention to a special type of function.
Definition 5.5 A function $f: A \to B$ is called one-to-one, or injective, if each element of $B$ appears at most once as the image of an element of $A$.
If $f: A \to B$ is one-to-one, with $A$, $B$ finite, we must have $|A| \le |B|$. For arbitrary sets $A$, $B$, $f: A \to B$ is one-to-one if and only if for all $a_1, a_2 \in A$, $f(a_1) = f(a_2) \Rightarrow a_1 = a_2$.
EXAMPLE 5.13
Consider the function $f: \mathbf{R} \to \mathbf{R}$ where $f(x) = 3x + 7$ for all $x \in \mathbf{R}$. Then for all $x_1, x_2 \in \mathbf{R}$, we find that
$f(x_1) = f(x_2) \Rightarrow 3x_1 + 7 = 3x_2 + 7 \Rightarrow 3x_1 = 3x_2 \Rightarrow x_1 = x_2$,
so the given function $f$ is one-to-one.
On the other hand, suppose that $g: \mathbf{R} \to \mathbf{R}$ is the function defined by $g(x) = x^4 - x$ for each real number $x$. Then
$g(0) = (0)^4 - 0 = 0$ and $g(1) = (1)^4 - (1) = 1 - 1 = 0$.
Consequently, $g$ is not one-to-one, since $g(0) = g(1)$ but $0 \ne 1$; that is, $g$ is not one-to-one because there exist real numbers $x_1, x_2$ where $g(x_1) = g(x_2) \not\Rightarrow x_1 = x_2$.
EXAMPLE 5.14
Let $A = \{1, 2, 3\}$ and $B = \{1, 2, 3, 4, 5\}$. The function
$f = \{(1, 1), (2, 3), (3, 4)\}$
is a one-to-one function from $A$ to $B$;
$g = \{(1, 1), (2, 3), (3, 3)\}$
is a function from $A$ to $B$, but it fails to be one-to-one because $g(2) = g(3)$ but $2 \ne 3$.
For $A$, $B$ in Example 5.14 there are $2^{15}$ relations from $A$ to $B$ and $5^3$ of these are functions from $A$ to $B$. The next question we want to answer is how many functions $f: A \to B$ are one-to-one. Again we argue for general finite sets.
With $A = \{a_1, a_2, a_3, \ldots, a_m\}$, $B = \{b_1, b_2, b_3, \ldots, b_n\}$, and $m \le n$, a one-to-one function $f: A \to B$ has the form $\{(a_1, x_1), (a_2, x_2), (a_3, x_3), \ldots, (a_m, x_m)\}$, where there are $n$ choices for $x_1$ (that is, any element of $B$), $n - 1$ choices for $x_2$ (that is, any element of $B$ except the one chosen for $x_1$), $n - 2$ choices for $x_3$, and so on, finishing with $n - (m - 1) = n - m + 1$ choices for $x_m$. By the rule of product, the number of one-to-one functions from $A$ to $B$ is
$n(n - 1)(n - 2) \cdots (n - m + 1) = \frac{n!}{(n - m)!} = P(n, m) = P(|B|, |A|)$.
Consequently, for $A$, $B$ in Example 5.14, there are $5 \cdot 4 \cdot 3 = 60$ one-to-one functions $f: A \to B$.
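As a quick check of the count P(n, m), the following sketch (an illustration with helper names of our own choosing) tests the two functions of Example 5.14 for injectivity and enumerates all one-to-one functions from A to B. It assumes Python 3.8 or later for math.perm.

```python
from itertools import permutations
from math import perm


def is_one_to_one(f):
    """A function stored as a dict is one-to-one when no image is repeated."""
    return len(set(f.values())) == len(f)


def injective_functions(domain, codomain):
    """Yield every one-to-one function from domain to codomain as a dict.
    Each arrangement of len(domain) distinct images gives one function."""
    domain = list(domain)
    for images in permutations(codomain, len(domain)):
        yield dict(zip(domain, images))


A = [1, 2, 3]
B = [1, 2, 3, 4, 5]
f = {1: 1, 2: 3, 3: 4}     # one-to-one
g = {1: 1, 2: 3, 3: 3}     # g(2) = g(3), so not one-to-one
print(is_one_to_one(f), is_one_to_one(g))

count = sum(1 for _ in injective_functions(A, B))
print(count, perm(len(B), len(A)))   # both equal 5 * 4 * 3
```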
Definition 5.6 If $f: A \to B$ and $A_1 \subseteq A$, then
$f(A_1) = \{b \in B \mid b = f(a), \text{ for some } a \in A_1\}$,
and $f(A_1)$ is called the image of $A_1$ under $f$.
EXAMPLE 5.15
For $A = \{1, 2, 3, 4, 5\}$ and $B = \{w, x, y, z\}$, let $f: A \to B$ be given by $f = \{(1, w), (2, x), (3, x), (4, y), (5, y)\}$. Then for $A_1 = \{1\}$, $A_2 = \{1, 2\}$, $A_3 = \{1, 2, 3\}$, $A_4 = \{2, 3\}$, and $A_5 = \{2, 3, 4, 5\}$, we find the following corresponding images under $f$:
$f(A_1) = \{f(a) \mid a \in A_1\} = \{f(a) \mid a \in \{1\}\} = \{f(a) \mid a = 1\} = \{f(1)\} = \{w\}$;
$f(A_2) = \{f(a) \mid a \in A_2\} = \{f(a) \mid a \in \{1, 2\}\} = \{f(a) \mid a = 1 \text{ or } 2\} = \{f(1), f(2)\} = \{w, x\}$;
$f(A_3) = \{f(1), f(2), f(3)\} = \{w, x\}$, and $f(A_3) = f(A_2)$ because $f(2) = x = f(3)$;
$f(A_4) = \{x\}$; and $f(A_5) = \{x, y\}$.
EXAMPLE 5.16
a) Let $g: \mathbf{R} \to \mathbf{R}$ be given by $g(x) = x^2$. Then $g(\mathbf{R})$ = the range of $g$ = $[0, +\infty)$. The image of $\mathbf{Z}$ under $g$ is $g(\mathbf{Z}) = \{0, 1, 4, 9, 16, \ldots\}$, and for $A_1 = [-2, 1]$ we get $g(A_1) = [0, 4]$.
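b) Let $h: \mathbf{Z} \times \mathbf{Z} \to \mathbf{Z}$ where $h(x, y) = 2x + 3y$. The domain of $h$ is $\mathbf{Z} \times \mathbf{Z}$, not $\mathbf{Z}$, and the codomain is $\mathbf{Z}$. We find, for example, that $h(0, 0) = 2(0) + 3(0) = 0$ and $h(-3, 7) = 2(-3) + 3(7) = 15$. In addition, $h(2, -1) = 2(2) + 3(-1) = 1$, and for each $n \in \mathbf{Z}$, $h(2n, -n) = 2(2n) + 3(-n) = 4n - 3n = n$. Consequently, $h(\mathbf{Z} \times \mathbf{Z})$ = the range of $h$ = $\mathbf{Z}$. For $A_1 = \{(0, n) \mid n \in \mathbf{Z}^+\} = \{0\} \times \mathbf{Z}^+ \subseteq \mathbf{Z} \times \mathbf{Z}$, the image of $A_1$ under $h$ is $h(A_1) = \{3, 6, 9, \ldots\} = \{3n \mid n \in \mathbf{Z}^+\}$.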
Our next result deals with the interplay between the images of subsets (of the domain)
under a function f and the set operations of union and intersection.
THEOREM 5.2 Let $f: A \to B$, with $A_1, A_2 \subseteq A$. Then
a) $f(A_1 \cup A_2) = f(A_1) \cup f(A_2)$;
b) $f(A_1 \cap A_2) \subseteq f(A_1) \cap f(A_2)$;
c) $f(A_1 \cap A_2) = f(A_1) \cap f(A_2)$ when $f$ is one-to-one.
Proof: We prove part (b) and leave the remaining parts for the reader.
For each $b \in B$, $b \in f(A_1 \cap A_2) \Rightarrow b = f(a)$, for some $a \in A_1 \cap A_2 \Rightarrow$ [$b = f(a)$ for some $a \in A_1$] and [$b = f(a)$ for some $a \in A_2$] $\Rightarrow b \in f(A_1)$ and $b \in f(A_2) \Rightarrow b \in f(A_1) \cap f(A_2)$, so $f(A_1 \cap A_2) \subseteq f(A_1) \cap f(A_2)$.
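Images of subsets are simple to compute when a finite function is stored as a dictionary. The sketch below is an illustration only (the helper names image and subsets are ours). It recomputes two of the images from Example 5.15 and then checks parts (a) and (b) of Theorem 5.2 for every pair of subsets of the domain. Such an exhaustive check on one function is evidence for the statement, not a proof of it.

```python
from itertools import combinations


def image(f, subset):
    """f(A_1) = {f(a) : a in A_1} for a function stored as a dict."""
    return {f[a] for a in subset}


def subsets(elements):
    """Yield every subset of the given collection as a set."""
    elements = list(elements)
    for k in range(len(elements) + 1):
        for combo in combinations(elements, k):
            yield set(combo)


f = {1: "w", 2: "x", 3: "x", 4: "y", 5: "y"}    # the function of Example 5.15
print(image(f, {1, 2, 3}))       # f(A_3)
print(image(f, {2, 3, 4, 5}))    # f(A_5)

for A1 in subsets(f):
    for A2 in subsets(f):
        # Part (a): the image of a union is the union of the images.
        assert image(f, A1 | A2) == image(f, A1) | image(f, A2)
        # Part (b): the image of an intersection is contained in the
        # intersection of the images.
        assert image(f, A1 & A2) <= image(f, A1) & image(f, A2)
```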
Definition 5.7 If $f: A \to B$ and $A_1 \subseteq A$, then $f|_{A_1}: A_1 \to B$ is called the restriction of $f$ to $A_1$ if $f|_{A_1}(a) = f(a)$ for all $a \in A_1$.
Definition 5.8 Let $A_1 \subseteq A$ and $f: A_1 \to B$. If $g: A \to B$ and $g(a) = f(a)$ for all $a \in A_1$, then we call $g$ an extension of $f$ to $A$.
EXAMPLE 5.17
For $A = \{1, 2, 3, 4, 5\}$, let $f: A \to \mathbf{R}$ be defined by $f = \{(1, 10), (2, 13), (3, 16), (4, 19), (5, 22)\}$. Let $g: \mathbf{Q} \to \mathbf{R}$ where $g(q) = 3q + 7$ for all $q \in \mathbf{Q}$. Finally, let $h: \mathbf{R} \to \mathbf{R}$ with $h(r) = 3r + 7$ for all $r \in \mathbf{R}$. Then
i) $g$ is an extension of $f$ (from $A$) to $\mathbf{Q}$;
ii) $f$ is the restriction of $g$ (from $\mathbf{Q}$) to $A$;
iii) $h$ is an extension of $f$ (from $A$) to $\mathbf{R}$;
iv) $f$ is the restriction of $h$ (from $\mathbf{R}$) to $A$;
v) $h$ is an extension of $g$ (from $\mathbf{Q}$) to $\mathbf{R}$; and
vi) $g$ is the restriction of $h$ (from $\mathbf{R}$) to $\mathbf{Q}$.
EXAMPLE 5.18
Let $A = \{w, x, y, z\}$, $B = \{1, 2, 3, 4, 5\}$, and $A_1 = \{w, y, z\}$. Let $f: A \to B$, $g: A_1 \to B$ be represented by the diagrams in Fig. 5.5. Then $g = f|_{A_1}$ and $f$ is an extension of $g$ from $A_1$ to $A$. We note that for the given function $g: A_1 \to B$, there are five ways to extend $g$ from $A_1$ to $A$.
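Restrictions and extensions translate directly into operations on dictionaries. The sketch below is illustrative and uses helper names of our own (restrict, extensions). It confirms that f in Example 5.17 is the restriction of h to A, and it counts the extensions in Example 5.18. Because the arrows of Fig. 5.5 are not recoverable here, the values assigned to g are placeholders; the count of extensions does not depend on them.

```python
from itertools import product


def restrict(f, subset):
    """The restriction of f to subset: keep only the pairs (a, f(a)) with a in subset."""
    return {a: f[a] for a in subset}


def extensions(g, domain, codomain):
    """Yield every extension of g (a dict defined on part of domain) to all of domain."""
    missing = [a for a in domain if a not in g]
    for images in product(codomain, repeat=len(missing)):
        yield {**g, **dict(zip(missing, images))}


# Example 5.17: f is the restriction of h(r) = 3r + 7 to A = {1, 2, 3, 4, 5}.
A = [1, 2, 3, 4, 5]
f = {1: 10, 2: 13, 3: 16, 4: 19, 5: 22}
h_on_A = {a: 3 * a + 7 for a in A}
print(h_on_A == f)

# Example 5.18: g is defined on A_1 = {w, y, z}; only x remains to be assigned.
B = [1, 2, 3, 4, 5]
g = {"w": 1, "y": 3, "z": 5}          # placeholder values (assumed)
all_extensions = list(extensions(g, ["w", "x", "y", "z"], B))
print(len(all_extensions))            # one extension per choice of image for x
print(all(restrict(e, g.keys()) == g for e in all_extensions))
```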
Figure 5.5 (arrow diagrams for $f: A \to B$ and $g: A_1 \to B$)
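EXERCISES 5.2
1. Determine whether or not each of the following relations is a function. If a relation is a function, find its range.
a) $\{(x, y) \mid x, y \in \mathbf{Z}, y = x^2 + 7\}$, a relation from $\mathbf{Z}$ to $\mathbf{Z}$
b) $\{(x, y) \mid x, y \in \mathbf{R}, y^2 = x\}$, a relation from $\mathbf{R}$ to $\mathbf{R}$
c) $\{(x, y) \mid x, y \in \mathbf{R}, y = 3x + 1\}$, a relation from $\mathbf{R}$ to $\mathbf{R}$
d) $\{(x, y) \mid x, y \in \mathbf{Q}, x^2 + y^2 = 1\}$, a relation from $\mathbf{Q}$ to $\mathbf{Q}$
e) $\mathcal{R}$ is a relation from $A$ to $B$ where $|A| = 5$, $|B| = 6$, and $|\mathcal{R}| = 6$.
2. Does the formula $f(x) = 1/(x^2 - 2)$ define a function $f: \mathbf{R} \to \mathbf{R}$? A function $f: \mathbf{Z} \to \mathbf{R}$?
3. Let $A = \{1, 2, 3, 4\}$ and $B = \{x, y, z\}$. (a) List five functions from $A$ to $B$. (b) How many functions $f: A \to B$ are there? (c) How many functions $f: A \to B$ are one-to-one? (d) How many functions $g: B \to A$ are there? (e) How many functions $g: B \to A$ are one-to-one? (f) How many functions $f: A \to B$ satisfy $f(1) = x$? (g) How many functions $f: A \to B$ satisfy $f(1) = f(2) = x$? (h) How many functions $f: A \to B$ satisfy $f(1) = x$ and $f(2) = y$?
4. If there are 2187 functions $f: A \to B$ and $|B| = 3$, what is $|A|$?
5. Let $A, B, C \subseteq \mathbf{R}^2$ where $A = \{(x, y) \mid y = 2x + 1\}$, $B = \{(x, y) \mid y = 3x\}$, and $C = \{(x, y) \mid x - y = 7\}$. Determine each of the following:
a) $A \cap B$  b) $B \cap C$  c) $A \cup C$  d) $B \cup C$
6. Let $A, B, C \subseteq \mathbf{Z}^2$ where $A = \{(x, y) \mid y = 2x + 1\}$, $B = \{(x, y) \mid y = 3x\}$, and $C = \{(x, y) \mid x - y = 7\}$.
a) i) $A \cap B$  ii) $B \cap C$  iii) $A \cup C$  iv) $B \cup C$
b) How are the answers for (i)-(iv) affected if $A, B, C \subseteq \mathbf{Z}^+ \times \mathbf{Z}^+$?
7. Determine each of the following:
a) $\lfloor 2.3 - 1.6 \rfloor$  b) $\lfloor 2.3 \rfloor - \lfloor 1.6 \rfloor$  c) $\lceil 3.4 \rceil \lfloor 6.2 \rfloor$  d) $\lfloor 3.4 \rfloor \lceil 6.2 \rceil$  e) $\lfloor 2\pi \rfloor$  f) $2\lceil \pi \rceil$
8. Determine whether each of the following statements is true or false. If the statement is false, provide a counterexample.
a) $\lfloor a \rfloor = \lceil a \rceil$ for all $a \in \mathbf{Z}$.
b) $\lfloor a \rfloor = \lceil a \rceil$ for all $a \in \mathbf{R}$.
c) $\lfloor a \rfloor = \lceil a \rceil - 1$ for all $a \in \mathbf{R} - \mathbf{Z}$.
d) $-\lceil a \rceil = \lceil -a \rceil$ for all $a \in \mathbf{R}$.
9. Find all real numbers $x$ such that
a) $7\lfloor x \rfloor = \lfloor 7x \rfloor$  b) $\lfloor 7x \rfloor = 7$  c) $\lfloor x + 7 \rfloor = x + 7$  d) $\lfloor x + 7 \rfloor = \lfloor x \rfloor + 7$
10. Determine all $x \in \mathbf{R}$ such that $\lfloor x \rfloor + \lfloor x + \frac{1}{2} \rfloor = \lfloor 2x \rfloor$.
11. a) Find all real numbers $x$ where $\lceil 3x \rceil = 3\lceil x \rceil$.
b) Let $n \in \mathbf{Z}^+$ where $n > 1$. Determine all $x \in \mathbf{R}$ such that $\lceil nx \rceil = n\lceil x \rceil$.
12. For $n, k \in \mathbf{Z}^+$, prove that $\lceil n/k \rceil = \lfloor (n - 1)/k \rfloor + 1$.
13. a) Let $a \in \mathbf{R}^+$ where $a \ge 1$. Prove that (i) $\lfloor \lceil a \rceil / a \rfloor = 1$; and (ii) $\lceil \lfloor a \rfloor / a \rceil = 1$.
b) If $a \in \mathbf{R}^+$ and $0 < a < 1$, which result(s) in part (a) is (are) true?
14. Let $a_1, a_2, a_3, \ldots$ be the integer sequence defined recursively by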
1) $a_1 = 1$; and
2) For all $n \in \mathbf{Z}^+$ where $n \ge 2$, $a_n = 2a_{\lfloor n/2 \rfloor}$.
a) Determine $a_n$ for all $2 \le n \le 8$.
b) Prove that $a_n \le n$ for all $n \in \mathbf{Z}^+$.
15. For each of the following functions, determine whether it is one-to-one and determine its range.
a) $f: \mathbf{Z} \to \mathbf{Z}$, $f(x) = 2x + 1$
b) $f: \mathbf{Q} \to \mathbf{Q}$, $f(x) = 2x + 1$
c) $f: \mathbf{Z} \to \mathbf{Z}$, $f(x) = x^3 - x$
d) $f: \mathbf{R} \to \mathbf{R}$, $f(x) = e^x$
e) $f: [-\pi/2, \pi/2] \to \mathbf{R}$, $f(x) = \sin x$
f) $f: [0, \pi] \to \mathbf{R}$, $f(x) = \sin x$
16. Let $f: \mathbf{R} \to \mathbf{R}$ where $f(x) = x^2$. Determine $f(A)$ for the following subsets $A$ taken from the domain $\mathbf{R}$.
a) $A = \{2, 3\}$  b) $A = \{-3, -2, 2, 3\}$  c) $A = (-3, 3)$  d) $A = (-3, 2]$  e) $A = [-7, 2]$  f) $A = (-4, -3] \cup [5, 6]$
17. Let $A = \{1, 2, 3, 4, 5\}$, $B = \{w, x, y, z\}$, $A_1 = \{2, 3, 5\} \subseteq A$, and $g: A_1 \to B$. In how many ways can $g$ be extended to a function $f: A \to B$?
18. Give an example of a function $f: A \to B$ and $A_1, A_2 \subseteq A$ for which $f(A_1 \cap A_2) \ne f(A_1) \cap f(A_2)$. [Thus the inclusion in Theorem 5.2(b) may be proper.]
19. Prove parts (a) and (c) of Theorem 5.2.
20. If $A = \{1, 2, 3, 4, 5\}$ and there are 6720 injective functions $f: A \to B$, what is $|B|$?
21. Let $f: A \to B$, where $A = X \cup Y$ with $X \cap Y = \emptyset$. If $f|_X$ and $f|_Y$ are one-to-one, does it follow that $f$ is one-to-one?
22. For $n \in \mathbf{Z}^+$ define $X_n = \{1, 2, 3, \ldots, n\}$. Given $m, n \in \mathbf{Z}^+$, $f: X_m \to X_n$ is called monotone increasing if for all $i, j \in X_m$, $1 \le i < j \le m \Rightarrow f(i) \le f(j)$. (a) How many monotone increasing functions are there with domain $X_7$ and codomain $X_5$? (b) Answer part (a) for the domain $X_6$ and codomain $X_9$. (c) Generalize the results in parts (a) and (b). (d) Determine the number of monotone increasing functions $f: X_{10} \to X_8$ where $f(4) = 4$. (e) How many monotone increasing functions $f: X_7 \to X_{12}$ satisfy $f(5) = 9$? (f) Generalize the results in parts (d) and (e).
23. Determine the access function $f(a_{ij})$, as described in Example 5.10(d), for a matrix $A = (a_{ij})_{m \times n}$, where (a) $m = 12$, $n = 12$; (b) $m = 7$, $n = 10$; (c) $m = 10$, $n = 7$.
24. For the access function developed in Example 5.10(d), the matrix $A = (a_{ij})_{m \times n}$ was stored in a one-dimensional array using the row major implementation. It is also possible to store this matrix using the column major implementation, where each entry $a_{i1}$, $1 \le i \le m$, in the first column of $A$ is stored in locations $1, 2, 3, \ldots, m$, respectively, of the array, when $a_{11}$ is stored in location 1. Then the entries $a_{i2}$, $1 \le i \le m$, of the second column of $A$ are stored in locations $m + 1, m + 2, m + 3, \ldots, 2m$, respectively, of the array, and so on. Find a formula for the access function $g(a_{ij})$ under these conditions.
25. a) Let $A$ be an $m \times n$ matrix that is to be stored (in a contiguous manner) in a one-dimensional array of $r$ entries. Find a formula for the access function if $a_{11}$ is to be stored in location $k$ ($\ge 1$) of the array [as opposed to location 1 as in Example 5.10(d)] and we use (i) the row major implementation; (ii) the column major implementation.
b) State any conditions involving $m$, $n$, $r$, and $k$ that must be satisfied in order for the results in part (a) to be valid.
26. The following exercise provides a combinatorial proof for a summation formula we have seen in four earlier results: (1) Exercise 22 in Section 1.4; (2) Example 4.4; (3) Exercise 3 in Section 4.1; and (4) Exercise 19 in Section 4.2.
Let $A = \{a, b, c\}$, $B = \{1, 2, 3, \ldots, n, n + 1\}$, and $S = \{f: A \to B \mid f(a) < f(c) \text{ and } f(b) < f(c)\}$.
a) If $S_1 = \{f: A \to B \mid f \in S \text{ and } f(c) = 2\}$, what is $|S_1|$?
b) If $S_2 = \{f: A \to B \mid f \in S \text{ and } f(c) = 3\}$, what is $|S_2|$?
c) For $1 \le i \le n$, let $S_i = \{f: A \to B \mid f \in S \text{ and } f(c) = i + 1\}$. What is $|S_i|$?
d) Let $T_1 = \{f: A \to B \mid f \in S \text{ and } f(a) = f(b)\}$. Explain why $|T_1| = \binom{n+1}{2}$.
e) Let $T_2 = \{f: A \to B \mid f \in S \text{ and } f(a) < f(b)\}$ and $T_3 = \{f: A \to B \mid f \in S \text{ and } f(a) > f(b)\}$. Explain why $|T_2| = |T_3| = \binom{n+1}{3}$.
f) What can we conclude about the sets $S_1 \cup S_2 \cup S_3 \cup \cdots \cup S_n$ and $T_1 \cup T_2 \cup T_3$?
g) Use the results from parts (c), (d), (e), and (f) to verify that
$\sum_{i=1}^{n} i^2 = \frac{n(n + 1)(2n + 1)}{6}$.
27. One version of Ackermann's function $A(m, n)$ is defined recursively for $m, n \in \mathbf{N}$ by
$A(0, n) = n + 1$, $n \ge 0$;
$A(m, 0) = A(m - 1, 1)$, $m > 0$; and
$A(m, n) = A(m - 1, A(m, n - 1))$, $m, n > 0$.
[Such functions were defined in the 1920s by the German mathematician and logician Wilhelm Ackermann (1896-1962), who was a student of David Hilbert (1862-1943). These functions play an important role in computer science, in the theory of recursive functions and in the analysis of algorithms that involve the union of sets.]
a) Calculate $A(1, 3)$ and $A(2, 3)$.
b) Prove that $A(1, n) = n + 2$ for all $n \in \mathbf{N}$.
c) For all $n \in \mathbf{N}$ show that $A(2, n) = 3 + 2n$.
d) Verify that $A(3, n) = 2^{n+3} - 3$ for all $n \in \mathbf{N}$.
28. Given sets $A$, $B$, we define a partial function $f$ with domain $A$ and codomain $B$ as a function from $A'$ to $B$, where $\emptyset \ne A' \subset A$. [Here $f(x)$ is not defined for $x \in A - A'$.] For example, $f: \mathbf{R}^* \to \mathbf{R}$, where $f(x) = 1/x$, is a partial function on $\mathbf{R}$ since $f(0)$ is not defined. On the finite side, $\{(1, x), (2, x), (3, y)\}$ is a partial function for domain $A = \{1, 2, 3, 4, 5\}$ and codomain $B = \{w, x, y, z\}$. Furthermore, a computer program may be thought of as a partial function. The program's input is the input for the partial function and the program's output is the output of the function. Should the program fail to terminate, or terminate abnormally (perhaps because of an attempt to divide by 0), then the partial function is considered to be undefined for that input. (a) For $A = \{1, 2, 3, 4, 5\}$, $B = \{w, x, y, z\}$, how many partial functions have domain $A$ and codomain $B$? (b) Let $A$, $B$ be sets where $|A| = m > 0$, $|B| = n > 0$. How many partial functions have domain $A$ and codomain $B$?
5.3
Onto Functions: Stirling Numbers
of the Second Kind
The results we develop in this section will provide the answers to the first five problems
stated at the beginning of this chapter. We find that the onto function is the key to all of the
answers.
Definition 5.9 A function $f: A \to B$ is called onto, or surjective, if $f(A) = B$; that is, if for all $b \in B$ there is at least one $a \in A$ with $f(a) = b$.
EXAMPLE 5.19
The function $f: \mathbf{R} \to \mathbf{R}$ defined by $f(x) = x^3$ is an onto function. For here we find that if $r$ is any real number in the codomain of $f$, then the real number $\sqrt[3]{r}$ is in the domain of $f$ and $f(\sqrt[3]{r}) = (\sqrt[3]{r})^3 = r$. Hence the codomain of $f$ = $\mathbf{R}$ = the range of $f$, and the function $f$ is onto.
The function $g: \mathbf{R} \to \mathbf{R}$, where $g(x) = x^2$ for each real number $x$, is not an onto function. In this case no negative real number appears in the range of $g$. For example, for $-9$ to be in the range of $g$, we would have to be able to find a real number $r$ with $g(r) = r^2 = -9$. Unfortunately, $r^2 = -9 \Rightarrow r = 3i$ or $r = -3i$, where $3i, -3i \in \mathbf{C}$, but $3i, -3i \notin \mathbf{R}$. So here the range of $g$ = $g(\mathbf{R}) = [0, +\infty) \subset \mathbf{R}$, and the function $g$ is not onto. Note, however, that the function $h: \mathbf{R} \to [0, +\infty)$ defined by $h(x) = x^2$ is an onto function.
EXAMPLE 5.20
Consider the function $f: \mathbf{Z} \to \mathbf{Z}$ where $f(x) = 3x + 1$ for each $x \in \mathbf{Z}$. Here the range of $f$ = $\{\ldots, -8, -5, -2, 1, 4, 7, \ldots\} \subset \mathbf{Z}$, so $f$ is not an onto function. If we examine the situation here a little more closely, we find that the integer 8, for example, is not in the range of $f$ even though the equation
$3x + 1 = 8$
can be easily solved, giving us $x = 7/3$. But that is the problem, for the rational number 7/3 is not an integer, so there is no $x$ in the domain $\mathbf{Z}$ with $f(x) = 8$.
On the other hand, each of the functions
1) $g: \mathbf{Q} \to \mathbf{Q}$, where $g(x) = 3x + 1$ for $x \in \mathbf{Q}$; and
2) $h: \mathbf{R} \to \mathbf{R}$, where $h(x) = 3x + 1$ for $x \in \mathbf{R}$
is an onto function. Furthermore, $3x_1 + 1 = 3x_2 + 1 \Rightarrow 3x_1 = 3x_2 \Rightarrow x_1 = x_2$, regardless of whether $x_1$ and $x_2$ are integers, rational numbers, or real numbers. Consequently, all three of the functions $f$, $g$, and $h$ are one-to-one.
EXAMPLE 5.21
If $A = \{1, 2, 3, 4\}$ and $B = \{x, y, z\}$, then
$f_1 = \{(1, z), (2, y), (3, x), (4, y)\}$ and $f_2 = \{(1, x), (2, x), (3, y), (4, z)\}$
are both functions from $A$ onto $B$. However, the function $g = \{(1, x), (2, x), (3, y), (4, y)\}$ is not onto, because $g(A) = \{x, y\} \subset B$.
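For finite sets the definition of an onto function can be tested directly: compute the image f(A) and compare it with the codomain. The following sketch is illustrative only; is_onto is a helper name of our own, and the three functions are those of Example 5.21 stored as dictionaries.

```python
def is_onto(f, codomain):
    """f: A -> B (stored as a dict) is onto when its image f(A) is all of B."""
    return set(f.values()) == set(codomain)


B = {"x", "y", "z"}
f1 = {1: "z", 2: "y", 3: "x", 4: "y"}
f2 = {1: "x", 2: "x", 3: "y", 4: "z"}
g = {1: "x", 2: "x", 3: "y", 4: "y"}    # z is never used as an image

print(is_onto(f1, B), is_onto(f2, B), is_onto(g, B))
```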
If $A$, $B$ are finite sets, then for an onto function $f: A \to B$ to possibly exist we must have $|A| \ge |B|$. Considering the development in the first two sections of this chapter, the reader undoubtedly feels it is time once again to use the rule of product and count the number of onto functions $f: A \to B$ where $|A| = m \ge n = |B|$. Unfortunately, the rule of product proves inadequate here. We shall obtain the needed result for some specific examples and then conjecture a general formula. In Chapter 8 we shall establish the conjecture using the Principle of Inclusion and Exclusion.
EXAMPLE 5.22
If $A = \{x, y, z\}$ and $B = \{1, 2\}$, then all functions $f: A \to B$ are onto except $f_1 = \{(x, 1), (y, 1), (z, 1)\}$ and $f_2 = \{(x, 2), (y, 2), (z, 2)\}$, the constant functions. So there are $|B|^{|A|} - 2 = 2^3 - 2 = 6$ onto functions from $A$ to $B$.
In general, if $|A| = m \ge 2$ and $|B| = 2$, then there are $2^m - 2$ onto functions from $A$ to $B$. (Does this formula tell us anything when $m = 1$?)
EXAMPLE 5.23
For $A = \{w, x, y, z\}$ and $B = \{1, 2, 3\}$, there are $3^4$ functions from $A$ to $B$. Considering subsets of $B$ of size 2, there are $2^4$ functions from $A$ to $\{1, 2\}$, $2^4$ functions from $A$ to $\{2, 3\}$, and $2^4$ functions from $A$ to $\{1, 3\}$. So we have $3(2^4) = \binom{3}{2}2^4$ functions from $A$ to $B$ that are definitely not onto. However, before we acknowledge $3^4 - \binom{3}{2}2^4$ as the final answer, we must realize that not all of these $\binom{3}{2}2^4$ functions are distinct. For when we consider all the functions from $A$ to $\{1, 2\}$, we are removing, among these, the function $\{(w, 2), (x, 2), (y, 2), (z, 2)\}$. Then, considering the functions from $A$ to $\{2, 3\}$, we remove the same function: $\{(w, 2), (x, 2), (y, 2), (z, 2)\}$. Consequently, in the result $3^4 - \binom{3}{2}2^4$, we have twice removed each of the constant functions $f: A \to B$, where $f(A)$ is one of the sets $\{1\}$, $\{2\}$, or $\{3\}$. Adjusting our present result for this, we find that there are $3^4 - \binom{3}{2}2^4 + 3 = \binom{3}{3}3^4 - \binom{3}{2}2^4 + \binom{3}{1}1^4 = 36$ onto functions from $A$ to $B$.
Keeping $B = \{1, 2, 3\}$, for any set $A$ with $|A| = m \ge 3$, there are $\binom{3}{3}3^m - \binom{3}{2}2^m + \binom{3}{1}1^m$ functions from $A$ onto $B$. (What result does this formula yield when $m = 1$? when $m = 2$?)
The last two examples suggest a pattern that we now state, without proof, as our general
formula.
For finite sets $A$, $B$ with $|A| = m$ and $|B| = n$, there are
$\binom{n}{n}n^m - \binom{n}{n-1}(n-1)^m + \binom{n}{n-2}(n-2)^m - \cdots + (-1)^{n-2}\binom{n}{2}2^m + (-1)^{n-1}\binom{n}{1}1^m$
$= \sum_{k=0}^{n-1} (-1)^k \binom{n}{n-k}(n-k)^m = \sum_{k=0}^{n} (-1)^k \binom{n}{n-k}(n-k)^m$
onto functions from $A$ to $B$.
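EXAMPLE 5.24
Let $A = \{1, 2, 3, 4, 5, 6, 7\}$ and $B = \{w, x, y, z\}$. Applying the general formula with $m = 7$ and $n = 4$, we find that there are
$\binom{4}{4}4^7 - \binom{4}{3}3^7 + \binom{4}{2}2^7 - \binom{4}{1}1^7 = \sum_{k=0}^{3} (-1)^k \binom{4}{4-k}(4-k)^7 = \sum_{k=0}^{4} (-1)^k \binom{4}{4-k}(4-k)^7 = 8400$ functions from $A$ onto $B$.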
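The general formula is stated here without proof, so a numerical check is reassuring. The sketch below is an illustration with function names of our own. It evaluates the alternating sum and compares it with a brute-force count that lists every function from an m-element set to an n-element set and keeps those whose image has n elements.

```python
from itertools import product
from math import comb


def onto_count(m, n):
    """Alternating sum: sum over k = 0..n of (-1)^k * C(n, n-k) * (n-k)^m."""
    return sum((-1) ** k * comb(n, n - k) * (n - k) ** m for k in range(n + 1))


def onto_count_brute_force(m, n):
    """Count the tuples of images (one per domain element) that use all n values."""
    return sum(len(set(images)) == n for images in product(range(n), repeat=m))


for m, n in [(3, 2), (4, 3), (7, 4), (2, 5)]:
    print(m, n, onto_count(m, n), onto_count_brute_force(m, n))
```

The pairs (3, 2), (4, 3), and (7, 4) correspond to Examples 5.22, 5.23, and 5.24; the pair (2, 5) is a case with m < n, where no onto function exists.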
The result in Example 5.24 is also the answer to the first three questions proposed at the
start of this chapter. Once we remove the unnecessary vocabulary, we recognize that in all
three cases we want to distribute seven different objects into four distinct containers with
no container left empty. We can do this in terms of onto functions.
For Problem 4 we have a sample space $\mathcal{S}$ consisting of the $4^7 = 16{,}384$ ways in which seven people can each select one of the four floors. (Note that $4^7$ is also the total number of functions $f: A \to B$ where $|A| = 7$, $|B| = 4$.) The event that we are concerned with contains 8400 of those selections, so the probability that the elevator must stop at every floor is $8400/16384 \approx 0.5127$, slightly more than half of the time.
Finally, for Problem 5, since $\sum_{k=0}^{n} (-1)^k \binom{n}{n-k}(n-k)^m$ is the number of onto functions $f: A \to B$ for $|A| = m$, $|B| = n$, for the case where $m < n$ there are no such functions and the summation is 0.
Problem 6 will be addressed in Section 5.6.
Before going on to anything new, however, we consider one more problem.
EXAMPLE 5.25
At the CH Company, Joan, the supervisor, has a secretary, Teresa, and three other administrative assistants. If seven accounts must be processed, in how many ways can Joan assign the accounts so that each assistant works on at least one account and Teresa's work includes the most expensive account?
First and foremost, the answer is not 8400 as in Example 5.24. Here we must consider two disjoint subcases and then apply the rule of sum.
a) If Teresa, the secretary, works only on the most expensive account, then the other six accounts can be distributed among the three administrative assistants in $\sum_{k=0}^{3} (-1)^k \binom{3}{3-k}(3-k)^6 = 540$ ways. (540 = the number of onto functions $f: A \to B$ with $|A| = 6$, $|B| = 3$.)
b) If Teresa does more than just the most expensive account, the assignments can be made in $\sum_{k=0}^{4} (-1)^k \binom{4}{4-k}(4-k)^6 = 1560$ ways. (1560 = the number of onto functions $g: C \to D$ with $|C| = 6$, $|D| = 4$.)
Consequently, the assignments can be given under the prescribed conditions in 540 + 1560 = 2100 ways. [We mentioned earlier that the answer would not be 8400, but it is $(1/4)(8400) = (1/|B|)(8400)$, where 8400 is the number of onto functions $f: A \to B$, with $|A| = 7$ and $|B| = 4$. This is no coincidence, as we shall learn when we discuss Theorem 5.3.]
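The two-case argument of Example 5.25 can be verified by listing all $4^7$ assignments. In the sketch below (an illustration only) the numbering is an assumption of ours: account 0 stands for the most expensive account and person 0 stands for Teresa. An assignment is counted when every person receives at least one account and account 0 goes to person 0.

```python
from itertools import product

PEOPLE = range(4)      # person 0 plays the role of Teresa (our labeling)
ACCOUNTS = 7           # account 0 plays the role of the most expensive account

count = 0
for assignment in product(PEOPLE, repeat=ACCOUNTS):
    everyone_busy = len(set(assignment)) == 4     # onto: no one is left out
    teresa_has_top = assignment[0] == 0           # account 0 goes to person 0
    if everyone_busy and teresa_has_top:
        count += 1

print(count, 540 + 1560, 8400 // 4)
```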
We now continue our discussion with the distribution of distinct objects into containers
with none left empty, but now the containers become identical.
EXAMPLE 5.26
If $A = \{a, b, c, d\}$ and $B = \{1, 2, 3\}$, then there are 36 onto functions from $A$ to $B$ or, equivalently, 36 ways to distribute four distinct objects into three distinguishable containers, with no container empty (and no regard for the location of objects in a given container). Among these 36 distributions we find the following collection of six (one of six such possible collections of six):
1) $\{a, b\}_1$  $\{c\}_2$  $\{d\}_3$      2) $\{a, b\}_1$  $\{d\}_2$  $\{c\}_3$
3) $\{c\}_1$  $\{a, b\}_2$  $\{d\}_3$      4) $\{c\}_1$  $\{d\}_2$  $\{a, b\}_3$
5) $\{d\}_1$  $\{a, b\}_2$  $\{c\}_3$      6) $\{d\}_1$  $\{c\}_2$  $\{a, b\}_3$,
where, for example, the notation $\{c\}_2$ means that $c$ is in the second container. Now if we no longer distinguish the containers, these $6 = 3!$ distributions become identical, so there are $36/(3!) = 6$ ways to distribute the distinct objects $a, b, c, d$ among three identical containers, leaving no container empty.
For $m \ge n$ there are $\sum_{k=0}^{n} (-1)^k \binom{n}{n-k}(n-k)^m$ ways to distribute $m$ distinct objects into $n$ numbered (but otherwise identical) containers with no container left empty. Removing the numbers on the containers, so that they are now identical in appearance, we find that one distribution into these $n$ (nonempty) identical containers corresponds with $n!$ such distributions into the numbered containers. So the number of ways in which it is possible to distribute the $m$ distinct objects into $n$ identical containers, with no container left empty, is
$\frac{1}{n!} \sum_{k=0}^{n} (-1)^k \binom{n}{n-k}(n-k)^m$.
This will be denoted by $S(m, n)$ and is called a Stirling number of the second kind.
We note that for $|A| = m \ge n = |B|$, there are $n! \cdot S(m, n)$ onto functions from $A$ to $B$.
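The defining formula for S(m, n) can be evaluated directly. The sketch below is illustrative; stirling2 is a name of our own choosing. It divides the alternating sum by n! using integer division, which is exact here because the sum counts onto functions and each distribution into identical containers accounts for exactly n! of them.

```python
from math import comb, factorial


def stirling2(m, n):
    """S(m, n) = (1/n!) * sum over k = 0..n of (-1)^k * C(n, n-k) * (n-k)^m."""
    onto = sum((-1) ** k * comb(n, n - k) * (n - k) ** m for k in range(n + 1))
    return onto // factorial(n)


print(stirling2(4, 3))                   # distributions in Example 5.26
print(factorial(3) * stirling2(4, 3))    # onto functions from a 4-set to a 3-set
print(factorial(4) * stirling2(7, 4))    # onto functions counted in Example 5.24
```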
Table 5.1 lists some Stirling numbers of the second kind.
Table 5.1
                      S(m, n)
m \ n    1     2     3      4      5     6    7    8
1        1
2        1     1
3        1     3     1
4        1     7     6      1
5        1    15    25     10      1
6        1    31    90     65     15     1
7        1    63   301    350    140    21    1
8        1   127   966   1701   1050   266   28    1

EXAMPLE 5.27
For $m \ge n$, $\sum_{i=1}^{n} S(m, i)$ is the number of possible ways to distribute $m$ distinct objects into $n$ identical containers with empty containers allowed. From the fourth row of Table 5.1 we see that there are 1 + 7 + 6 = 14 ways to distribute the objects $a, b, c, d$ among three identical containers, with some container(s) possibly empty.
We continue now with the derivation of an identity involving Stirling numbers of the
second kind. The proof is combinatorial in nature.
THEOREM 5.3 Let $m$, $n$ be positive integers with $1 < n \le m$. Then
$S(m + 1, n) = S(m, n - 1) + nS(m, n)$.
Proof: Let $A = \{a_1, a_2, \ldots, a_m, a_{m+1}\}$. Then $S(m + 1, n)$ counts the number of ways in which the objects of $A$ can be distributed among $n$ identical containers, with no container left empty.
There are $S(m, n - 1)$ ways of distributing $a_1, a_2, \ldots, a_m$ among $n - 1$ identical containers, with none left empty. Then, placing $a_{m+1}$ in the remaining empty container results in $S(m, n - 1)$ of the distributions counted in $S(m + 1, n)$, namely, those distributions where $a_{m+1}$ is in a container by itself. Alternatively, distributing $a_1, a_2, \ldots, a_m$ among the $n$ identical containers with none left empty, we have $S(m, n)$ distributions. Now, however, for each of these $S(m, n)$ distributions the $n$ containers become distinguished by their contents. Selecting one of the $n$ distinct containers for $a_{m+1}$, we have $nS(m, n)$ distributions of the total $S(m + 1, n)$, namely, those where $a_{m+1}$ is in the same container as another object from $A$. The result then follows by the rule of sum.
To illustrate Theorem 5.3 consider the triangle shown in Table 5.1. Here the largest number corresponds with $S(m + 1, n)$, for $m = 7$ and $n = 3$, and we see that $S(7 + 1, 3) = 966 = 63 + 3(301) = S(7, 2) + 3S(7, 3)$. The identity in Theorem 5.3 can be used to extend Table 5.1 if necessary.
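The identity of Theorem 5.3 gives a convenient way to generate the rows of Table 5.1 one after another. The following sketch is one possible arrangement, not the only one; stirling_table is a helper name of our own. Besides the recurrence it uses the boundary values S(m, 1) = S(m, m) = 1, which can be read from the first and last entries of each row of the table.

```python
def stirling_table(max_m):
    """Rows m = 1..max_m of S(m, n), 1 <= n <= m, built from
    S(m+1, n) = S(m, n-1) + n*S(m, n) with S(m, 1) = S(m, m) = 1."""
    table = {1: [1]}
    for m in range(1, max_m):
        prev = table[m]                 # prev[n - 1] holds S(m, n)
        row = [1]                       # S(m+1, 1) = 1
        for n in range(2, m + 1):
            row.append(prev[n - 2] + n * prev[n - 1])
        row.append(1)                   # S(m+1, m+1) = 1
        table[m + 1] = row
    return table


table = stirling_table(8)
for m in range(1, 9):
    print(m, table[m])
```

Each new row needs only the row before it, so extending the table costs a number of additions and multiplications proportional to the length of the row.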
If we multiply the result in Theorem 5.3 by $(n - 1)!$ we have
$\frac{1}{n}[n!\,S(m + 1, n)] = [(n - 1)!\,S(m, n - 1)] + [n!\,S(m, n)]$.
This new form of the equation tells us something about numbers of onto functions. If $A = \{a_1, a_2, \ldots, a_m, a_{m+1}\}$ and $B = \{b_1, b_2, \ldots, b_{n-1}, b_n\}$ with $m \ge n - 1$, then
$\frac{1}{n}$ (The number of onto functions $h: A \to B$)
= (The number of onto functions $f: A - \{a_{m+1}\} \to B - \{b_n\}$)
+ (The number of onto functions $g: A - \{a_{m+1}\} \to B$).
Thus the relationship at the end of Example 5.25 is not just a coincidence.
We close this section with an application that deals with a counting problem in which the
Stirling numbers of the second kind are used in conjunction with the Fundamental Theorem
of Arithmetic.
EXAMPLE 5.28
Consider the positive integer $30{,}030 = 2 \times 3 \times 5 \times 7 \times 11 \times 13$. Among the unordered factorizations of this number one finds
i) $30 \times 1001 = (2 \times 3 \times 5)(7 \times 11 \times 13)$
ii) $110 \times 273 = (2 \times 5 \times 11)(3 \times 7 \times 13)$
iii) $2310 \times 13 = (2 \times 3 \times 5 \times 7 \times 11)(13)$
iv) $14 \times 33 \times 65 = (2 \times 7)(3 \times 11)(5 \times 13)$
v) $22 \times 35 \times 39 = (2 \times 11)(5 \times 7)(3 \times 13)$
The results given in (i), (ii), and (iii) demonstrate three of the ways to distribute the six distinct objects 2, 3, 5, 7, 11, 13 into two identical containers with no container left empty. So these first three examples are three of the $S(6, 2) = 31$ unordered two-factor factorizations of 30,030; that is, there are $S(6, 2)$ ways to factor 30,030 as $mn$ where $m, n \in \mathbf{Z}^+$ for $1 < m, n < 30{,}030$ and where order is not relevant. Likewise, the results in (iv) and (v) are two of the $S(6, 3) = 90$ unordered ways to factor 30,030 into three integer factors, each greater than 1. If we want at least two factors (greater than 1) in each of these unordered factorizations, then we find that there are $\sum_{i=2}^{6} S(6, i) = 202$ such factorizations. If we want to include the one-factor factorization 30,030 (where we distribute the six distinct objects 2, 3, 5, 7, 11, 13 into one container), then we have 203 such factorizations in total.
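The counts in Example 5.28 can be confirmed without Stirling numbers by generating the factorizations themselves. The sketch below is an illustration under our own naming (factorizations is a hypothetical helper). It lists each unordered factorization once, as a nondecreasing list of factors greater than 1, and then tallies the lists by their number of factors.

```python
from collections import Counter


def factorizations(n, smallest=2):
    """Yield each unordered factorization of n as a nondecreasing list of
    factors, all at least `smallest` (the single factor n is included)."""
    if n >= smallest:
        yield [n]
    d = smallest
    while d * d <= n:                   # the first factor d satisfies d <= n // d
        if n % d == 0:
            for rest in factorizations(n // d, d):
                yield [d] + rest
        d += 1


by_length = Counter(len(f) for f in factorizations(30030))
print(sorted(by_length.items()))        # number of factors -> how many factorizations
print(sum(by_length.values()))          # all factorizations, one-factor case included
```

Requiring the factors to be nondecreasing is what makes each unordered factorization appear exactly once.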
EXERCISES 5.3
1. Give an example of finite sets $A$ and $B$ with $|A|, |B| \ge 4$ and a function $f: A \to B$ such that (a) $f$ is neither one-to-one nor onto; (b) $f$ is one-to-one but not onto; (c) $f$ is onto but not one-to-one; (d) $f$ is onto and one-to-one.
2. For each of the following functions $f: \mathbf{Z} \to \mathbf{Z}$, determine whether the function is one-to-one and whether it is onto. If the function is not onto, determine the range $f(\mathbf{Z})$.
a) $f(x) = x + 7$  b) $f(x) = 2x - 3$  c) $f(x) = -x + 5$  d) $f(x) = x^2$  e) $f(x) = x^2 + x$  f) $f(x) = x^3$
3. For each of the following functions $g: \mathbf{R} \to \mathbf{R}$, determine whether the function is one-to-one and whether it is onto. If the function is not onto, determine the range $g(\mathbf{R})$.
a) $g(x) = x + 7$  b) $g(x) = 2x - 3$  c) $g(x) = -x + 5$  d) $g(x) = x^2$  e) $g(x) = x^2 + x$  f) $g(x) = x^3$
4. Let $A = \{1, 2, 3, 4\}$ and $B = \{1, 2, 3, 4, 5, 6\}$. (a) How many functions are there from $A$ to $B$? How many of these are one-to-one? How many are onto? (b) How many functions are there from $B$ to $A$? How many of these are onto? How many are one-to-one?
5. Verify that $\sum_{k=0}^{n} (-1)^k \binom{n}{n-k}(n-k)^m = 0$ for $n = 5$ and $m = 2, 3, 4$.
6. a) Verify that $5^7 = \sum_{i=1}^{5} \binom{5}{i}(i!)\,S(7, i)$.
b) Provide a combinatorial argument to prove that for all $m, n \in \mathbf{Z}^+$,
$$m^n = \sum_{i=1}^{m} \binom{m}{i}(i!)\,S(n, i).$$
7. a) Let $A = \{1, 2, 3, 4, 5, 6, 7\}$ and $B = \{v, w, x, y, z\}$. Determine the number of functions $f\colon A \to B$ where (i) $f(A) = \{v, x\}$; (ii) $|f(A)| = 2$; (iii) $f(A) = \{w, x, y\}$; (iv) $|f(A)| = 3$; (v) $f(A) = \{v, x, y, z\}$; and (vi) $|f(A)| = 4$.
b) Let $A$, $B$ be sets with $|A| = m \ge n = |B|$. If $k \in \mathbf{Z}^+$ with $1 \le k \le n$, how many functions $f\colon A \to B$ are such that $|f(A)| = k$?
8. A chemist who has five assistants is engaged in a research project that calls for nine compounds that must be synthesized. In how many ways can the chemist assign these syntheses to the five assistants so that each is working on at least one synthesis?
9. Use the fact that every polynomial equation having real-number coefficients and odd degree has a real root in order to show that the function $f\colon \mathbf{R} \to \mathbf{R}$, defined by $f(x) = x^5 - 2x^2 + x$, is an onto function. Is $f$ one-to-one?
10. Suppose we have seven different colored balls and four containers numbered I, II, III, and IV. (a) In how many ways can we distribute the balls so that no container is left empty? (b) In this collection of seven colored balls, one of them is blue. In how many ways can we distribute the balls so that no container is empty and the blue ball is in container II? (c) If we remove the numbers from the containers so that we can no longer distinguish them, in how many ways can we distribute the seven colored balls among the four identical containers, with some container(s) possibly empty?
11. Determine the next two rows ($m = 9, 10$) of Table 5.1 for the Stirling numbers $S(m, n)$, where $1 \le n \le m$.
12. a) In how many ways can 31,100,905 be factored into three factors, each greater than 1, if the order of the factors is not relevant?
b) Answer part (a), assuming the order of the three factors is relevant.
c) In how many ways can one factor 31,100,905 into two or more factors where each factor is greater than 1 and no regard is paid to the order of the factors?
d) Answer part (c), assuming the order of the factors is to be taken into consideration.
13. a) How many two-factor unordered factorizations, where each factor is greater than 1, are there for 156,009?
b) In how many ways can 156,009 be factored into two or more factors, each greater than 1, with no regard to the order of the factors?
c) Let $p_1, p_2, p_3, \ldots, p_n$ be $n$ distinct primes. In how many ways can one factor the product $\prod_{i=1}^{n} p_i$ into two or more factors, each greater than 1, where the order of the factors is not relevant?
14. Write a computer program (or develop an algorithm) to compute the Stirling numbers $S(m, n)$ when $1 \le m \le 12$ and $1 \le n \le m$.
15. A lock has $n$ buttons labeled $1, 2, \ldots, n$. To open this lock we press each of the $n$ buttons exactly once. If no two or more buttons may be pressed simultaneously, then there are $n!$ ways to do this. However, if one may press two or more buttons simultaneously, then there are more than $n!$ ways to press all of the buttons. For instance, if $n = 3$ there are six ways to press the buttons one at a time. But if one may also press two or more buttons simultaneously, then we find 13 cases, namely,
(1) 1, 2, 3        (2) 1, 3, 2        (3) 2, 1, 3
(4) 2, 3, 1        (5) 3, 1, 2        (6) 3, 2, 1
(7) {1, 2}, 3      (8) 3, {1, 2}      (9) {1, 3}, 2
(10) 2, {1, 3}     (11) {2, 3}, 1     (12) 1, {2, 3}
(13) {1, 2, 3}.
[Here, for example, case (12) indicates that one presses button 1 first and then buttons 2, 3 (together) second.] (a) How many ways are there to press the buttons when $n = 4$? $n = 5$? How many for $n$ in general? (b) Suppose a lock has 15 buttons. To open this lock one must press 12 different buttons (one at a time, or simultaneously in sets of two or more). In how many ways can this be done?
16. At St. Xavier High School ten candidates $C_1, C_2, \ldots, C_{10}$ run for senior class president.
a) How many outcomes are possible where (i) there are no ties (that is, no two, or more, candidates receive the same number of votes)? (ii) ties are permitted? [Here we may have an outcome such as $\{C_2, C_3, C_7\}$, $\{C_1, C_4, C_9, C_{10}\}$, $\{C_5\}$, $\{C_6, C_8\}$, where $C_2, C_3, C_7$ tie for first place, $C_1, C_4, C_9, C_{10}$ tie for fourth place, $C_5$ is in eighth place, and $C_6, C_8$ are tied for ninth place.] (iii) three candidates tie for first place (and other ties are permitted)?
b) How many of the outcomes in section (iii) of part (a) have $C_3$ as one of the first-place candidates?
c) How many outcomes have $C_3$ in first place (alone, or tied with others)?
17. For $m, n, r \in \mathbf{Z}^+$ with $m \ge rn$, let $S_r(m, n)$ denote the number of ways to distribute $m$ distinct objects among $n$ identical containers where each container receives at least $r$ of the objects. Verify that
$$S_r(m + 1, n) = n\,S_r(m, n) + \binom{m}{r-1} S_r(m + 1 - r, n - 1).$$
18. We use $s(m, n)$ to denote the number of ways to seat $m$ people at $n$ circular tables with at least one person at each table. The arrangements at any one table are not distinguished if one can be rotated into another (as in Example 1.16). The ordering of the tables is not taken into account. For instance, the arrangements in parts (a), (b), (c) of Fig. 5.6 are considered the same; those in parts (a), (d), (e) are distinct (in pairs).
The numbers $s(m, n)$ are referred to as the Stirling numbers of the first kind.
a) If $n > m$, what is $s(m, n)$?
b) For $m \ge 1$, what are $s(m, m)$ and $s(m, 1)$?
c) Determine $s(m, m - 1)$ for $m \ge 2$.
d) Show that for $m \ge 3$,
$$s(m, m - 2) = \left(\tfrac{1}{24}\right) m(m - 1)(m - 2)(3m - 1).$$
19. As in the previous exercise, $s(m, n)$ denotes a Stirling number of the first kind.
a) For $m \ge n > 1$ prove that
$$s(m, n) = (m - 1)\,s(m - 1, n) + s(m - 1, n - 1).$$
b) Verify that for $m \ge 2$,
$$s(m, 2) = (m - 1)! \sum_{i=1}^{m-1} \frac{1}{i}.$$
Figure 5.6 [seating arrangements at circular tables, parts (a) through (e)]
5.4
Special Functions
In Section 2 of Chapter 3 we mentioned that addition is a closed binary operation on the
set $\mathbf{Z}^+$, whereas $\cap$ is a closed binary operation on $\mathcal{P}(\mathcal{U})$ for any given universe $\mathcal{U}$. We also
noted in that section that “taking the minus” of an integer is a unary operation on Z. Now it
is time to make these notions of (closed) binary and unary operations more precise in terms
of functions.
Definition 5.10 For any nonempty sets $A$, $B$, any function $f\colon A \times A \to B$ is called a binary operation on $A$. If $B \subseteq A$, then the binary operation is said to be closed (on $A$). (When $B \subseteq A$ we may also say that $A$ is closed under $f$.)
Definition 5.11 A function $g\colon A \to A$ is called a unary, or monary, operation on $A$.
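EXAMPLE 5.29
a) The function $f\colon \mathbf{Z} \times \mathbf{Z} \to \mathbf{Z}$, defined by $f(a, b) = a - b$, is a closed binary operation on $\mathbf{Z}$.
b) If $g\colon \mathbf{Z}^+ \times \mathbf{Z}^+ \to \mathbf{Z}$ is the function where $g(a, b) = a - b$, then $g$ is a binary operation on $\mathbf{Z}^+$, but it is not closed. For example, we find that $3, 7 \in \mathbf{Z}^+$, but $g(3, 7) = 3 - 7 = -4 \notin \mathbf{Z}^+$.
c) The function $h\colon \mathbf{R}^+ \to \mathbf{R}^+$ defined by $h(a) = 1/a$ is a unary operation on $\mathbf{R}^+$.
EXAMPLE 5.30
Let $\mathcal{U}$ be a universe, and let $A, B \subseteq \mathcal{U}$. (a) If $f\colon \mathcal{P}(\mathcal{U}) \times \mathcal{P}(\mathcal{U}) \to \mathcal{P}(\mathcal{U})$ is defined by $f(A, B) = A \cup B$, then $f$ is a closed binary operation on $\mathcal{P}(\mathcal{U})$. (b) The function $g\colon \mathcal{P}(\mathcal{U}) \to \mathcal{P}(\mathcal{U})$ defined by $g(A) = \overline{A}$ is a unary operation on $\mathcal{P}(\mathcal{U})$.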
Definition 5.12 Let $f\colon A \times A \to B$; that is, $f$ is a binary operation on $A$.
a) $f$ is said to be commutative if $f(a, b) = f(b, a)$ for all $(a, b) \in A \times A$.
b) When $B \subseteq A$ (that is, when $f$ is closed), $f$ is said to be associative if for all $a, b, c \in A$, $f(f(a, b), c) = f(a, f(b, c))$.
EXAMPLE 5.31
The binary operation of Example 5.30 is commutative and associative, whereas the binary operation in part (a) of Example 5.29 is neither.
EXAMPLE 5.32
a) Define the closed binary operation $f\colon \mathbf{Z} \times \mathbf{Z} \to \mathbf{Z}$ by $f(a, b) = a + b - 3ab$. Since both the addition and the multiplication of integers are commutative binary operations, it follows that
$$f(a, b) = a + b - 3ab = b + a - 3ba = f(b, a),$$
so $f$ is commutative.
To determine whether $f$ is associative, consider $a, b, c \in \mathbf{Z}$. Then $f(a, b) = a + b - 3ab$ and
$$\begin{aligned}
f(f(a, b), c) &= f(a, b) + c - 3f(a, b)c \\
&= (a + b - 3ab) + c - 3(a + b - 3ab)c \\
&= a + b + c - 3ab - 3ac - 3bc + 9abc,
\end{aligned}$$
whereas $f(b, c) = b + c - 3bc$ and
$$\begin{aligned}
f(a, f(b, c)) &= a + f(b, c) - 3af(b, c) \\
&= a + (b + c - 3bc) - 3a(b + c - 3bc) \\
&= a + b + c - 3ab - 3ac - 3bc + 9abc.
\end{aligned}$$
Since $f(f(a, b), c) = f(a, f(b, c))$ for all $a, b, c \in \mathbf{Z}$, the closed binary operation $f$ is associative as well as commutative.
b) Consider the closed binary operation $h\colon \mathbf{Z} \times \mathbf{Z} \to \mathbf{Z}$, where $h(a, b) = a|b|$. Then $h(3, -2) = 3|-2| = 3(2) = 6$, but $h(-2, 3) = -2|3| = -6$. Consequently, $h$ is not commutative. However, with regard to the associative property, if $a, b, c \in \mathbf{Z}$, we find that
$$h(h(a, b), c) = h(a, b)|c| = a|b||c| \quad\text{and}\quad h(a, h(b, c)) = a|h(b, c)| = a\bigl|b|c|\bigr| = a|b||c|,$$
so the closed binary operation $h$ is associative.
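The two operations of Example 5.32 can also be spot-checked numerically. The following is a minimal sketch, not a proof: it only tests the integers in a small window (the range from -5 to 5 is an arbitrary choice), and the helper names `is_commutative` and `is_associative` are introduced here for illustration.

```python
from itertools import product

def f(a, b):
    # f(a, b) = a + b - 3ab from part (a) of Example 5.32
    return a + b - 3 * a * b

def h(a, b):
    # h(a, b) = a|b| from part (b) of Example 5.32
    return a * abs(b)

def is_commutative(op, sample):
    # True if op(a, b) == op(b, a) for every pair drawn from the sample
    return all(op(a, b) == op(b, a) for a, b in product(sample, repeat=2))

def is_associative(op, sample):
    # True if op(op(a, b), c) == op(a, op(b, c)) for every triple in the sample
    return all(
        op(op(a, b), c) == op(a, op(b, c))
        for a, b, c in product(sample, repeat=3)
    )

SAMPLE = range(-5, 6)  # assumed test window; a finite check, not a proof

print(is_commutative(f, SAMPLE), is_associative(f, SAMPLE))
print(is_commutative(h, SAMPLE), is_associative(h, SAMPLE))
```

A failed check on the sample is a genuine counterexample, whereas a passed check only suggests the property; the algebraic argument in the example is what establishes it for all integers.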
EXAMPLE 5.33
If $A = \{a, b, c, d\}$, then $|A \times A| = 16$. Consequently, there are $4^{16}$ functions $f\colon A \times A \to A$; that is, there are $4^{16}$ closed binary operations on $A$.
To determine the number of commutative closed binary operations $g$ on $A$, we realize that there are four choices for each of the assignments $g(a, a)$, $g(b, b)$, $g(c, c)$, and $g(d, d)$. We are then left with the $4^2 - 4 = 16 - 4 = 12$ other ordered pairs (in $A \times A$) of the form $(x, y)$, $x \ne y$. These 12 ordered pairs must be considered in sets of two in order to ensure commutativity. For example, we need $g(a, b) = g(b, a)$ and may select any one of the four elements of $A$ for $g(a, b)$. But then this choice must also be assigned to $g(b, a)$. Therefore, since there are four choices for each of these $12/2 = 6$ sets of two ordered pairs, we find that the number of commutative closed binary operations $g$ on $A$ is $4^4 \cdot 4^6 = 4^{10}$.
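The counting argument of Example 5.33 can be checked by brute force on a smaller set. The sketch below assumes a three-element set (enumerating all $4^{16}$ tables for the four-element set would take far too long); the function name `count_commutative_operations` is hypothetical. For $|A| = n$ the same reasoning predicts $n^{n^2}$ closed binary operations, of which $n^{n} \cdot n^{(n^2 - n)/2}$ are commutative.

```python
from itertools import product

def count_commutative_operations(elements):
    """Enumerate every closed binary operation on `elements` (as a lookup
    table) and count how many of them are commutative."""
    pairs = list(product(elements, repeat=2))
    total = 0
    commutative = 0
    for values in product(elements, repeat=len(pairs)):
        table = dict(zip(pairs, values))  # table[(a, b)] plays the role of g(a, b)
        total += 1
        if all(table[(a, b)] == table[(b, a)] for a, b in pairs):
            commutative += 1
    return total, commutative

total, commutative = count_commutative_operations(["a", "b", "c"])
print(total, commutative)
```

With three elements the prediction is $3^9$ operations in all and $3^3 \cdot 3^3 = 3^6$ commutative ones, which is what the enumeration should report.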
Definition 5.13 Let $f\colon A \times A \to B$ be a binary operation on $A$. An element $x \in A$ is called an identity (or identity element) for $f$ if $f(a, x) = f(x, a) = a$, for all $a \in A$.
EXAMPLE 5.34
a) Consider the (closed) binary operation $f\colon \mathbf{Z} \times \mathbf{Z} \to \mathbf{Z}$, where $f(a, b) = a + b$. Here the integer 0 is an identity since $f(a, 0) = a + 0 = 0 + a = f(0, a) = a$, for each integer $a$.
b) We find that there is no identity for the function in part (a) of Example 5.29. For if $f$ had an identity $x$, then for any $a \in \mathbf{Z}$, $f(a, x) = a \Rightarrow a - x = a \Rightarrow x = 0$. But then $f(x, a) = f(0, a) = 0 - a \ne a$, unless $a = 0$.
c) Let $A = \{1, 2, 3, 4, 5, 6, 7\}$, and let $g\colon A \times A \to A$ be the (closed) binary operation defined by $g(a, b) = \min\{a, b\}$, that is, the minimum (or smaller) of $a$, $b$. This binary operation is commutative and associative, and for any $a \in A$ we have $g(a, 7) = \min\{a, 7\} = a = \min\{7, a\} = g(7, a)$. So 7 is an identity element for $g$.
In parts (a) and (c) of Example 5.34 we examined two (closed) binary operations, each
of which has an identity. Part (b) of that example showed that such an operation need not
have an identity element. Could a binary operation have more than one identity? We find
that the answer is no when we consider the following theorem.
THEOREM 5.4 Let $f\colon A \times A \to B$ be a binary operation. If $f$ has an identity, then that identity is unique.
Proof: If $f$ has more than one identity, let $x_1, x_2 \in A$ with
$$f(a, x_1) = a = f(x_1, a), \text{ for all } a \in A, \text{ and}$$
$$f(a, x_2) = a = f(x_2, a), \text{ for all } a \in A.$$
Consider $x_1$ as an element of $A$ and $x_2$ as an identity. Then $f(x_1, x_2) = x_1$. Now reverse the roles of $x_1$ and $x_2$; that is, consider $x_2$ as an element of $A$ and $x_1$ as an identity. We find that $f(x_1, x_2) = x_2$. Consequently, $x_1 = x_2$, and $f$ has at most one identity.
Now that we have settled the issue of the uniqueness of the identity element, let us see
how this type of element enters into one more enumeration problem.
EXAMPLE 5.35
If $A = \{x, a, b, c, d\}$, how many closed binary operations on $A$ have $x$ as the identity?
Let $f\colon A \times A \to A$ with $f(x, y) = y = f(y, x)$ for all $y \in A$. Then we may represent $f$ by a table as in Table 5.2. Here the nine values, where $x$ is the first component (as in $(x, c)$) or the second component (as in $(d, x)$), are determined by the fact that $x$ is the identity element. Each of the 16 remaining (vacant) entries in Table 5.2 can be filled with any one of the five elements in $A$.
Table 5.2 (a dash marks a vacant entry)
f | x  a  b  c  d
--+---------------
x | x  a  b  c  d
a | a  -  -  -  -
b | b  -  -  -  -
c | c  -  -  -  -
d | d  -  -  -  -
Hence there are $5^{16}$ closed binary operations on $A$ where $x$ is the identity. Of these, $5^{10} = 5^4 \cdot 5^{(4^2 - 4)/2}$ are commutative. We also realize that there are $5^{16}$ closed binary operations on $A$ where $b$ is the identity. So there are $5^{17} = \binom{5}{1}5^{16} = \binom{5}{1}5^{5^2 - [2(5) - 1]} = \binom{5}{1}5^{(5-1)^2}$ closed binary operations on $A$ that have an identity, and of these $5^{11} = \binom{5}{1}5^{10} = \binom{5}{1}5^4 \cdot 5^{(4^2 - 4)/2}$ are commutative.
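The counts in Example 5.35 can be verified by exhaustive search on a smaller set. This sketch assumes a three-element set, for which the same argument predicts $3 \cdot 3^{(3-1)^2} = 243$ closed binary operations with an identity and $3 \cdot 3^{2} \cdot 3^{(2^2 - 2)/2} = 81$ commutative ones; the function name `count_with_identity` is introduced here for illustration only.

```python
from itertools import product

def count_with_identity(elements):
    """Return (operations having an identity, those that are also commutative)
    among all closed binary operations on `elements`."""
    pairs = list(product(elements, repeat=2))
    with_identity = 0
    commutative_with_identity = 0
    for values in product(elements, repeat=len(pairs)):
        table = dict(zip(pairs, values))
        # x is an identity when table[(a, x)] == a == table[(x, a)] for all a
        has_identity = any(
            all(table[(a, x)] == a == table[(x, a)] for a in elements)
            for x in elements
        )
        if has_identity:
            with_identity += 1
            if all(table[(a, b)] == table[(b, a)] for a, b in pairs):
                commutative_with_identity += 1
    return with_identity, commutative_with_identity

print(count_with_identity(["x", "a", "b"]))
```

Because an identity is unique (Theorem 5.4), no operation is counted twice when the five (here three) choices of identity are added together.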
Having seen several examples of functions (in Examples 5.16(b), 5.29, 5.30, 5.32, 5.33,
5.34, and 5.35) where the domain is a cross product of sets, we now investigate functions
where the domain is a subset of a cross product.
Definition 5.14 For sets $A$ and $B$, if $D \subseteq A \times B$, then $\pi_A\colon D \to A$, defined by $\pi_A(a, b) = a$, is called the projection on the first coordinate. The function $\pi_B\colon D \to B$, defined by $\pi_B(a, b) = b$, is called the projection on the second coordinate.
We note that if $D = A \times B$ then $\pi_A$ and $\pi_B$ are both onto.
EXAMPLE 5.36
If $A = \{w, x, y\}$ and $B = \{1, 2, 3, 4\}$, let $D = \{(x, 1), (x, 2), (x, 3), (y, 1), (y, 4)\}$. Then the projection $\pi_A\colon D \to A$ satisfies $\pi_A(x, 1) = \pi_A(x, 2) = \pi_A(x, 3) = x$, and $\pi_A(y, 1) = \pi_A(y, 4) = y$. Since $\pi_A(D) = \{x, y\} \subset A$, this function is not onto.
For $\pi_B\colon D \to B$ we find that $\pi_B(x, 1) = \pi_B(y, 1) = 1$, $\pi_B(x, 2) = 2$, $\pi_B(x, 3) = 3$, and $\pi_B(y, 4) = 4$, so $\pi_B(D) = B$ and this projection is an onto function.
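The projections of Example 5.36 are easy to compute directly. The following minimal sketch represents $D$ as a Python set of pairs; the names `pi_A`, `pi_B`, and `image` are illustrative choices, not notation from the text.

```python
A = {"w", "x", "y"}
B = {1, 2, 3, 4}
D = {("x", 1), ("x", 2), ("x", 3), ("y", 1), ("y", 4)}

def pi_A(pair):
    # projection on the first coordinate
    return pair[0]

def pi_B(pair):
    # projection on the second coordinate
    return pair[1]

def image(func, domain):
    # the set of values taken by func on the given domain
    return {func(element) for element in domain}

print(image(pi_A, D) == A)  # is pi_A onto?
print(image(pi_B, D) == B)  # is pi_B onto?
```

Comparing the image with the codomain is exactly the onto test used in the example: `image(pi_A, D)` misses `"w"`, while `image(pi_B, D)` is all of `B`.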
EXAMPLE 5.37
Let $A = B = \mathbf{R}$ and consider the set $D \subseteq A \times B$ where $D = \{(x, y) \mid y = x^2\}$. Then $D$ represents the subset of the Euclidean plane that contains the points on the parabola $y = x^2$.
Among the infinite number of points in $D$ we find the point $(3, 9)$. Here $\pi_A(3, 9) = 3$, the $x$-coordinate of $(3, 9)$, whereas $\pi_B(3, 9) = 9$, the $y$-coordinate of the point.
For this example, $\pi_A(D) = \mathbf{R} = A$, so $\pi_A$ is onto. (The projection $\pi_A$ is also one-to-one.) However, $\pi_B(D) = [0, +\infty) \subset \mathbf{R}$, so $\pi_B$ is not onto. [Nor is it one-to-one; for example, $\pi_B(2, 4) = 4 = \pi_B(-2, 4)$.]
We now extend the notion of projection as follows. Let $A_1, A_2, \ldots, A_n$ be sets, and $\{i_1, i_2, \ldots, i_m\} \subseteq \{1, 2, \ldots, n\}$ with $i_1 < i_2 < \cdots < i_m$ and $m \le n$. If $D \subseteq A_1 \times A_2 \times \cdots \times A_n = \times_{i=1}^{n} A_i$, then the function $\pi\colon D \to A_{i_1} \times A_{i_2} \times \cdots \times A_{i_m}$ defined by $\pi(a_1, a_2, \ldots, a_n) = (a_{i_1}, a_{i_2}, \ldots, a_{i_m})$ is the projection of $D$ on the $i_1$th, $i_2$th, $\ldots$, $i_m$th coordinates. The elements of $D$ are called (ordered) $n$-tuples; an element in $\pi(D)$ is an (ordered) $m$-tuple.
These projections arise in a natural way in the study of relational data bases, a standard
technique for organizing and describing large quantities of data by modern large-scale
computing systems. In situations like credit card transactions, not only must existing data
be organized but new data must be inserted, as when credit cards are processed for new
cardholders. When bills on existing accounts are paid, or when new purchases are made on
these accounts, data must be updated. Another example arises when records are searched
for special considerations, as when a college admissions office searches educational records
seeking, for its mailing lists, high school students who have demonstrated certain levels of
mathematical achievement.
The following example demonstrates the use of projections in a method for organizing
and describing data on a somewhat smaller scale.
EXAMPLE 5.38
At a certain university the following sets are related for purposes of registration:
$A_1$ = the set of course numbers for courses offered in mathematics.
$A_2$ = the set of course titles offered in mathematics.
$A_3$ = the set of mathematics faculty.
$A_4$ = the set of letters of the alphabet.
Consider the table, or relation,† $D \subseteq A_1 \times A_2 \times A_3 \times A_4$ given in Table 5.3.
Table 5.3
Course Number   Course Title   Professor      Section Letter
MA 111          Calculus I     P. Z. Chinn    A
MA 111          Calculus I     V. Larney      B
MA 112          Calculus II    J. Kinney      A
MA 112          Calculus II    A. Schmidt     B
MA 112          Calculus II    R. Mines       C
MA 113          Calculus III   J. Kinney      A
The sets $A_1, A_2, A_3, A_4$ are called the domains of the relational data base, and table $D$ is said to have degree 4. Each element of $D$ is often called a list.
The projection of $D$ on $A_1 \times A_3 \times A_4$ is shown in Table 5.4. Table 5.5 shows the results for the projection of $D$ on $A_1 \times A_2$.
Table 5.4
Course Number   Professor      Section Letter
MA 111          P. Z. Chinn    A
MA 111          V. Larney      B
MA 112          J. Kinney      A
MA 112          A. Schmidt     B
MA 112          R. Mines       C
MA 113          J. Kinney      A
Table 5.5
Course Number   Course Title
MA 111          Calculus I
MA 112          Calculus II
MA 113          Calculus III
†Here the relation $D$ is not binary. In fact, $D$ is a quaternary relation.
Tables 5.4 and 5.5 are another way of representing the same data that appear in Table 5.3. Given Tables 5.4 and 5.5, one can recapture Table 5.3.
The theory of relational data bases is concerned with representing data in different ways and with the operations, such as projections, needed for such representations. The computer implementation of such techniques is also considered. More on this topic is mentioned in the exercises and chapter references.
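The remark that Tables 5.4 and 5.5 together recapture Table 5.3 can be illustrated with a small sketch. It stores each list of $D$ as a Python tuple, projects on chosen coordinate positions, and rebuilds the original table by matching course numbers. The helper name `project` and the rejoining step are assumptions made for this illustration (the title for MA 113 is taken from Table 5.5); they are not a description of any particular data base system.

```python
# Each list of D is (course number, course title, professor, section letter).
TABLE_5_3 = {
    ("MA 111", "Calculus I", "P. Z. Chinn", "A"),
    ("MA 111", "Calculus I", "V. Larney", "B"),
    ("MA 112", "Calculus II", "J. Kinney", "A"),
    ("MA 112", "Calculus II", "A. Schmidt", "B"),
    ("MA 112", "Calculus II", "R. Mines", "C"),
    ("MA 113", "Calculus III", "J. Kinney", "A"),  # title as listed in Table 5.5
}

def project(table, positions):
    # Keep only the coordinates at the given (0-based) positions.
    return {tuple(row[i] for i in positions) for row in table}

table_5_4 = project(TABLE_5_3, (0, 2, 3))  # projection on A1 x A3 x A4
table_5_5 = project(TABLE_5_3, (0, 1))     # projection on A1 x A2

# Recombine the two projections by matching on the course number.
rejoined = {
    (number, title, professor, section)
    for (number, professor, section) in table_5_4
    for (other_number, title) in table_5_5
    if number == other_number
}

print(rejoined == TABLE_5_3)
```

The recombination succeeds here because each course number determines a single course title; with a different table the two projections might not suffice to rebuild the original.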
EXERCISES 5.4
1. For $A = \{a, b, c\}$, let $f\colon A \times A \to A$ be the closed binary operation given in Table 5.6. Give an example to show that $f$ is not associative.
Table 5.6
f | a  b  c
--+---------
a | b  a  c
b | a  c  b
c | c  b  a
2. Let $f\colon \mathbf{R} \times \mathbf{R} \to \mathbf{Z}$ be the closed binary operation defined by $f(a, b) = \lceil a + b \rceil$. (a) Is $f$ commutative? (b) Is $f$ associative? (c) Does $f$ have an identity element?
3. Each of the following functions $f\colon \mathbf{Z} \times \mathbf{Z} \to \mathbf{Z}$ is a closed binary operation on $\mathbf{Z}$. Determine in each case whether $f$ is commutative and/or associative.
a) $f(x, y) = x + y - xy$
b) $f(x, y) = \max\{x, y\}$, the maximum (or larger) of $x$, $y$
c) $f(x, y) = x^y$
d) $f(x, y) = x + y - 3$
4. Which of the closed binary operations in Exercise 3 have an identity?
5. Let $|A| = 5$. (a) What is $|A \times A|$? (b) How many functions $f\colon A \times A \to A$ are there? (c) How many closed binary operations are there on $A$? (d) How many of these closed binary operations are commutative?
6. Let $A = \{x, a, b, c, d\}$.
a) How many closed binary operations $f$ on $A$ satisfy $f(a, b) = c$?
b) How many of the functions $f$ in part (a) have $x$ as an identity?
c) How many of the functions $f$ in part (a) have an identity?
d) How many of the functions $f$ in part (c) are commutative?
7. Let $f\colon \mathbf{Z}^+ \times \mathbf{Z}^+ \to \mathbf{Z}^+$ be the closed binary operation defined by $f(a, b) = \gcd(a, b)$. (a) Is $f$ commutative? (b) Is $f$ associative? (c) Does $f$ have an identity element?
8. Let $A = \{2, 4, 8, 16, 32\}$, and consider the closed binary operation $f\colon A \times A \to A$ where $f(a, b) = \gcd(a, b)$. Does $f$ have an identity element?
9. For distinct primes $p$, $q$ let $A = \{p^m q^n \mid 0 \le m \le 31,\ 0 \le n \le 37\}$. (a) What is $|A|$? (b) If $f\colon A \times A \to A$ is the closed binary operation defined by $f(a, b) = \gcd(a, b)$, does $f$ have an identity element?
10. State a result that generalizes the ideas presented in the previous two exercises.
11. For $\emptyset \ne A \subseteq \mathbf{Z}^+$, let $f, g\colon A \times A \to A$ be the closed binary operations defined by $f(a, b) = \min\{a, b\}$ and $g(a, b) = \max\{a, b\}$. Does $f$ have an identity element? Does $g$?
12. Let $A = B = \mathbf{R}$. Determine $\pi_A(D)$ and $\pi_B(D)$ for each of the following sets $D \subseteq A \times B$.
a) $D = \{(x, y) \mid x = y^2\}$
b) $D = \{(x, y) \mid y = \sin x\}$
c) $D = \{(x, y) \mid x^2 + y^2 = 1\}$
13. Let $A_i$, $1 \le i \le 5$, be the domains for a table $D \subseteq A_1 \times A_2 \times A_3 \times A_4 \times A_5$, where $A_1 = \{U, V, W, X, Y, Z\}$ (used as code names for different cereals in a test), and $A_2 = A_3 = A_4 = A_5 = \mathbf{Z}^+$. The table $D$ is given as Table 5.7.
a) What is the degree of the table?
b) Find the projection of $D$ on $A_3 \times A_4 \times A_5$.
c) A domain of a table is called a primary key for the table if its value uniquely identifies each list of $D$. Determine the primary key(s) for this table.
14. Let $A_i$, $1 \le i \le 5$, be the domains for a table $D \subseteq A_1 \times A_2 \times A_3 \times A_4 \times A_5$, where $A_1 = \{1, 2\}$ (used to identify the daily vitamin capsule produced by two pharmaceutical companies), $A_2 = \{A, D, E\}$, and $A_3 = A_4 = A_5 = \mathbf{Z}^+$. The table $D$ is given as Table 5.8.
a) What is the degree of the table?
b) What is the projection of $D$ on $A_1 \times A_2$? on $A_3 \times A_4 \times A_5$?
c) This table has no primary key. (See Exercise 13.) We can, however, define a composite primary key as the cross product of a minimal number of domains of the table, whose components, taken collectively, uniquely identify each list of $D$. Determine some composite primary keys for this table.
5.5 The Pigeonhole Principle 273
Table 5.7
Code Name   | Grams of Sugar   | % of RDA* of Vitamin A | % of RDA of Vitamin C | % of RDA of Protein
of Cereal   | per 1-oz Serving | per 1-oz Serving       | per 1-oz Serving      | per 1-oz Serving
U           | 1                | 25                     | 25                    | 6
V           | 7                | 25                     | 2                     | 4
W           | 12               | 25                     | 2                     | 4
X           | 0                | 60                     | 40                    | 20
Y           | 3                | 25                     | 40                    | 10
Z           | 2                | 25                     | 40                    | 10
*RDA = recommended daily allowance
Table 5.8
Vitamin   | Vitamin Present | Amount of Vitamin  | Dosage:        | No. of Capsules
Capsule   | in Capsule      | in Capsule in IU†  | Capsules / Day | per Bottle
1         | A               | 10,000             | 1              | 100
1         | D               | 400                | 1              | 100
1         | E               | 30                 | 1              | 100
2         | A               | 4,000              | 1              | 250
2         | D               | 400                | 1              | 250
2         | E               | 15                 | 1              | 250
†IU = international units
5.5
The Pigeonhole Principle
A change of pace is in order as we introduce an interesting distribution principle. This
principle may seem to have nothing in common with what we have been doing so far, but
it will prove to be helpful nonetheless.
In mathematics one sometimes finds that an almost obvious idea, when applied in a
rather subtle manner, is the key needed to solve a troublesome problem. On the list of such
obvious ideas many would undoubtedly place the following rule, known as the pigeonhole
principle.
The Pigeonhole Principle: If m pigeons occupy n pigeonholes and m > n, then at
least one pigeonhole has two or more pigeons roosting in it.
One situation for 6 (= m) pigeons and 4 (= n) pigeonholes (actually birdhouses) is shown
in Fig. 5.7. The general result readily follows by the method of proof by contradiction. If
the result is not true, then each pigeonhole has at most one pigeon roosting in it—for a
total of at most n (< m) pigeons. (Somewhere we have lost at least m — n pigeons!)
But now what can pigeons roosting in pigeonholes have to do with mathematics—
discrete, combinatorial, or otherwise? Actually, this principle can be applied in various
problems in which we seek to establish whether a certain situation can actually occur. We
illustrate this principle in the following examples and shall find it useful in Section 5.6 and at other points in the text.
Figure 5.7
EXAMPLE 5.39
An office employs 13 file clerks, so at least two of them must have birthdays during the same month. Here we have 13 pigeons (the file clerks) and 12 pigeonholes (the months of the year).
Here is a second rather immediate application of our principle.
EXAMPLE 5.40
Larry returns from the laundromat with 12 pairs of socks (each pair a different color) in a laundry bag. Drawing the socks from the bag randomly, he'll have to draw at most 13 of them to get a matched pair.
From this point on, application of the pigeonhole principle may be more subtle.
EXAMPLE 5.41
Wilma operates a computer with a magnetic tape drive. One day she is given a tape that contains 500,000 "words" of four or fewer lowercase letters. (Consecutive words on the tape are separated by a blank character.) Can it be that the 500,000 words are all distinct?
From the rules of sum and product, the total number of different possible words, using four or fewer letters, is
$$26^4 + 26^3 + 26^2 + 26 = 475{,}254.$$
With these 475,254 words as the pigeonholes, and the 500,000 words on the tape as the pigeons, it follows that at least one word is repeated on the tape.
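The count in Example 5.41 is a one-line computation. The sketch below simply evaluates the sum and compares it with the number of words on the tape; the variable names are illustrative.

```python
ALPHABET_SIZE = 26
MAX_LENGTH = 4
WORDS_ON_TAPE = 500_000

# Rule of product for each length, rule of sum over the lengths 1 through 4.
possible_words = sum(ALPHABET_SIZE ** k for k in range(1, MAX_LENGTH + 1))

# More pigeons (words on the tape) than pigeonholes (possible words)
# forces at least one repeated word.
repetition_forced = WORDS_ON_TAPE > possible_words

print(possible_words, repetition_forced)
```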
EXAMPLE 5.42
Let $S \subset \mathbf{Z}^+$, where $|S| = 37$. Then $S$ contains two elements that have the same remainder upon division by 36.
Here the pigeons are the 37 positive integers in $S$. We know from the division algorithm (of Theorem 4.5) that when any positive integer $n$ is divided by 36, there exists a unique quotient $q$ and unique remainder $r$, where
$$n = 36q + r, \qquad 0 \le r < 36.$$
The 36 possible values of $r$ constitute the pigeonholes, and the result is now established by the pigeonhole principle.
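The argument of Example 5.42 translates into a short search: place each integer in the pigeonhole labeled by its remainder and stop at the first pigeonhole that is already occupied. This is a sketch under the assumption that the input holds distinct positive integers; the function name `find_same_remainder` is hypothetical.

```python
def find_same_remainder(values, modulus=36):
    """Return two distinct elements of `values` with the same remainder
    modulo `modulus`, or None if no such pair exists."""
    occupant = {}  # remainder (pigeonhole) -> first value placed in it
    for value in values:
        remainder = value % modulus
        if remainder in occupant:
            return occupant[remainder], value
        occupant[remainder] = value
    return None

# Any 37 distinct positive integers must produce a pair.
print(find_same_remainder(range(100, 137)))
```

With fewer than 37 integers the function may return `None`, which matches the fact that the pigeonhole principle gives no guarantee in that case.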
EXAMPLE 5.43
Prove that if 101 integers are selected from the set $S = \{1, 2, 3, \ldots, 200\}$, then there are two integers such that one divides the other.
For each $x \in S$, we may write $x = 2^k y$, with $k \ge 0$, and $\gcd(2, y) = 1$. (This result follows from the Fundamental Theorem of Arithmetic.) Then $y$ must be odd, so $y \in T = \{1, 3, 5, \ldots, 199\}$, where $|T| = 100$. Since 101 integers are selected from $S$, by the pigeonhole principle there are two distinct integers of the form $a = 2^m y$, $b = 2^n y$ for some (the same) $y \in T$. If $m < n$, then $a \mid b$; otherwise, we have $m > n$ and then $b \mid a$.
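The proof of Example 5.43 is constructive, and the following sketch mirrors it: each selected integer is filed under its odd part $y$, and the first odd part that occurs twice yields a pair in which one integer divides the other. The function names `odd_part` and `find_divisible_pair` are introduced here for illustration.

```python
def odd_part(x):
    # Strip factors of 2: write x = 2**k * y with y odd and return y.
    while x % 2 == 0:
        x //= 2
    return x

def find_divisible_pair(selection):
    """Return (a, b) from the selection with a dividing b, found by
    matching odd parts, or None if all odd parts are distinct."""
    seen = {}  # odd part -> first selected integer having that odd part
    for x in sorted(selection):
        y = odd_part(x)
        if y in seen:
            return seen[y], x  # both are (power of 2) * y, smaller one first
        seen[y] = x
    return None

# The 101 integers 100, 101, ..., 200 taken from {1, 2, ..., 200}.
print(find_divisible_pair(range(100, 201)))
```

The selection $\{101, 102, \ldots, 200\}$ of only 100 integers has all odd parts distinct, so the bound of 101 cannot be lowered.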
EXAMPLE 5.44
Any subset of size 6 from the set $S = \{1, 2, 3, \ldots, 9\}$ must contain two elements whose sum is 10.
Here the pigeons constitute a six-element subset of $\{1, 2, 3, \ldots, 9\}$, and the pigeonholes are the subsets $\{1, 9\}$, $\{2, 8\}$, $\{3, 7\}$, $\{4, 6\}$, $\{5\}$. When the six pigeons go to their respective pigeonholes, they must fill at least one of the two-element subsets whose members sum to 10.
EXAMPLE 5.45
Triangle $ACE$ is equilateral with $AC = 1$. If five points are selected from the interior of the triangle, there are at least two whose distance apart is less than 1/2.
For the triangle in Fig. 5.8, the four smaller triangles are congruent equilateral triangles and $AB = 1/2$. We break up the interior of triangle $ACE$ into the following four regions, which are mutually disjoint in pairs:
$R_1$: the interior of triangle $BCD$ together with the points on the segment $BD$, excluding $B$ and $D$.
$R_2$: the interior of triangle $ABF$.
$R_3$: the interior of triangle $BDF$ together with the points on the segments $BF$ and $DF$, excluding $B$, $D$, and $F$.
$R_4$: the interior of triangle $FDE$.
Now we apply the pigeonhole principle. Five points in the interior of triangle $ACE$ must be such that at least two of them are in one of the four regions $R_i$, $1 \le i \le 4$, where any two points are separated by a distance less than 1/2.
Figure 5.8
EXAMPLE 5.46
Let $S$ be a set of six positive integers whose maximum is at most 14. Show that the sums of the elements in all the nonempty subsets of $S$ cannot all be distinct.
For each nonempty subset $A$ of $S$, the sum of the elements in $A$, denoted $s_A$, satisfies $1 \le s_A \le 9 + 10 + \cdots + 14 = 69$, and there are $2^6 - 1 = 63$ nonempty subsets of $S$. We should like to draw the conclusion from the pigeonhole principle by letting the possible sums, from 1 to 69, be the pigeonholes, with the 63 nonempty subsets of $S$ as the pigeons, but then we have too few pigeons.
So instead of considering all nonempty subsets of $S$, we cut back to those nonempty subsets $A$ of $S$ where $|A| \le 5$. Then for each such subset $A$ it follows that $1 \le s_A \le 10 + 11 + \cdots + 14 = 60$. There are 62 nonempty subsets $A$ of $S$ with $|A| \le 5$, namely, all the subsets of $S$ except for $\emptyset$ and the set $S$ itself. With 62 pigeons (the nonempty subsets $A$ of $S$ where $|A| \le 5$) and 60 pigeonholes (the possible sums $s_A$), it follows by the pigeonhole principle that the elements of at least two of these 62 subsets must yield the same sum.
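For a concrete set $S$, the two subsets promised by Example 5.46 can be found by listing the nonempty subsets of size at most 5 and recording their sums until a sum repeats. This is a sketch only; the function name `find_equal_sum_subsets` and the sample set are assumptions made for the illustration.

```python
from itertools import combinations

def find_equal_sum_subsets(S):
    """Return two distinct nonempty subsets of S (each of size at most
    len(S) - 1) having the same sum, or None if there is no such pair."""
    elements = sorted(S)
    first_with_sum = {}  # subset sum (pigeonhole) -> first subset reaching it
    for size in range(1, len(elements)):
        for subset in combinations(elements, size):
            total = sum(subset)
            if total in first_with_sum:
                return first_with_sum[total], subset
            first_with_sum[total] = subset
    return None

# Six positive integers with maximum 14.
print(find_equal_sum_subsets({9, 10, 11, 12, 13, 14}))
```

The example guarantees that the search succeeds whenever $S$ has six elements and maximum at most 14; for other sets, such as $\{1, 2, 4, 8\}$, all subset sums can be distinct and the function returns `None`.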
EXAMPLE 5.47
Let $m \in \mathbf{Z}^+$ with $m$ odd. Prove that there exists a positive integer $n$ such that $m$ divides $2^n - 1$.
Consider the $m + 1$ positive integers $2^1 - 1, 2^2 - 1, 2^3 - 1, \ldots, 2^m - 1, 2^{m+1} - 1$. By the pigeonhole principle and the division algorithm there exist $s, t \in \mathbf{Z}^+$ with $1 \le s < t \le m + 1$, where $2^s - 1$ and $2^t - 1$ have the same remainder upon division by $m$. Hence $2^s - 1 = q_1 m + r$ and $2^t - 1 = q_2 m + r$, for $q_1, q_2 \in \mathbf{N}$, and $(2^t - 1) - (2^s - 1) = (q_2 m + r) - (q_1 m + r)$, so $2^t - 2^s = (q_2 - q_1)m$. But $2^t - 2^s = 2^s(2^{t-s} - 1)$, and since $m$ is odd, we have $\gcd(2^s, m) = 1$. Hence $m \mid (2^{t-s} - 1)$, and the result follows with $n = t - s$.
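The proof of Example 5.47 can be followed step by step in code: compute the remainders of $2^k - 1$ for $k = 1, 2, \ldots, m + 1$, wait for a repeated remainder at positions $s < t$, and return $n = t - s$. The sketch assumes $m$ is an odd positive integer; the function name `find_exponent` is hypothetical.

```python
def find_exponent(m):
    """For odd m >= 1, return a positive integer n with m dividing 2**n - 1,
    obtained from the first repeated remainder of 2**k - 1 modulo m."""
    if m % 2 == 0:
        raise ValueError("m must be odd")
    first_index = {}  # remainder of 2**k - 1 (pigeonhole) -> first exponent k
    for k in range(1, m + 2):  # the m + 1 pigeons
        remainder = (pow(2, k, m) - 1) % m
        if remainder in first_index:
            return k - first_index[remainder]  # n = t - s
        first_index[remainder] = k
    return None  # not reached: m + 1 remainders cannot all be distinct

n = find_exponent(7)
print(n, (2 ** n - 1) % 7)
```

The three-argument form `pow(2, k, m)` keeps the numbers small, so the search remains practical even when $2^{m+1}$ itself would be enormous.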
EXAMPLE 5.48
While on a four-week vacation, Herbert will play at least one set of tennis each day, but he won't play more than 40 sets total during this time. Prove that no matter how he distributes his sets during the four weeks, there is a span of consecutive days during which he will play exactly 15 sets.
For $1 \le i \le 28$, let $x_i$ be the total number of sets Herbert will play from the start of the vacation to the end of the $i$th day. Then $1 \le x_1 < x_2 < \cdots < x_{28} \le 40$, and $x_1 + 15 < \cdots < x_{28} + 15 \le 55$. We now have the 28 distinct numbers $x_1, x_2, \ldots, x_{28}$ and the 28 distinct numbers $x_1 + 15, x_2 + 15, \ldots, x_{28} + 15$. These 56 numbers can take on only 55 different values, so at least two of them must be equal, and we conclude that there exist $1 \le j < i \le 28$ with $x_i = x_j + 15$. Hence, from the start of day $j + 1$ to the end of day $i$, Herbert will play exactly 15 sets of tennis.
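For a given schedule, the span of days in Example 5.48 can be located with the same running totals $x_i$ used in the proof. The sketch below also records the total 0 before day 1, so that a span starting on the first day is found as well; that extra entry, the function name `find_span`, and the sample schedule are assumptions made for the illustration.

```python
def find_span(daily_sets, target=15):
    """Return (first_day, last_day), 1-based and inclusive, of a run of
    consecutive days whose sets add up to `target`, or None if none exists."""
    day_reached = {0: 0}  # running total -> day by whose end it was reached
    total = 0
    for day, sets in enumerate(daily_sets, start=1):
        total += sets  # total is x_day in the notation of the example
        if total - target in day_reached:
            return day_reached[total - target] + 1, day
        day_reached[total] = day
    return None

# A sample 28-day schedule: at least one set per day, 40 sets in total.
schedule = [2] * 12 + [1] * 16
first_day, last_day = find_span(schedule)
print(first_day, last_day, sum(schedule[first_day - 1:last_day]))
```

The example shows that the function never returns `None` for a 28-day schedule with at least one set per day and at most 40 sets overall.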
Our last example for this section deals with a classic result that was first discovered in 1935 by Paul Erdős and George Szekeres.
EXAMPLE 5.49
Let us start by considering two particular examples:
1) Note how the sequence 6, 5, 8, 3, 7 (of length 5) contains the decreasing subsequence 6, 5, 3 (of length 3).
2) Now note how the sequence 11, 8, 7, 1, 9, 6, 5, 10, 3, 12 (of length 10) contains the increasing subsequence 8, 9, 10, 12 (of length 4).
These two instances demonstrate the general result: For each $n \in \mathbf{Z}^+$, a sequence of $n^2 + 1$ distinct real numbers contains a decreasing or increasing subsequence of length $n + 1$.
To verify this claim let $a_1, a_2, \ldots, a_{n^2+1}$ be a sequence of $n^2 + 1$ distinct real numbers. For $1 \le k \le n^2 + 1$, let
$x_k$ = the maximum length of a decreasing subsequence that ends with $a_k$, and
$y_k$ = the maximum length of an increasing subsequence that ends with $a_k$.
For instance, our second particular example would provide
k   |  1   2   3   4   5   6   7   8   9  10
a_k | 11   8   7   1   9   6   5  10   3  12
x_k |  1   2   3   4   2   4   5   2   6   1
y_k |  1   1   1   1   2   2   2   3   2   4
If, in general, there is no decreasing or increasing subsequence of length $n + 1$, then $1 \le x_k \le n$ and $1 \le y_k \le n$ for all $1 \le k \le n^2 + 1$. Consequently, there are at most $n^2$ distinct ordered pairs $(x_k, y_k)$. But we have $n^2 + 1$ ordered pairs $(x_k, y_k)$, since $1 \le k \le n^2 + 1$. So the pigeonhole principle implies that there are two identical ordered pairs $(x_i, y_i)$, $(x_j, y_j)$, where $i \ne j$, say $i < j$. Now the real numbers $a_1, a_2, \ldots, a_{n^2+1}$ are distinct, so if $a_i < a_j$ then $y_i < y_j$, while if $a_j < a_i$ then $x_j > x_i$. In either case we no longer have $(x_i, y_i) = (x_j, y_j)$. This contradiction tells us that $x_k = n + 1$ or $y_k = n + 1$ for some $n + 1 \le k \le n^2 + 1$; the result then follows.
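The quantities $x_k$ and $y_k$ from Example 5.49 can be computed for any sequence of distinct numbers. The sketch below uses a straightforward quadratic-time scan over the earlier terms; the function name `subsequence_lengths` is introduced here for illustration.

```python
def subsequence_lengths(sequence):
    """For each position k, compute x_k (longest decreasing subsequence
    ending at that term) and y_k (longest increasing subsequence ending
    at that term). Returns the two lists (x, y)."""
    x, y = [], []
    for k, a_k in enumerate(sequence):
        # Extend the best decreasing run ending at an earlier, larger term.
        x.append(1 + max((x[i] for i in range(k) if sequence[i] > a_k), default=0))
        # Extend the best increasing run ending at an earlier, smaller term.
        y.append(1 + max((y[i] for i in range(k) if sequence[i] < a_k), default=0))
    return x, y

x, y = subsequence_lengths([11, 8, 7, 1, 9, 6, 5, 10, 3, 12])
print(x)
print(y)
print(len(set(zip(x, y))) == len(x))  # the pairs (x_k, y_k) are all distinct
```

The last line reflects the key step of the argument: for distinct terms no two positions can share the same pair $(x_k, y_k)$, so with $n^2 + 1$ terms some $x_k$ or $y_k$ must exceed $n$.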
For an interesting application of this result, consider $n^2 + 1$ sumo wrestlers facing forward and standing shoulder to shoulder. (Here no two wrestlers have the same weight.) We can select $n + 1$ of these wrestlers to take one step forward so that, as they are scanned from left to right, their successive weights either decrease or increase.
EXERCISES 5.5
1. In Example 5.40, what plays the roles of the pigeons and of the pigeonholes?
2. Show that if eight people are in a room, at least two of them have birthdays that occur on the same day of the week.
3. An auditorium has a seating capacity of 800. How many seats must be occupied to guarantee that at least two people seated in the auditorium have the same first and last initials?
4. Let $S = \{3, 7, 11, 15, 19, \ldots, 95, 99, 103\}$. How many elements must we select from $S$ to ensure that there will be at least two whose sum is 110?
5. a) Prove that if 151 integers are selected from $\{1, 2, 3, \ldots, 300\}$, then the selection must include two integers $x$, $y$ where $x \mid y$ or $y \mid x$.
b) Write a statement that generalizes the results of part (a) and Example 5.43.
6. Prove that if we select 101 integers from the set $S = \{1, 2, 3, \ldots, 200\}$, there exist $m$, $n$ in the selection where $\gcd(m, n) = 1$.
7. a) Show that if any 14 integers are selected from the set $S = \{1, 2, 3, \ldots, 25\}$, there are at least two whose sum is 26.
b) Write a statement that generalizes the results of part (a) and Example 5.44.
8. a) If $S \subseteq \mathbf{Z}^+$ and $|S| \ge 3$, prove that there exist distinct $x, y \in S$ where $x + y$ is even.
b) Let $S \subseteq \mathbf{Z}^+ \times \mathbf{Z}^+$. Find the minimal value of $|S|$ that guarantees the existence of distinct ordered pairs $(x_1, x_2), (y_1, y_2) \in S$ such that $x_1 + y_1$ and $x_2 + y_2$ are both even.
c) Extending the ideas in parts (a) and (b), consider $S \subseteq \mathbf{Z}^+ \times \mathbf{Z}^+ \times \mathbf{Z}^+$. What size must $|S|$ be to guarantee the existence of distinct ordered triples $(x_1, x_2, x_3), (y_1, y_2, y_3) \in S$ where $x_1 + y_1$, $x_2 + y_2$, and $x_3 + y_3$ are all even?
d) Generalize the results of parts (a), (b), and (c).
e) A point $P(x, y)$ in the Cartesian plane is called a lattice point if $x, y \in \mathbf{Z}$. Given distinct lattice points $P_1(x_1, y_1), P_2(x_2, y_2), \ldots, P_n(x_n, y_n)$, determine the smallest value of $n$ that guarantees the existence of $P_i(x_i, y_i)$, $P_j(x_j, y_j)$, $1 \le i < j \le n$, such that the midpoint of the line segment connecting $P_i(x_i, y_i)$ and $P_j(x_j, y_j)$ is also a lattice point.
9. a) If 11 integers are selected from $\{1, 2, 3, \ldots, 100\}$, prove that there are at least two, say $x$ and $y$, such that $0 < |\sqrt{x} - \sqrt{y}| < 1$.
b) Write a statement that generalizes the result of part (a).
10. Let triangle $ABC$ be equilateral, with $AB = 1$. Show that if we select 10 points in the interior of this triangle, there must be at least two whose distance apart is less than 1/3.
11. Let $ABCD$ be a square with $AB = 1$. Show that if we select five points in the interior of this square, there are at least two whose distance apart is less than $1/\sqrt{2}$.
12. Let $A \subseteq \{1, 2, 3, \ldots, 25\}$ where $|A| = 9$. For any subset $B$ of $A$ let $s_B$ denote the sum of the elements in $B$. Prove that there are distinct subsets $C$, $D$ of $A$ such that $|C| = |D| = 5$ and $s_C = s_D$.
13. Let $S$ be a set of five positive integers the maximum of which is at most 9. Prove that the sums of the elements in all the nonempty subsets of $S$ cannot all be distinct.
14. During the first six weeks of his senior year in college, Brace sends out at least one resumé each day but no more than 60 resumés in total. Show that there is a period of consecutive days during which he sends out exactly 23 resumés.
15. Let $S \subset \mathbf{Z}^+$ with $|S| = 7$. For $\emptyset \ne A \subseteq S$, let $s_A$ denote the sum of the elements in $A$. If $m$ is the maximum element in $S$, find the possible values of $m$ so that there will exist distinct subsets $B$, $C$ of $S$ with $s_B = s_C$.
16. Let $k \in \mathbf{Z}^+$. Prove that there exists a positive integer $n$ such that $k \mid n$ and the only digits in $n$ are 0's and 3's.
17. a) Find a sequence of four distinct real numbers with no decreasing or increasing subsequence of length 3.
b) Find a sequence of nine distinct real numbers with no decreasing or increasing subsequence of length 4.
c) Generalize the results in parts (a) and (b).
d) What do the preceding parts of this exercise tell us about Example 5.49?
18. The 50 members of Nardine's aerobics class line up to get their equipment. Assuming that no two of these people have the same height, show that eight of them (as the line is equipped from first to last) have successive heights that either decrease or increase.
19. For $k, n \in \mathbf{Z}^+$, prove that if $kn + 1$ pigeons occupy $n$ pigeonholes, then at least one pigeonhole has $k + 1$ or more pigeons roosting in it.
20. How many times must we roll a single die in order to get the same score (a) at least twice? (b) at least three times? (c) at least $n$ times, for $n \ge 4$?
21. a) Let $S \subset \mathbf{Z}^+$. What is the smallest value for $|S|$ that guarantees the existence of two elements $x, y \in S$ where $x$ and $y$ have the same remainder upon division by 1000?
b) What is the smallest value of $n$ such that whenever $S \subseteq \mathbf{Z}^+$ and $|S| = n$, then there exist three elements $x, y, z \in S$ where all three have the same remainder upon division by 1000?
c) Write a statement that generalizes the results of parts (a) and (b) and Example 5.42.
22. For $m, n \in \mathbf{Z}^+$, prove that if $m$ pigeons occupy $n$ pigeonholes, then at least one pigeonhole has $\lfloor (m - 1)/n \rfloor + 1$ or more pigeons roosting in it.
23. Let $p_1, p_2, \ldots, p_n \in \mathbf{Z}^+$. Prove that if $p_1 + p_2 + \cdots + p_n - n + 1$ pigeons occupy $n$ pigeonholes, then either the first pigeonhole has $p_1$ or more pigeons roosting in it, or the second pigeonhole has $p_2$ or more pigeons roosting in it, $\ldots$, or the $n$th pigeonhole has $p_n$ or more pigeons roosting in it.
24. Given 8 Perl books, 17 Visual BASIC books, 6 Java books, 12 SQL books, and 20 C++ books, how many of these books must we select to ensure that we have 10 books dealing with the same computer language?
5.6
Function Composition
and Inverse Functions
When computing with the elements of Z, we find that the (closed binary) operation of
addition provides a method for combining two integers, say a and b, into a third integer,
namely a + b. Furthermore, for each integer c there is a second integer d where c + d =
d + c = 0, and we call d the additive inverse of c. (It is also true that c is the additive inverse
of d.)
Turning to the elements of R and the (closed binary) operation of multiplication, we
have a method for combining any r, s ∈ R into their product rs. And here, for each t ∈ R,
if t ≠ 0, then there is a real number u such that ut = tu = 1. The real number u is called
the multiplicative inverse of t. (The real number t is also the multiplicative inverse of u.)
In this section we first study a method for combining two functions into a single function.
Then we develop the concept of the inverse (of a function) for functions with certain
properties. To accomplish these objectives, we need the following preliminary ideas.
†Visual BASIC is a trademark of the Microsoft Corporation.
5.6 Function Composition and Inverse Functions 279
Having examined functions that are one-to-one and those that are onto, we turn now to
functions with both of these properties.
Definition 5.15 If f: A → B, then f is said to be bijective, or to be a one-to-one correspondence, if f is
both one-to-one and onto.
EXAMPLE 5.50 If A = {1, 2, 3, 4} and B = {w, x, y, z}, then f = {(1, w), (2, x), (3, y), (4, z)} is a
one-to-one correspondence from A (on)to B, and g = {(w, 1), (x, 2), (y, 3), (z, 4)} is a
one-to-one correspondence from B (on)to A.
It should be pointed out that whenever the term correspondence was used in Chapter 1
and in Examples 3.11 and 4.12, the adjective one-to-one was implied though never stated.
For any nonempty set A there is always a very simple but important one-to-one corre-
spondence, as seen in the following definition.
Definition 5.16 The function 1_A: A → A, defined by 1_A(a) = a for all a ∈ A, is called the identity function
for A.
Definition 5.17 If f, g: A → B, we say that f and g are equal and write f = g, if f(a) = g(a) for all
a ∈ A.
A common pitfall in dealing with the equality of functions occurs when f and g are
functions with a common domain A and f(a) = g(a) for all a ∈ A. It may not be the case
that f = g. The pitfall results from not paying attention to the codomains of the functions.
EXAMPLE 5.51 Let f: Z → Z, g: Z → Q where f(x) = x = g(x), for all x ∈ Z. Then f, g share the
common domain Z, have the same range Z, and act the same on every element of Z. Yet
f ≠ g! Here f is a one-to-one correspondence, whereas g is one-to-one but not onto; so
the codomains do make a difference.
EXAMPLE 5.52 Consider the functions f, g: R → Z defined as follows:
    f(x) = x,          if x ∈ Z
    f(x) = ⌊x⌋ + 1,    if x ∈ R − Z
    g(x) = ⌈x⌉,        for all x ∈ R
If x ∈ Z, then f(x) = x = ⌈x⌉ = g(x).
For x ∈ R − Z, write x = n + r where n ∈ Z and 0 < r < 1. (For example, if x = 2.3,
we write 2.3 = 2 + 0.3, with n = 2 and r = 0.3; for x = −7.3 we have −7.3 = −8 + 0.7,
with n = −8 and r = 0.7.) Then
    f(x) = ⌊x⌋ + 1 = n + 1 = ⌈x⌉ = g(x).
Consequently, even though the functions f, g are defined by different formulas, we
realize that they are the same function, because they have the same domain and codomain
and f(x) = g(x) for all x in the domain R.
Now that we have dispensed with the necessary preliminaries, it is time to examine an
operation for combining two appropriate functions.
Definition 5.18 If f: A → B and g: B → C, we define the composite function, which is denoted
g ∘ f: A → C, by (g ∘ f)(a) = g(f(a)), for each a ∈ A.
EXAMPLE 5.53 Let A = {1, 2, 3, 4}, B = {a, b, c}, and C = {w, x, y, z} with f: A → B and g: B → C
given by f = {(1, a), (2, a), (3, b), (4, c)} and g = {(a, x), (b, y), (c, z)}. For each ele-
ment of A we find:
    (g ∘ f)(1) = g(f(1)) = g(a) = x        (g ∘ f)(3) = g(f(3)) = g(b) = y
    (g ∘ f)(2) = g(f(2)) = g(a) = x        (g ∘ f)(4) = g(f(4)) = g(c) = z
So
    g ∘ f = {(1, x), (2, x), (3, y), (4, z)}.
Note: The composition f ∘ g is not defined.
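The computation in Example 5.53 can be sketched in code. This is a minimal illustration, assuming a finite function is stored as a Python dictionary; the helper name compose is hypothetical and not part of the text.

```python
def compose(g, f):
    """Return g o f as a dict; every value of f must be a key of g."""
    return {a: g[f[a]] for a in f}

# The functions of Example 5.53, stored as dictionaries.
f = {1: "a", 2: "a", 3: "b", 4: "c"}   # f: A -> B
g = {"a": "x", "b": "y", "c": "z"}     # g: B -> C

g_after_f = compose(g, f)
print(g_after_f)
```

Calling compose(f, g) raises a KeyError, because the values of g (elements of C) are not in the domain of f. This mirrors the note that f ∘ g is not defined.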
EXAMPLE 5.54 Let f: R → R, g: R → R be defined by f(x) = x², g(x) = x + 5. Then
    (g ∘ f)(x) = g(f(x)) = g(x²) = x² + 5,
whereas
    (f ∘ g)(x) = f(g(x)) = f(x + 5) = (x + 5)² = x² + 10x + 25.
Here g ∘ f: R → R and f ∘ g: R → R, but (g ∘ f)(1) = 6 ≠ 36 = (f ∘ g)(1), so even
though both composites f ∘ g and g ∘ f can be formed, we do not have f ∘ g = g ∘ f.
Consequently, the composition of functions is not, in general, a commutative operation.
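A short numerical check of Example 5.54 follows. It is only a sketch: the helper compose is a hypothetical name, and the real-valued functions are modeled as Python callables.

```python
def compose(g, f):
    """Return the function x -> g(f(x))."""
    return lambda x: g(f(x))

f = lambda x: x ** 2      # f(x) = x^2
g = lambda x: x + 5       # g(x) = x + 5

g_of_f = compose(g, f)    # x -> x^2 + 5
f_of_g = compose(f, g)    # x -> (x + 5)^2

print(g_of_f(1), f_of_g(1))
```

A single input where the two composites disagree is enough to show that g ∘ f and f ∘ g are different functions.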
The definition and examples for composite functions required that the codomain of f =
domain of g. If range of f ⊆ domain of g, this will actually be enough to yield the composite
function g ∘ f: A → C. Also, for any f: A → B, we observe that f ∘ 1_A = f = 1_B ∘ f.
An important recurring idea in mathematics is the investigation of whether combining
two entities with a common property yields a result with this property. For example, if A
and B are finite sets, then A ∩ B and A ∪ B are also finite. However, for infinite sets A and
B, we have A ∪ B infinite but A ∩ B could be finite.
For the composition of functions we have the following result.
THEOREM 5.5 Let f: A → B and g: B → C.
a) If f and g are one-to-one, then g ∘ f is one-to-one.
b) If f and g are onto, then g ∘ f is onto.
Proof:
a) To prove that g ∘ f: A → C is one-to-one, let a₁, a₂ ∈ A with (g ∘ f)(a₁) =
(g ∘ f)(a₂). Then (g ∘ f)(a₁) = (g ∘ f)(a₂) ⇒ g(f(a₁)) = g(f(a₂)) ⇒ f(a₁) =
f(a₂), because g is one-to-one. Also, f(a₁) = f(a₂) ⇒ a₁ = a₂, because f is one-
to-one. Consequently, g ∘ f is one-to-one.
b) For g ∘ f: A → C, let z ∈ C. Since g is onto, there exists y ∈ B with g(y) = z. With
f onto and y ∈ B, there exists x ∈ A with f(x) = y. Hence z = g(y) = g(f(x)) =
(g ∘ f)(x), so the range of g ∘ f = C = the codomain of g ∘ f, and g ∘ f is onto.
Although function composition is not commutative, if f: A → B, g: B → C, and h:
C → D, what can we say about the functions (h ∘ g) ∘ f and h ∘ (g ∘ f)? Specifically, is
(h ∘ g) ∘ f = h ∘ (g ∘ f)? That is, is function composition associative?
Before considering the general result, let us first investigate a particular example.
EXAMPLE 5.55 Let f, g, h: R → R, where f(x) = x², g(x) = x + 5, and h(x) = √(x² + 2).
Then ((h ∘ g) ∘ f)(x) = (h ∘ g)(f(x)) = (h ∘ g)(x²) = h(g(x²)) = h(x² + 5) =
√((x² + 5)² + 2) = √(x⁴ + 10x² + 27).
On the other hand, we see that (h ∘ (g ∘ f))(x) = h((g ∘ f)(x)) = h(g(f(x))) =
h(g(x²)) = h(x² + 5) = √((x² + 5)² + 2) = √(x⁴ + 10x² + 27), as above.
So in this particular example, (h ∘ g) ∘ f and h ∘ (g ∘ f) are two functions with the
same domain and codomain, and for all x ∈ R, ((h ∘ g) ∘ f)(x) = √(x⁴ + 10x² + 27) =
(h ∘ (g ∘ f))(x). Consequently, (h ∘ g) ∘ f = h ∘ (g ∘ f).
We now find that the result in Example 5.55 is true in general.
THEOREM 5.6 If f: A → B, g: B → C, and h: C → D, then (h ∘ g) ∘ f = h ∘ (g ∘ f).
Proof: Since the two functions have the same domain, A, and codomain, D, the result
will follow by showing that for every x ∈ A, ((h ∘ g) ∘ f)(x) = (h ∘ (g ∘ f))(x). (See the
diagram shown in Fig. 5.9.)
[Figure 5.9: diagram of the composites (h ∘ g) ∘ f and h ∘ (g ∘ f)]
Using the definition of the composite function we know that for each x € A it takes two
steps to determine (g o f)(x). First we find f(x), the image of x under f. This is an element
of B. Then we apply the function g to the element f(x) to determine g(f(x)), the image
of f(x) under g. This results in an element of C. At this point we apply the function h
to the element g(f(x)) to determine h(g(f(x))) = h((g ∘ f)(x)) = (h ∘ (g ∘ f))(x). This
result is an element of D. Similarly, starting once again with x in A, we have f(x) in B,
and now we apply the composite function h ∘ g to f(x). This gives us ((h ∘ g) ∘ f)(x) =
(h ∘ g)(f(x)) = h(g(f(x))).
Since ((h ∘ g) ∘ f)(x) = h(g(f(x))) = (h ∘ (g ∘ f))(x), for each x in A, it now follows
that
    (h ∘ g) ∘ f = h ∘ (g ∘ f).
Consequently, the composition of functions is an associative operation.
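The associativity argument can be spot-checked numerically on the functions of Example 5.55. The sketch below assumes floating-point arithmetic and a hypothetical helper compose; agreement on a few sample points illustrates the theorem but does not prove it.

```python
import math

def compose(g, f):
    """Return the function x -> g(f(x))."""
    return lambda x: g(f(x))

f = lambda x: x ** 2
g = lambda x: x + 5
h = lambda x: math.sqrt(x ** 2 + 2)

left = compose(compose(h, g), f)     # (h o g) o f
right = compose(h, compose(g, f))    # h o (g o f)
closed_form = lambda x: math.sqrt(x ** 4 + 10 * x ** 2 + 27)

samples = [-3.0, -0.5, 0.0, 1.0, 2.5]
agree = all(
    math.isclose(left(x), right(x)) and math.isclose(left(x), closed_form(x))
    for x in samples
)
print(agree)
```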
By virtue of the associative property for function composition, we can write h ∘ g ∘ f,
(h ∘ g) ∘ f, or h ∘ (g ∘ f) without any problem of ambiguity. In addition, this property
enables us to define powers of functions, where appropriate.
Definition 5.19 If f: A → A we define f¹ = f, and for n ∈ Z⁺, f^(n+1) = f ∘ (fⁿ).
This definition is another example wherein the result is defined recursively. With f^(n+1) =
f ∘ (fⁿ), we see the dependence of f^(n+1) on a previous power, namely, fⁿ.
EXAMPLE 5.56 With A = {1, 2, 3, 4} and f: A → A defined by f = {(1, 2), (2, 2), (3, 1), (4, 3)},
we have f² = f ∘ f = {(1, 2), (2, 2), (3, 2), (4, 1)} and f³ = f ∘ f² = f ∘ f ∘ f = {(1, 2),
(2, 2), (3, 2), (4, 2)}. (What are f⁴, f⁵?)
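The recursive definition of powers translates directly into a loop. The following is a sketch under the assumption that f: A → A is stored as a dictionary; compose and power are hypothetical helper names.

```python
def compose(g, f):
    """Return g o f for functions stored as dicts."""
    return {a: g[f[a]] for a in f}

def power(f, n):
    """Return f^n for n >= 1, using f^1 = f and f^(n+1) = f o (f^n)."""
    result = dict(f)
    for _ in range(n - 1):
        result = compose(f, result)
    return result

f = {1: 2, 2: 2, 3: 1, 4: 3}   # the function of Example 5.56
f2 = power(f, 2)
f3 = power(f, 3)
print(f2)
print(f3)
```

Calling power(f, 4) and power(f, 5) answers the question posed at the end of Example 5.56.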
We now come to the last new idea for this section: the existence of the invertible function
and some of its properties.
Definition 5.20 For sets A, B, if R is a relation from A to B, then the converse of R, denoted R^c, is the
relation from B to A defined by R^c = {(b, a) | (a, b) ∈ R}.
To get R^c from R, we simply interchange the components of each ordered pair in
R. So if A = {1, 2, 3, 4}, B = {w, x, y}, and R = {(1, w), (2, w), (3, x)}, then R^c =
{(w, 1), (w, 2), (x, 3)}, a relation from B to A.
Since a function is a relation we can also form the converse of a function. For the
same preceding sets A, B, let f: A → B where f = {(1, w), (2, x), (3, y), (4, x)}. Then
f^c = {(w, 1), (x, 2), (y, 3), (x, 4)}, a relation, but not a function, from B to A. We wish to
investigate when the converse of a function yields a function, but before getting too abstract
let us consider the following example.
EXAMPLE 5.57 For A = {1, 2, 3} and B = {w, x, y}, let f: A → B be given by f = {(1, w), (2, x),
(3, y)}. Then f^c = {(w, 1), (x, 2), (y, 3)} is a function from B to A, and we observe that
f^c ∘ f = 1_A and f ∘ f^c = 1_B.
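The two situations just described (a converse that is not a function, and one that is) can be checked mechanically. This sketch assumes relations are stored as sets of ordered pairs; converse and is_function are hypothetical helper names.

```python
def converse(relation):
    """Interchange the components of each ordered pair."""
    return {(b, a) for (a, b) in relation}

def is_function(relation, domain):
    """True when every element of domain appears exactly once as a first component."""
    firsts = [a for (a, _) in relation]
    return set(firsts) == set(domain) and len(firsts) == len(set(firsts))

B = {"w", "x", "y"}

# f from the discussion before Example 5.57: x is the image of both 2 and 4.
f = {(1, "w"), (2, "x"), (3, "y"), (4, "x")}
# f from Example 5.57: a one-to-one correspondence between {1, 2, 3} and B.
f57 = {(1, "w"), (2, "x"), (3, "y")}

print(is_function(converse(f), B))
print(is_function(converse(f57), B))
```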
This finite example leads us to the following definition.
Definition 5.21 If f: A → B, then f is said to be invertible if there is a function g: B → A such that
g ∘ f = 1_A and f ∘ g = 1_B.
Note that the function g in Definition 5.21 is also invertible.
EXAMPLE 5.58 Let f, g: R → R be defined by f(x) = 2x + 5, g(x) = (1/2)(x − 5). Then (g ∘ f)(x) =
g(f(x)) = g(2x + 5) = (1/2)[(2x + 5) − 5] = x, and (f ∘ g)(x) = f(g(x)) =
f((1/2)(x − 5)) = 2[(1/2)(x − 5)] + 5 = x, so f ∘ g = 1_R and g ∘ f = 1_R.
Consequently, f and g are both invertible functions.
Having seen some examples of invertible functions, we now wish to show that the
function g of Definition 5.21 is unique. Then we shall find the means to identify an invertible
function.
THEOREM 5.7 If a function f: A → B is invertible and a function g: B → A satisfies g ∘ f = 1_A and
f ∘ g = 1_B, then this function g is unique.
Proof: If g is not unique, then there is another function h: B → A with h ∘ f = 1_A and
f ∘ h = 1_B. Consequently, h = h ∘ 1_B = h ∘ (f ∘ g) = (h ∘ f) ∘ g = 1_A ∘ g = g.
As a result of this theorem we shall call the function g the inverse of f and shall adopt
the notation g = f⁻¹. Theorem 5.7 also implies that f⁻¹ = f^c.
We also see that whenever f is an invertible function, so is the function f⁻¹, and
(f⁻¹)⁻¹ = f, again by the uniqueness in Theorem 5.7. But we still do not know what
conditions on f insure that f is invertible.
Before stating our next theorem we note that the invertible functions of Examples 5.57
and 5.58 are all bijective. Consequently, these examples provide some motivation for the
following result.
THEOREM 5.8 A function f: A → B is invertible if and only if it is one-to-one and onto.
Proof: Assuming that f: A → B is invertible, we have a unique function g: B → A with
g ∘ f = 1_A, f ∘ g = 1_B. If a₁, a₂ ∈ A with f(a₁) = f(a₂), then g(f(a₁)) = g(f(a₂)), or
(g ∘ f)(a₁) = (g ∘ f)(a₂). With g ∘ f = 1_A it follows that a₁ = a₂, so f is one-to-one. For
the onto property, let b ∈ B. Then g(b) ∈ A, so we can talk about f(g(b)). Since f ∘ g = 1_B,
we have b = 1_B(b) = (f ∘ g)(b) = f(g(b)), so f is onto.
Conversely, suppose f: A → B is bijective. Since f is onto, for each b ∈ B there is an
a ∈ A with f(a) = b. Consequently, we define the function g: B → A by g(b) = a, where
f(a) = b. This definition yields a unique function. The only problem that could arise is if
g(b) = a₁ ≠ a₂ = g(b) because f(a₁) = b = f(a₂). However, this situation cannot arise
because f is one-to-one. Our definition of g is such that g ∘ f = 1_A and f ∘ g = 1_B, so we
find that f is invertible, with g = f⁻¹.
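The construction of g in the second half of the proof can be sketched for finite functions. The code assumes f is stored as a dictionary and the codomain as a set; inverse and compose are hypothetical names, and the two error cases correspond to the failure of one-to-one and of onto.

```python
def compose(g, f):
    """Return g o f for functions stored as dicts."""
    return {a: g[f[a]] for a in f}

def inverse(f, codomain):
    """Build g: B -> A with g(b) = a where f(a) = b, or fail if f is not bijective."""
    g = {}
    for a, b in f.items():
        if b in g:
            raise ValueError("f is not one-to-one")
        g[b] = a
    if set(g) != set(codomain):
        raise ValueError("f is not onto")
    return g

A = {1, 2, 3}
B = {"w", "x", "y"}
f = {1: "w", 2: "x", 3: "y"}          # the function of Example 5.57

g = inverse(f, B)
identity_A = {a: a for a in A}
identity_B = {b: b for b in B}
print(compose(g, f) == identity_A, compose(f, g) == identity_B)
```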
EXAMPLE 5.59 From Theorem 5.8 it follows that the function f₁: R → R defined by f₁(x) = x² is not
invertible (it is neither one-to-one nor onto), but f₂: [0, +∞) → [0, +∞) defined by
f₂(x) = x² is invertible with f₂⁻¹(x) = √x.
The next result combines the ideas of function composition and inverse functions. The
proof is left to the reader.
THEOREM 5.9 If f: A → B, g: B → C are invertible functions, then g ∘ f: A → C is invertible and
(g ∘ f)⁻¹ = f⁻¹ ∘ g⁻¹.
Having seen some examples of functions and their inverses, one might wonder whether
there is an algebraic method to determine the inverse of an invertible function. If the func-
tion is finite, we simply interchange the components of the given ordered pairs. But what if
the function is defined by a formula, as in Example 5.59? Fortunately, the algebraic manip-
ulations prove to be little more than a careful analysis of “interchanging the components of
the ordered pairs.” This is demonstrated in the following examples.
EXAMPLE 5.60 For m, b ∈ R, m ≠ 0, the function f: R → R defined by f = {(x, y) | y = mx + b} is an
invertible function, because it is one-to-one and onto.
To get f⁻¹ we note that
    f⁻¹ = {(x, y) | y = mx + b}^c = {(y, x) | y = mx + b}
        = {(x, y) | x = my + b} = {(x, y) | y = (1/m)(x − b)}.
The third set is where we rename the variables (replacing x by y and y by x) in order to
change the components of the ordered pairs of f.
So f: R → R is defined by f(x) = mx + b, and f⁻¹: R → R is defined by f⁻¹(x) =
(1/m)(x − b).
EXAMPLE 5.61 Let f: R → R⁺ be defined by f(x) = e^x, where e ≈ 2.7183, the base for the natural
logarithm. From the graph in Fig. 5.10 we see that f is one-to-one and onto, so f⁻¹:
R⁺ → R does exist and f⁻¹ = {(x, y) | y = e^x}^c = {(x, y) | x = e^y} = {(x, y) | y = ln x}, so
f⁻¹(x) = ln x.
[Figure 5.10: graphs of f(x) = e^x and f⁻¹(x) = ln x]
We should note that what happens in Fig. 5.10 happens in general. That is, the graphs
of f and f⁻¹ are symmetric about the line y = x. For example, the line segment connect-
ing the points (1, e) and (e, 1) would be bisected by the line y = x. This is true for any
corresponding pair of points (x, f(x)) and (f(x), f⁻¹(f(x))).
This example also yields the following formulas:
    x = 1_R(x) = (f⁻¹ ∘ f)(x) = ln(e^x), for all x ∈ R.
    x = 1_{R⁺}(x) = (f ∘ f⁻¹)(x) = e^(ln x), for all x > 0.
Even when a function f: A → B is not invertible, we find use for the symbol f⁻¹ in the
following sense.
Definition 5.22 If f: A → B and B₁ ⊆ B, then f⁻¹(B₁) = {x ∈ A | f(x) ∈ B₁}. The set f⁻¹(B₁) is called
the preimage of B₁ under f.
Be careful! We are now using the symbol f⁻¹ in two different ways. Although we have
the concept of a preimage for any function, not every function has an inverse function.
Consequently, we cannot assume the existence of an inverse for a function f just because
we find the symbol f⁻¹ being used. A little caution is needed here.
EXAMPLE 5.62 Let A = {1, 2, 3, 4, 5, 6} and B = {6, 7, 8, 9, 10}. If f: A → B with f = {(1, 7), (2, 7),
(3, 8), (4, 6), (5, 9), (6, 9)}, then the following results are obtained.
a) For B₁ = {6, 8} ⊆ B, we have f⁻¹(B₁) = {3, 4}, since f(3) = 8 and f(4) = 6, and
for any a ∈ A, f(a) ∉ B₁ unless a = 3 or a = 4. Here we also note that |f⁻¹(B₁)| =
2 = |B₁|.
b) In the case of B₂ = {7, 8} ⊆ B, since f(1) = f(2) = 7 and f(3) = 8, we find that the
preimage of B₂ under f is {1, 2, 3}. And here |f⁻¹(B₂)| = 3 > 2 = |B₂|.
c) Now consider the subset B₃ = {8, 9} of B. For this case it follows that f⁻¹(B₃) =
{3, 5, 6} because f(3) = 8 and f(5) = f(6) = 9. Also we find that |f⁻¹(B₃)| = 3 >
2 = |B₃|.
d) If B₄ = {8, 9, 10} ⊆ B, then with f(3) = 8 and f(5) = f(6) = 9, we have f⁻¹(B₄) =
{3, 5, 6}. So f⁻¹(B₄) = f⁻¹(B₃) even though B₄ ⊃ B₃. This result follows because
there is no element a in the domain A where f(a) = 10; that is, f⁻¹({10}) = ∅.
e) Finally, when B₅ = {8, 10} we find that f⁻¹(B₅) = {3} since f(3) = 8 and, as in
part (d), f⁻¹({10}) = ∅. In this case |f⁻¹(B₅)| = 1 < 2 = |B₅|.
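The preimages in Example 5.62 follow directly from Definition 5.22. Below is a minimal sketch, assuming f is stored as a dictionary; the helper name preimage is hypothetical.

```python
def preimage(f, subset):
    """Return {a in A | f(a) in subset} for a function stored as a dict."""
    return {a for a, b in f.items() if b in subset}

# The function of Example 5.62.
f = {1: 7, 2: 7, 3: 8, 4: 6, 5: 9, 6: 9}

for name, subset in [("B1", {6, 8}), ("B2", {7, 8}), ("B3", {8, 9}),
                     ("B4", {8, 9, 10}), ("B5", {8, 10})]:
    print(name, sorted(preimage(f, subset)))
```

Note that preimage works for any function, invertible or not, which is the point of the caution given before the example.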
Whenever f: A → B, then for each b ∈ B we shall write f⁻¹(b) instead of f⁻¹({b}).
For the function in Example 5.62, we find that
    f⁻¹(6) = {4}   f⁻¹(7) = {1, 2}   f⁻¹(8) = {3}   f⁻¹(9) = {5, 6}   f⁻¹(10) = ∅.
EXAMPLE 5.63 Let f: R → R be defined by
    f(x) = 3x − 5,     x > 0
    f(x) = −3x + 1,    x ≤ 0.
a) Determine f(0), f(1), f(−1), f(5/3), and f(−5/3).
b) Find f⁻¹(0), f⁻¹(1), f⁻¹(−1), f⁻¹(3), f⁻¹(−3), and f⁻¹(−6).
c) What are f⁻¹([−5, 5]) and f⁻¹([−6, 5])?
a) f(0) = −3(0) + 1 = 1          f(5/3) = 3(5/3) − 5 = 0
   f(1) = 3(1) − 5 = −2          f(−5/3) = −3(−5/3) + 1 = 6
   f(−1) = −3(−1) + 1 = 4
b) f⁻¹(0) = {x ∈ R | f(x) ∈ {0}} = {x ∈ R | f(x) = 0}
   = {x ∈ R | x > 0 and 3x − 5 = 0} ∪ {x ∈ R | x ≤ 0 and −3x + 1 = 0}
   = {x ∈ R | x > 0 and x = 5/3} ∪ {x ∈ R | x ≤ 0 and x = 1/3}
   = {5/3} ∪ ∅ = {5/3}
[Note how the horizontal line y = 0 (that is, the x-axis) intersects the graph in
Fig. 5.11 only at the point (5/3, 0).]
[Figure 5.11: graph of f, with labeled points including (3, 4) and (10/3, 5)]
f⁻¹(1) = {x ∈ R | f(x) ∈ {1}} = {x ∈ R | f(x) = 1}
   = {x ∈ R | x > 0 and 3x − 5 = 1} ∪ {x ∈ R | x ≤ 0 and −3x + 1 = 1}
   = {x ∈ R | x > 0 and x = 2} ∪ {x ∈ R | x ≤ 0 and x = 0}
   = {2} ∪ {0} = {0, 2}
[Here we note how the dashed line y = 1 intersects the graph in Fig. 5.11 at the
points (0, 1) and (2, 1).]
f⁻¹(−1) = {x ∈ R | x > 0 and 3x − 5 = −1} ∪ {x ∈ R | x ≤ 0 and −3x + 1 = −1}
   = {x ∈ R | x > 0 and x = 4/3} ∪ {x ∈ R | x ≤ 0 and x = 2/3}
   = {4/3} ∪ ∅ = {4/3}
f⁻¹(3) = {−2/3, 8/3}          f⁻¹(−3) = {2/3}
f⁻¹(−6) = {x ∈ R | x > 0 and 3x − 5 = −6} ∪ {x ∈ R | x ≤ 0 and −3x + 1 = −6}
   = {x ∈ R | x > 0 and x = −1/3} ∪ {x ∈ R | x ≤ 0 and x = 7/3}
   = ∅ ∪ ∅ = ∅
c) f⁻¹([−5, 5]) = {x | f(x) ∈ [−5, 5]} = {x | −5 ≤ f(x) ≤ 5}.
(Case 1) x > 0:   −5 ≤ 3x − 5 ≤ 5
                   0 ≤ 3x ≤ 10
                   0 ≤ x ≤ 10/3, so we use 0 < x ≤ 10/3.
(Case 2) x ≤ 0:   −5 ≤ −3x + 1 ≤ 5
                   −6 ≤ −3x ≤ 4
                   2 ≥ x ≥ −4/3; here we use −4/3 ≤ x ≤ 0.
Hence f⁻¹([−5, 5]) = {x | −4/3 ≤ x ≤ 0 or 0 < x ≤ 10/3} = [−4/3, 10/3].
Since there are no points (x, y) on the graph (in Fig. 5.11) where y < −5, it follows
from our prior calculations that f⁻¹([−6, 5]) = f⁻¹([−5, 5]) = [−4/3, 10/3].
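The point preimages in part (b) can be recomputed by solving each branch of f and keeping only the solutions that f really sends to the target value. The sketch below uses exact rational arithmetic; the helper name point_preimage is hypothetical.

```python
from fractions import Fraction

def f(x):
    """The piecewise function of Example 5.63."""
    return 3 * x - 5 if x > 0 else -3 * x + 1

def point_preimage(y):
    """Return the set of x with f(x) = y."""
    y = Fraction(y)
    # Solve 3x - 5 = y and -3x + 1 = y, then discard any candidate
    # that lies on the wrong side of 0 for its branch.
    candidates = [(y + 5) / 3, (1 - y) / 3]
    return {x for x in candidates if f(x) == y}

for y in [0, 1, -1, 3, -3, -6]:
    print(y, sorted(point_preimage(y)))
```

The filter f(x) == y plays the role of the conditions x > 0 and x ≤ 0 in the hand computation.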
EXAMPLE 5.64 a) Let f: Z → R be defined by f(x) = x² + 5. Table 5.9 lists f⁻¹(B) for various subsets
B of the codomain R.
b) If g: R → R is defined by g(x) = x² + 5, the results in Table 5.10 show how a change
in domain (from Z to R) affects the preimages (in Table 5.9).
Table 5.9                              Table 5.10
B            f⁻¹(B)                    B            g⁻¹(B)
[6, 7]       {−1, 1}                   [6, 7]       [−√2, −1] ∪ [1, √2]
[6, 10]      {−2, −1, 1, 2}            [6, 10]      [−√5, −1] ∪ [1, √5]
[−4, 5)      ∅                         [−4, 5)      ∅
[−4, 5]      {0}                       [−4, 5]      {0}
[5, +∞)      Z                         [5, +∞)      R
The concept of a preimage appears in conjunction with the set operations of intersec-
tion, union, and complementation in our next result. The reader should note the difference
between part (a) of this theorem and part (b) of Theorem 5.2.
THEOREM 5.10 If f: A → B and B₁, B₂ ⊆ B, then (a) f⁻¹(B₁ ∩ B₂) = f⁻¹(B₁) ∩ f⁻¹(B₂);
(b) f⁻¹(B₁ ∪ B₂) = f⁻¹(B₁) ∪ f⁻¹(B₂); and (c) f⁻¹(B − B₁) = A − f⁻¹(B₁), that is, the
preimage of the complement of B₁ is the complement of the preimage of B₁.
Proof: We prove part (b) and leave parts (a) and (c) for the reader.
For a ∈ A, a ∈ f⁻¹(B₁ ∪ B₂) ⟺ f(a) ∈ B₁ ∪ B₂ ⟺ f(a) ∈ B₁ or f(a) ∈ B₂ ⟺ a ∈
f⁻¹(B₁) or a ∈ f⁻¹(B₂) ⟺ a ∈ f⁻¹(B₁) ∪ f⁻¹(B₂).
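All three parts of Theorem 5.10 can be tested exhaustively on a small function, for instance the one from Example 5.62. This is a sketch only (a check on one function is not a proof); preimage, subsets, and theorem_5_10_holds are hypothetical helper names.

```python
from itertools import combinations

def preimage(f, subset):
    """Return {a in A | f(a) in subset} for a function stored as a dict."""
    return {a for a, b in f.items() if b in subset}

def subsets(s):
    """Yield every subset of the finite set s."""
    items = list(s)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield set(combo)

def theorem_5_10_holds(f, A, B):
    """Check parts (a), (b), (c) for all subsets B1, B2 of B."""
    for B1 in subsets(B):
        if preimage(f, B - B1) != A - preimage(f, B1):            # part (c)
            return False
        for B2 in subsets(B):
            if preimage(f, B1 & B2) != preimage(f, B1) & preimage(f, B2):   # part (a)
                return False
            if preimage(f, B1 | B2) != preimage(f, B1) | preimage(f, B2):   # part (b)
                return False
    return True

A = {1, 2, 3, 4, 5, 6}
B = {6, 7, 8, 9, 10}
f = {1: 7, 2: 7, 3: 8, 4: 6, 5: 9, 6: 9}
print(theorem_5_10_holds(f, A, B))
```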
Using the notation of the preimage, we see that a function f: A → B is one-to-one if
and only if |f⁻¹(b)| ≤ 1 for each b ∈ B.
Discrete mathematics is primarily concerned with finite sets, and the last result of this
section demonstrates how the property of finiteness can yield results that fail to be true in
general. In addition, it provides an application of the pigeonhole principle.
THEOREM 5.11 Let f: A → B for finite sets A and B, where |A| = |B|. Then the following statements are
equivalent: (a) f is one-to-one; (b) f is onto; and (c) f is invertible.
Proof: We have already shown in Theorem 5.8 that (c) ⇒ (a) and (b), and that together (a),
(b) ⇒ (c). Consequently, this theorem will follow when we show that for these conditions
on A, B, (a) ⟺ (b). Assuming (b), if f is not one-to-one, then there are elements a₁, a₂ ∈
A, with a₁ ≠ a₂, but f(a₁) = f(a₂). Then |A| > |f(A)| = |B|, contradicting |A| = |B|.
Conversely, if f is not onto, then |f(A)| < |B|. With |A| = |B| we have |A| > |f(A)|, and
it follows from the pigeonhole principle that f is not one-to-one.
Using Theorem 5.11 we now verify the combinatorial identity introduced in Problem 6
at the start of this chapter. For if n ∈ Z⁺ and |A| = |B| = n, there are n! one-to-one
functions from A to B and Σ_{k=0}^{n} (−1)^k C(n, n − k)(n − k)^n onto functions from A to B. The
equality n! = Σ_{k=0}^{n} (−1)^k C(n, n − k)(n − k)^n is then the numerical equivalent of parts (a) and
(b) of Theorem 5.11. [This is also the reason why the diagonal elements S(n, n), 1 ≤ n ≤ 8,
shown in Table 5.1 all equal 1.]
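The identity can be confirmed numerically for small n. The sketch below evaluates the alternating sum with Python's math.comb and compares it with n!; the function name count_onto is hypothetical.

```python
from math import comb, factorial

def count_onto(n):
    """Inclusion-exclusion count of onto functions when |A| = |B| = n."""
    return sum((-1) ** k * comb(n, n - k) * (n - k) ** n for k in range(n + 1))

for n in range(1, 9):
    print(n, factorial(n), count_onto(n))
```

For each n the two printed counts coincide, as Theorem 5.11 predicts: with |A| = |B| = n, the one-to-one functions are exactly the onto functions.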
EXERCISES 5.6
1. a) For A = {1, 2, 3, 4, ..., 7}, how many bijective func-
tions f: A → A satisfy f(1) ≠ 1?
b) Answer part (a) where A = {x | x ∈ Z⁺, 1 ≤ x ≤ n}, for
some fixed n ∈ Z⁺.
2. a) For A = (−2, 7] ⊆ R define the functions
f, g: A → R by
    f(x) = 2x − 4 and g(x) = (2x² − 8)/(x + 2).
Verify that f = g.
b) Is the result in part (a) affected if we change A to
[−7, 2)?
3. Let f, g: R → R, where g(x) = 1 − x + x² and f(x) =
ax + b. If (g ∘ f)(x) = 9x² − 9x + 3, determine a, b.
4. Let g: N → N be defined by g(n) = 2n. If A = {1, 2, 3, 4}
and f: A → N is given by f = {(1, 2), (2, 3), (3, 5), (4, 7)},
find g ∘ f.
5. If U is a given universe with (fixed) S, T ⊆ U, define
g: P(U) → P(U) by g(A) = T ∩ (S ∪ A) for A ⊆ U. Prove
that g² = g.
6. Let f, g: R → R where f(x) = ax + b and g(x) = cx + d
for all x ∈ R, with a, b, c, d real constants. What relationship(s)
must be satisfied by a, b, c, d if (f ∘ g)(x) = (g ∘ f)(x) for all
x ∈ R?
7. Let f, g, h: Z → Z be defined by f(x) = x − 1,
g(x) = 3x,
    h(x) = 0, x even
    h(x) = 1, x odd.
Determine
(a) f ∘ g, g ∘ f, g ∘ h, h ∘ g, f ∘ (g ∘ h),
(f ∘ g) ∘ h; (b) f², f³, g², g³, h², h³, h⁵⁰⁰.
8. Let f: A → B, g: B → C. Prove that (a) if g ∘ f: A → C
is onto, then g is onto; and (b) if g ∘ f: A → C is one-to-one,
then f is one-to-one.
9. a) Find the inverse of the function f: R → R⁺ defined by
f(x) = e^(2x+5).
b) Show that f ∘ f⁻¹ = 1_{R⁺} and f⁻¹ ∘ f = 1_R.
10. For each of the following functions f: R → R, determine
whether f is invertible, and, if so, determine f⁻¹.
a) f = {(x, y) | 2x + 3y = 7}
b) f = {(x, y) | ax + by = c, b ≠ 0}
c) f = {(x, y) | y = x³}
d) f = {(x, y) | y = x² + x}
11. Prove Theorem 5.9.
12. If A = {1, 2, 3, 4, 5, 6, 7}, B = {2, 4, 6, 8, 10, 12}, and
f: A → B where f = {(1, 2), (2, 6), (3, 6), (4, 8), (5, 6),
(6, 8), (7, 12)}, determine the preimage of B₁ under f in
each of the following cases.
a) B₁ = {2}                  b) B₁ = {6}
c) B₁ = {6, 8}               d) B₁ = {6, 8, 10}
e) B₁ = {6, 8, 10, 12}       f) B₁ = {10, 12}
13. Let f: R → R be defined by
    f(x) = x + 7,      x ≤ 0
    f(x) = −2x + 5,    0 < x < 3
    f(x) = x − 1,      3 ≤ x
a) Find f⁻¹(−10), f⁻¹(0), f⁻¹(4), f⁻¹(6), f⁻¹(7), and
f⁻¹(8).
b) Determine the preimage under f for each of the inter-
vals (i) [−5, −1]; (ii) [−5, 0]; (iii) [−2, 4]; (iv) (5, 10);
and (v) [11, 17).
14. Let f: R → R be defined by f(x) = x². For each of the
following subsets B of R, find f⁻¹(B).
a) B = {0, 1}        b) B = {−1, 0, 1}
c) B = [0, 1]        d) B = (0, 1)
e) B = [0, 4]        f) B = (0, 1) ∪ (4, 9)
5.7 Computational Complexity 289
15. Let A = {1, 2, 3, 4, 5} and B = {6, 7, 8, 9, 10, 11, 12}.
How many functions f: A → B are such that f⁻¹({6, 7, 8}) =
{1, 2}?
16. Let f: R → R be defined by f(x) = ⌊x⌋, the greatest
integer in x. Find f⁻¹(B) for each of the following subsets B
of R.
a) B = {0, 1}        b) B = {−1, 0, 1}
c) B = [0, 1)        d) B = [0, 2)
e) B = [−1, 2]       f) B = [−1, 0) ∪ (1, 3]
17. Let f, g: Z⁺ → Z⁺ where for all x ∈ Z⁺, f(x) = x + 1
and g(x) = max{1, x − 1}, the maximum of 1 and x − 1.
a) What is the range of f?
b) Is f an onto function?
c) Is the function f one-to-one?
d) What is the range of g?
e) Is g an onto function?
f) Is the function g one-to-one?
g) Show that g ∘ f = 1_{Z⁺}.
h) Determine (f ∘ g)(x) for x = 2, 3, 4, 7, 12, and 25.
i) Do the answers for parts (b), (g), and (h) contradict the
result in Theorem 5.8?
18. Let f, g, h denote the following closed binary operations
on P(Z⁺). For A, B ⊆ Z⁺, f(A, B) = A ∩ B, g(A, B) =
A ∪ B, h(A, B) = A Δ B.
a) Are any of the functions one-to-one?
b) Are any of f, g, and h onto functions?
c) Is any one of the given functions invertible?
d) Are any of the following sets infinite?
(1) f⁻¹(∅)           (2) g⁻¹(∅)
(3) h⁻¹(∅)           (4) f⁻¹({1})
(5) g⁻¹({2})         (6) h⁻¹({3})
(7) f⁻¹({4, 7})      (8) g⁻¹({8, 12})
(9) h⁻¹({5, 9})
e) Determine the number of elements in each of the finite
sets in part (d).
19. Prove parts (a) and (c) of Theorem 5.10.
20. a) Give an example of a function f: Z → Z where (i) f is
one-to-one but not onto; and (ii) f is onto but not one-to-
one.
b) Do the examples in part (a) contradict Theorem 5.11?
21. Let f: Z → N be defined by
    f(x) = 2x − 1,    if x > 0
    f(x) = −2x,       for x ≤ 0.
a) Prove that f is one-to-one and onto.
b) Determine f⁻¹.
22. If |A| = |B| = 5, how many functions f: A → B are
invertible?
23. Let f, g, h, k: N → N where f(n) = 3n, g(n) = ⌊n/3⌋,
h(n) = ⌊(n + 1)/3⌋, and k(n) = ⌊(n + 2)/3⌋, for each n ∈ N.
(a) For each n ∈ N what are (g ∘ f)(n), (h ∘ f)(n), and
(k ∘ f)(n)? (b) Do the results in part (a) contradict Theo-
rem 5.7?
5.7
Computational Complexity†
In Section 4.4 we introduced the concept of an algorithm, following the examples set forth
by the division algorithm (of Section 4.3) and the Euclidean algorithm (of Section 4.4). At
that time we were concerned with certain properties of a general algorithm:
• The precision of the individual step-by-step instructions
• The input provided to the algorithm, and the output the algorithm then provides
• The ability of the algorithm to solve a certain type of problem, not just specific instances
of the problem
• The uniqueness of the intermediate and final results, based on the input
• The finite nature of the algorithm in that it terminates after the execution of a finite
number of instructions
†The material in Sections 5.7 and 5.8 may be skipped at this point. It will not be used very much until Chapter
10. The only place where this material appears before Chapter 10 is in Example 7.13, but that example can be
omitted without any loss of continuity.
When an algorithm correctly solves a certain type of problem and satisfies these five
conditions, then we may find ourselves examining it further in the following ways.
1) Can we somehow measure how long it takes the algorithm to solve a problem of a
certain size? Whether we can may very well depend, for example, on the compiler
being used, so we want to develop a measure that doesn’t actually depend on such
considerations as compilers, execution speeds, or other characteristics of a given
computer.
For example, if we want to compute aⁿ for a ∈ R and n ∈ Z⁺, is there some
“function of n” that can describe how fast a given algorithm for such exponentiation
accomplishes this?
2) Suppose we can answer questions such as the one set forth at the start of item 1. Then
if we have two (or more) algorithms that solve a given problem, is there perhaps a
way to determine whether one algorithm is “better” than another?
In particular, suppose we consider the problem of determining whether a certain real
number x is present in the list of n real numbers a₁, a₂, ..., aₙ. Here we have a problem
of size n.
If there is an algorithm that solves this problem, how long does it take to do so? To
measure this we seek a function f(n), called the time-complexity function‡ of the algorithm.
We expect (both here and in general) that the value of f(n) will increase as n increases.
Also, our major concern in dealing with any algorithm is how the algorithm performs for
large values of n.
In order to study what has now been described in a somewhat informal manner, we need
to introduce the following fundamental idea.
Definition 5.23 Let f, g: Z⁺ → R. We say that g dominates f (or f is dominated by g) if there exist
constants m ∈ R⁺ and k ∈ Z⁺ such that |f(n)| ≤ m|g(n)| for all n ∈ Z⁺, where n ≥ k.
Note that as we consider the values of f(1), g(1), f(2), g(2), ..., there is a point
(namely, k) after which the size of f(n) is bounded above by a positive multiple (m) of
the size of g(n). Also, when g dominates f, then |f(n)/g(n)| ≤ m [that is, the size of the
quotient f(n)/g(n) is bounded by m], for those n ∈ Z⁺ where n ≥ k and g(n) ≠ 0.
When f is dominated by g we say that f is of order (at most) g and we use what is
called “big-Oh” notation to designate this. We write f ∈ O(g), where O(g) is read “order
g” or “big-Oh of g.” As suggested by the notation “f ∈ O(g),” O(g) represents the set of
all functions with domain Z⁺ and codomain R that are dominated by g. These ideas are
demonstrated in the following examples.
EXAMPLE 5.65 Let f, g: Z⁺ → R be given by f(n) = 5n, g(n) = n², for n ∈ Z⁺. If we compute f(n)
and g(n) for 1 ≤ n ≤ 4, we find that f(1) = 5, g(1) = 1; f(2) = 10, g(2) = 4; f(3) =
‡We could also study the space-complexity function of an algorithm, which we need when we attempt to
measure the amount of memory required for the execution of an algorithm on a problem of size n. In this text,
however, we limit our study to the time-complexity function.
15, g(3) = 9; and f(4) = 20, g(4) = 16. However, n ≥ 5 ⇒ n² ≥ 5n, and we have
|f(n)| = 5n ≤ n² = |g(n)|. So with m = 1 and k = 5, we find that for n ≥ k, |f(n)| ≤
m|g(n)|. Consequently, g dominates f and f ∈ O(g). [Note that |f(n)/g(n)| is bounded
by 1 for all n ≥ 5.]
We also realize that for all n ∈ Z⁺, |f(n)| = 5n ≤ 5n² = 5|g(n)|. So the dominance of f
by g is shown here with k = 1 and m = 5. This is enough to demonstrate that the constants
k and m of Definition 5.23 need not be unique.
Furthermore, we can generalize this result if we now consider functions f₁, g₁: Z⁺ → R
defined by f₁(n) = an, g₁(n) = bn², where a, b are nonzero real numbers. For if m ∈ R⁺
with m|b| ≥ |a|, then for all n ≥ 1 (= k), |f₁(n)| = |an| = |a|n ≤ m|b|n ≤ m|b|n² =
m|bn²| = m|g₁(n)|, and so f₁ ∈ O(g₁).
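The inequality in Definition 5.23 can be tested over a finite range of n for the two pairs of constants found in Example 5.65. This is only a sketch: a finite check illustrates the definition but cannot replace the argument for all n ≥ k. The helper name dominated_on_range and the cutoff 1000 are assumptions made for illustration.

```python
def dominated_on_range(f, g, m, k, upto):
    """Check |f(n)| <= m * |g(n)| for every integer n with k <= n <= upto."""
    return all(abs(f(n)) <= m * abs(g(n)) for n in range(k, upto + 1))

f = lambda n: 5 * n
g = lambda n: n ** 2

print(dominated_on_range(f, g, m=1, k=5, upto=1000))   # constants m = 1, k = 5
print(dominated_on_range(f, g, m=5, k=1, upto=1000))   # constants m = 5, k = 1
print(dominated_on_range(f, g, m=1, k=1, upto=1000))   # m = 1 fails if k = 1
```

The third call shows why the threshold k matters: with m = 1 the inequality fails for n = 1, 2, 3, 4.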
In Example 5.65 we observed that f ∈ O(g). Taking a second look at the functions f
and g, we now want to show that g ∉ O(f).
Once again let f, g: Z* > R be defined by f(n) = 5n, g(n) = n*, forn € Zt.
EXAMPLE 5.66
If g € O(f), then in terms of quantifiers, we would have
dm e Rt ake Z* Vane Zt [n=k) = |g(n)| <m|f~)I].
Consequently, to show that g ¢ O( f), we need to verify that
Vn €R* WkeEZ* Ane Zt [(n=k) A (lg) > ml f()))I.
To accomplish this, we first should realize that m and k are arbitrary, so we have no control
over their values. The only number over which we have control is the positive integer n
that we select. Now no matter what the values of m and k happen to be, we can select
n € Z* such that n > max{5m, k}. Then n > k (actually n > k) andn > 5m > n? > 5mn,
so |g(n)| =n? > Smn = m|5n| = m| f(n)| andg ¢ O(f).
For those who prefer the method of proof by contradiction, we present a second approach.
If $g \in O(f)$, then we would have
$$n^2 = |g(n)| \le m|f(n)| = 5mn$$
for all $n \ge k$, where $k$ is some fixed positive integer and $m$ is a (real) constant. But then
from $n^2 \le 5mn$ we deduce that $n \le 5m$. This is impossible because $n$ $(\in \mathbf{Z}^+)$ is a variable that
can increase without bound while $m$ is still a constant.
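The choice of $n$ in Example 5.66 can be made concrete. The following Python sketch (the function name `witness` and the sample values of $m$ and $k$ are assumptions made for illustration) produces, for any given $m$ and $k$, an integer $n > \max\{5m, k\}$ and confirms that it violates $|g(n)| \le m|f(n)|$.

```python
import math


def witness(m, k):
    """Return an integer n with n > max(5m, k), as selected in Example 5.66."""
    return math.floor(max(5 * m, k)) + 1


# A few arbitrary (m, k) pairs; no pair can rescue the claim g in O(f).
for m, k in [(1, 1), (2.5, 10), (1000, 3)]:
    n = witness(m, k)
    g_n = n ** 2            # |g(n)| = n^2
    bound = m * (5 * n)     # m * |f(n)| = 5mn
    print(m, k, n, g_n > bound)
```

Whatever constants are proposed, the inequality required by Definition 5.23 fails at the returned $n$, which is the content of the quantified statement above.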
EXAMPLE 5.67
a) Let $f, g: \mathbf{Z}^+ \to \mathbf{R}$ with $f(n) = 5n^2 + 3n + 1$ and $g(n) = n^2$. Then $|f(n)| =
|5n^2 + 3n + 1| = 5n^2 + 3n + 1 \le 5n^2 + 3n^2 + n^2 = 9n^2 = 9|g(n)|$. Hence for all
$n \ge 1$ $(= k)$, $|f(n)| \le m|g(n)|$ for any $m \ge 9$, and $f \in O(g)$. We can also write
$f \in O(n^2)$ in this case.
In addition, $|g(n)| = n^2 \le 5n^2 \le 5n^2 + 3n + 1 = |f(n)|$ for all $n \ge 1$. So $|g(n)| \le
m|f(n)|$ for any $m \ge 1$ and all $n \ge k \ge 1$. Consequently $g \in O(f)$. [In fact, $O(g) =
O(f)$; that is, any function from $\mathbf{Z}^+$ to $\mathbf{R}$ that is dominated by one of $f$, $g$ is also
dominated by the other. We shall examine this result for the general case in the Section
Exercises.]
b) Now consider $f, g: \mathbf{Z}^+ \to \mathbf{R}$ with $f(n) = 3n^3 + 7n^2 - 4n + 2$ and $g(n) = n^3$. Here
we have $|f(n)| = |3n^3 + 7n^2 - 4n + 2| \le |3n^3| + |7n^2| + |-4n| + |2| \le 3n^3 +
7n^3 + 4n^3 + 2n^3 = 16n^3 = 16|g(n)|$, for all $n \ge 1$. So with $m = 16$ and $k = 1$, we
find that $f$ is dominated by $g$, and $f \in O(g)$, or $f \in O(n^3)$.
Since $7n - 4 > 0$ for all $n \ge 1$, we can write $n^3 \le 3n^3 \le 3n^3 + (7n - 4)n + 2$
whenever $n \ge 1$. Then $|g(n)| \le |f(n)|$ for all $n \ge 1$, and $g \in O(f)$. [As in part (a), we
also have $O(f) = O(g) = O(n^3)$ in this case.]
We generalize the results of Example 5.67 as follows. Let $f: \mathbf{Z}^+ \to \mathbf{R}$ be the polynomial
function where $f(n) = a_t n^t + a_{t-1} n^{t-1} + \cdots + a_2 n^2 + a_1 n + a_0$, for $a_t, a_{t-1}, \ldots, a_2,
a_1, a_0 \in \mathbf{R}$, $a_t \ne 0$, $t \in \mathbf{N}$. Then
$$\begin{aligned}
|f(n)| &= |a_t n^t + a_{t-1} n^{t-1} + \cdots + a_2 n^2 + a_1 n + a_0| \\
&\le |a_t n^t| + |a_{t-1} n^{t-1}| + \cdots + |a_2 n^2| + |a_1 n| + |a_0| \\
&= |a_t| n^t + |a_{t-1}| n^{t-1} + \cdots + |a_2| n^2 + |a_1| n + |a_0| \\
&\le |a_t| n^t + |a_{t-1}| n^t + \cdots + |a_2| n^t + |a_1| n^t + |a_0| n^t \\
&= (|a_t| + |a_{t-1}| + \cdots + |a_2| + |a_1| + |a_0|) n^t.
\end{aligned}$$
In Definition 5.23, let $m = |a_t| + |a_{t-1}| + \cdots + |a_2| + |a_1| + |a_0|$ and $k = 1$, and let
$g: \mathbf{Z}^+ \to \mathbf{R}$ be given by $g(n) = n^t$. Then $|f(n)| \le m|g(n)|$ for all $n \ge k$, so $f$ is dominated
by $g$, or $f \in O(n^t)$.
It is also true that $g \in O(f)$ and that $O(f) = O(g) = O(n^t)$.
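The constant $m = |a_t| + \cdots + |a_0|$ from this argument is easy to compute for a specific polynomial. The Python sketch below is an illustration under stated assumptions: the names `poly` and `bound_holds`, the coefficient ordering `[a_0, a_1, ..., a_t]`, and the finite test range are all choices made here rather than anything fixed by the text.

```python
def poly(coeffs, n):
    """Evaluate a_0 + a_1*n + ... + a_t*n**t, where coeffs = [a_0, a_1, ..., a_t]."""
    return sum(a * n ** i for i, a in enumerate(coeffs))


def bound_holds(coeffs, upto=1000):
    """Check |f(n)| <= m * n**t for 1 <= n <= upto, with m = |a_t| + ... + |a_0|."""
    t = len(coeffs) - 1
    m = sum(abs(a) for a in coeffs)
    return all(abs(poly(coeffs, n)) <= m * n ** t for n in range(1, upto + 1))


# f(n) = 3n^3 + 7n^2 - 4n + 2 from Example 5.67(b); here t = 3 and m = 16.
print(bound_holds([2, -4, 7, 3]))
```

For the polynomial of Example 5.67(b) the computed $m$ agrees with the value 16 obtained there by hand.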
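This generalization provides the following special results on summations.
EXAMPLE 5.68
a) Let $f: \mathbf{Z}^+ \to \mathbf{R}$ be given by $f(n) = 1 + 2 + 3 + \cdots + n$. Then (from Examples 1.40
and 4.1) $f(n) = \left(\frac{1}{2}\right)(n)(n + 1) = \left(\frac{1}{2}\right) n^2 + \left(\frac{1}{2}\right) n$, so $f \in O(n^2)$.
b) If $g: \mathbf{Z}^+ \to \mathbf{R}$ with $g(n) = 1^2 + 2^2 + 3^2 + \cdots + n^2 = \left(\frac{1}{6}\right)(n)(n + 1)(2n + 1)$ (from
Example 4.4), then $g(n) = \left(\frac{1}{3}\right) n^3 + \left(\frac{1}{2}\right) n^2 + \left(\frac{1}{6}\right) n$ and $g \in O(n^3)$.
c) If $t \in \mathbf{Z}^+$, and $h: \mathbf{Z}^+ \to \mathbf{R}$ is defined by $h(n) = \sum_{i=1}^{n} i^t$, then $h(n) = 1^t + 2^t +
3^t + \cdots + n^t \le n^t + n^t + n^t + \cdots + n^t = n(n^t) = n^{t+1}$, so $h \in O(n^{t+1})$.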
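The closed forms in parts (a) and (b) and the bound in part (c) can be checked by direct summation. This is a small Python sketch; the function name `power_sum` and the tested ranges of $n$ and $t$ are assumptions chosen only for illustration.

```python
def power_sum(n, t):
    """Return 1**t + 2**t + ... + n**t by direct summation."""
    return sum(i ** t for i in range(1, n + 1))


for n in range(1, 201):
    # Part (a): 1 + 2 + ... + n = n(n + 1)/2
    assert power_sum(n, 1) == n * (n + 1) // 2
    # Part (b): 1^2 + 2^2 + ... + n^2 = n(n + 1)(2n + 1)/6
    assert power_sum(n, 2) == n * (n + 1) * (2 * n + 1) // 6
    # Part (c): each of the n terms is at most n^t, so the sum is at most n^(t+1)
    for t in range(1, 6):
        assert power_sum(n, t) <= n ** (t + 1)

print("all checks passed")
```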
Now that we have examined several examples of function dominance, we shall close this
section with two final observations. In the next section we shall apply the idea of function
dominance in the analysis of algorithms.
1) When dealing with the concept of function dominance, we seek the best (or tightest)
bound in the following sense. Suppose that $f, g, h: \mathbf{Z}^+ \to \mathbf{R}$, where $f \in O(g)$ and
$g \in O(h)$. Then we also have $f \in O(h)$. (A proof for this is requested in the Section
Exercises.) If $h \notin O(g)$, however, the statement $f \in O(g)$ provides a “better” bound
on $|f(n)|$ than the statement $f \in O(h)$. For example, if $f(n) = 5$, $g(n) = 5n$, and
$h(n) = n^2$, for all $n \in \mathbf{Z}^+$, then $f \in O(g)$, $g \in O(h)$, and $f \in O(h)$, but $h \notin O(g)$.
Therefore, we are provided with more information by the statement $f \in O(g)$ than
by the statement $f \in O(h)$.
2) Certain orders, such as $O(n)$ and $O(n^2)$, often occur when we deal with function
dominance. Therefore they have come to be designated by special names. Some of
the most important of these orders are listed in Table 5.11.
Table 5.11
Big-Oh Form                          Name
$O(1)$                               Constant
$O(\log_2 n)$                        Logarithmic
$O(n)$                               Linear
$O(n \log_2 n)$                      $n \log_2 n$
$O(n^2)$                             Quadratic
$O(n^3)$                             Cubic
$O(n^m)$, $m = 0, 1, 2, 3, \ldots$   Polynomial
$O(c^n)$, $c > 1$                    Exponential
$O(n!)$                              Factorial
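EXERCISES 5.7
1. Use the results of Table 5.11 to determine the best “big-Oh” form for each of the
following functions $f: \mathbf{Z}^+ \to \mathbf{R}$.
a) $f(n) = 3n + 7$
b) $f(n) = 3 + \sin(1/n)$
c) $f(n) = n^3 - 5n^2 + 25n - 165$
d) $f(n) = 5n^2 + 3n \log_2 n$
e) $f(n) = n^2 + (n - 1)^3$
f) $f(n) = \dfrac{n(n + 1)(n + 2)}{n + 3}$
g) $f(n) = 2 + 4 + 6 + \cdots + 2n$
2. Let $f, g: \mathbf{Z}^+ \to \mathbf{R}$, where $f(n) = n$ and $g(n) = n + (1/n)$, for $n \in \mathbf{Z}^+$. Use
Definition 5.23 to show that $f \in O(g)$ and $g \in O(f)$.
3. In each of the following, $f, g: \mathbf{Z}^+ \to \mathbf{R}$. Use Definition 5.23 to show that $g$ dominates $f$.
a) $f(n) = 100 \log_2 n$, $g(n) = \left(\frac{1}{2}\right) n$
b) $f(n) = 2^n$, $g(n) = 2^{2n} - 1000$
c) $f(n) = 3n^2$, $g(n) = 2^n + 2n$
4. Let $f, g: \mathbf{Z}^+ \to \mathbf{R}$ be defined by $f(n) = n + 100$, $g(n) = n^2$. Use Definition 5.23 to
show that $f \in O(g)$ but $g \notin O(f)$.
5. Let $f, g: \mathbf{Z}^+ \to \mathbf{R}$, where $f(n) = n^2 + n$ and $g(n) = \left(\frac{1}{2}\right) n^3$, for $n \in \mathbf{Z}^+$. Use
Definition 5.23 to show that $f \in O(g)$ but $g \notin O(f)$.
6. Let $f, g: \mathbf{Z}^+ \to \mathbf{R}$ be defined as follows:
$$f(n) = \begin{cases} n, & \text{for } n \text{ odd} \\ 1, & \text{for } n \text{ even} \end{cases} \qquad g(n) = \begin{cases} 1, & \text{for } n \text{ odd} \\ n, & \text{for } n \text{ even} \end{cases}$$
Verify that $f \notin O(g)$ and $g \notin O(f)$.
7. Let $f, g: \mathbf{Z}^+ \to \mathbf{R}$ where $f(n) = n$ and $g(n) = \log_2 n$, for $n \in \mathbf{Z}^+$. Show that
$g \in O(f)$ but $f \notin O(g)$. (Hint: $\lim_{n \to \infty} \dfrac{n}{\log_2 n} = +\infty$. This requires the use of calculus.)
8. Let $f, g, h: \mathbf{Z}^+ \to \mathbf{R}$ where $f \in O(g)$ and $g \in O(h)$. Prove that $f \in O(h)$.
9. If $g: \mathbf{Z}^+ \to \mathbf{R}$ and $c \in \mathbf{R}$, we define the function $cg: \mathbf{Z}^+ \to \mathbf{R}$ by $(cg)(n) = c(g(n))$,
for each $n \in \mathbf{Z}^+$. Prove that if $f, g: \mathbf{Z}^+ \to \mathbf{R}$ with $f \in O(g)$, then $f \in O(cg)$ for all
$c \in \mathbf{R}$, $c \ne 0$.
10. a) Prove that $f \in O(f)$ for all $f: \mathbf{Z}^+ \to \mathbf{R}$.
b) Let $f, g: \mathbf{Z}^+ \to \mathbf{R}$. If $f \in O(g)$ and $g \in O(f)$, prove that $O(f) = O(g)$. That is,
prove that for all $h: \mathbf{Z}^+ \to \mathbf{R}$, if $h$ is dominated by $f$, then $h$ is dominated by $g$, and
conversely.
c) If $f, g: \mathbf{Z}^+ \to \mathbf{R}$, prove that if $O(f) = O(g)$, then $f \in O(g)$ and $g \in O(f)$.
11. The following is analogous to the “big-Oh” notation introduced in conjunction with
Definition 5.23.
For $f, g: \mathbf{Z}^+ \to \mathbf{R}$ we say that $f$ is of order at least $g$ if there exist constants
$M \in \mathbf{R}^+$ and $k \in \mathbf{Z}^+$ such that $|f(n)| \ge M|g(n)|$ for all $n \in \mathbf{Z}^+$, where $n \ge k$. In this
case we write $f \in \Omega(g)$ and say that $f$ is “big Omega of $g$.” So $\Omega(g)$ represents the set
of all functions with domain $\mathbf{Z}^+$ and codomain $\mathbf{R}$ that dominate $g$.
Suppose that $f, g, h: \mathbf{Z}^+ \to \mathbf{R}$, where $f(n) = 5n^2 + 3n$, $g(n) = n^2$, $h(n) = n$, for all
$n \in \mathbf{Z}^+$. Prove that (a) $f \in \Omega(g)$; (b) $g \in \Omega(f)$; (c) $f \in \Omega(h)$; and (d) $h \notin \Omega(f)$, that
is, $h$ is not “big Omega of $f$.”
12. Let $f, g: \mathbf{Z}^+ \to \mathbf{R}$. Prove that $f \in \Omega(g)$ if and only if $g \in O(f)$.
13. a) Let $f: \mathbf{Z}^+ \to \mathbf{R}$ where $f(n) = \sum_{i=1}^{n} i$. When $n = 4$, for example, we have
$f(n) = f(4) = 1 + 2 + 3 + 4 > 2 + 3 + 4 > 2 + 2 + 2 = 3 \cdot 2 = \lceil (4 + 1)/2 \rceil 2 = 6 >
(4/2)^2 = (n/2)^2$. For $n = 5$, we find $f(n) = f(5) = 1 + 2 + 3 + 4 + 5 > 3 + 4 + 5 >
3 + 3 + 3 = 3 \cdot 3 = \lceil (5 + 1)/2 \rceil 3 = 9 > (5/2)^2 = (n/2)^2$. In general,
$f(n) = 1 + 2 + \cdots + n \ge \lceil n/2 \rceil + \cdots + n \ge \lceil n/2 \rceil + \cdots + \lceil n/2 \rceil =
\lceil (n + 1)/2 \rceil \lceil n/2 \rceil > n^2/4$.
Consequently, $f \in \Omega(n^2)$.
Use
$$\sum_{i=1}^{n} i = \frac{n(n + 1)}{2}$$
to provide an alternative proof that $f \in \Omega(n^2)$.
b) Let $g: \mathbf{Z}^+ \to \mathbf{R}$ where $g(n) = \sum_{i=1}^{n} i^2$. Prove that $g \in \Omega(n^3)$.
c) For $t \in \mathbf{Z}^+$, let $h: \mathbf{Z}^+ \to \mathbf{R}$ where $h(n) = \sum_{i=1}^{n} i^t$. Prove that $h \in \Omega(n^{t+1})$.
14. For $f, g: \mathbf{Z}^+ \to \mathbf{R}$, we say that $f$ is “big Theta of $g$,” and write $f \in \Theta(g)$, when there
exist constants $m_1, m_2 \in \mathbf{R}^+$ and $k \in \mathbf{Z}^+$ such that $m_1|g(n)| \le |f(n)| \le m_2|g(n)|$, for all
$n \in \mathbf{Z}^+$, where $n \ge k$. Prove that $f \in \Theta(g)$ if and only if $f \in \Omega(g)$ and $f \in O(g)$.
15. Let $f, g: \mathbf{Z}^+ \to \mathbf{R}$. Prove that $f \in \Theta(g)$ if and only if $g \in \Theta(f)$.
16. a) Let $f: \mathbf{Z}^+ \to \mathbf{R}$ where $f(n) = \sum_{i=1}^{n} i$. Prove that $f \in \Theta(n^2)$.
b) Let $g: \mathbf{Z}^+ \to \mathbf{R}$ where $g(n) = \sum_{i=1}^{n} i^2$. Prove that $g \in \Theta(n^3)$.
c) For $t \in \mathbf{Z}^+$, let $h: \mathbf{Z}^+ \to \mathbf{R}$ where $h(n) = \sum_{i=1}^{n} i^t$. Prove that $h \in \Theta(n^{t+1})$.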
5.8
Analysis of Algorithms
Now that the reader has been introduced to the concept of function dominance, it is time to
see how this idea is used in the study of algorithms. In this section we present our algorithms
as pseudocode procedures. (We shall also present algorithms as lists of instructions. The
reader will find this to be the case in later chapters.)
We start with a procedure to determine the balance in a savings account.
EXAMPLE 5.69
In Fig. 5.12 we have a procedure (written in pseudocode) for computing the balance in
a savings account $n$ months (for $n \in \mathbf{Z}^+$) after it has been opened. (This balance is the
procedure’s output.) Here the user supplies the value of $n$, the input for the program. The
variables deposit, balance, and rate are real variables, while i is an integer variable. (The
annual interest rate is 0.06.)
procedure AccountBalance(n: integer)
begin
  deposit := 50.00          {The monthly deposit}
  i := 1                    {Initializes the counter}
  rate := 0.005             {The monthly interest rate}
  balance := 100.00         {Initializes the balance}
  while i ≤ n do
    begin
      balance := deposit + balance + balance * rate
      i := i + 1
    end
end
Figure 5.12
Consider the following specific situation. Nathan puts $100.00 in a new account on
January 1. Each month the bank adds the interest (balance * rate) to Nathan’s account—
on the first of the month. In addition, Nathan deposits an additional $50.00 on the first of
each month (starting on February 1). This program tells Nathan the balance in his account
after n months have gone by (assuming that the interest rate does not change). [Note: After
one month, n = 1 and the balance is $50.00 (new deposit) + $100.00 (initial deposit) +
($100.00)(0.005) (the interest) = $150.50. When n = 2 the new balance is $50.00 (new
deposit) + $150.50 (previous balance) + ($150.50) (0.005) (new interest) = $201.25.]
Our objective is to count (measure) the total number of operations (such as assignments,
additions, multiplications, and comparisons) this program segment takes to compute the
balance in Nathan’s account $n$ months after he opened it. We shall let $f(n)$ denote the total
number of these operations. [Then $f: \mathbf{Z}^+ \to \mathbf{R}$. (Actually, $f(\mathbf{Z}^+) \subseteq \mathbf{Z}^+$.)]
The program segment begins with four assignment statements, where the integer variable
i and the real variable balance are initialized, and the values of the real variables deposit
and rate are declared. Then the while loop is executed $n$ times. Each execution of the loop
involves the following seven operations:
1) Comparing the present value of the counter i with n.
2) Increasing the present value of balance to deposit + balance + balance * rate; this
involves one multiplication, two additions, and one assignment.
3) Incrementing the value of the counter by 1; this involves one addition and one as-
signment.
Finally, there is one more comparison. This is made when i = n + 1, so the while loop is
terminated and the other six operations (in steps 2 and 3) are not performed.
Therefore, $f(n) = 4 + 7n + 1 = 7n + 5 \in O(n)$. Consequently, we say that $f \in O(n)$.
For as n gets larger, the “order of magnitude” of 7n + 5 depends primarily on the value n, the
number of times the while loop is executed. Therefore, we could have obtained f € O(n)
by simply counting the number of times the while loop was executed. Such shortcuts will
be used in our calculations for the remaining examples.
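The operation count for Fig. 5.12 can be reproduced by instrumenting the procedure. The Python sketch below is one possible transcription, not the text's own code: the function name `account_balance` and the counter `ops` are assumptions, and the bookkeeping follows the tally described above (four initial assignments, seven operations per pass, one final comparison).

```python
def account_balance(n):
    """Mirror Fig. 5.12; return (balance, number of operations counted)."""
    ops = 4                     # the four initial assignment statements
    deposit, i, rate, balance = 50.00, 1, 0.005, 100.00
    while True:
        ops += 1                # comparison of i with n
        if not i <= n:
            break               # the final comparison, made when i = n + 1
        balance = deposit + balance + balance * rate
        ops += 4                # one multiplication, two additions, one assignment
        i = i + 1
        ops += 2                # one addition, one assignment
    return balance, ops


for n in (1, 2, 12):
    balance, ops = account_balance(n)
    print(n, round(balance, 2), ops, 7 * n + 5)
```

For each $n$ the counted total agrees with $7n + 5$, and the balances for $n = 1$ and $n = 2$ round to the values worked out for Nathan's account.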
Our next example introduces us to a situation where three types of complexity are
determined. These measures are called the best-case complexity, the worst-case complexity,
and the average-case complexity.
EXAMPLE 5.70
In this example we examine a typical searching process. Here an array of $n$ $(\ge 1)$ integers
$a_1, a_2, a_3, \ldots, a_n$ is to be searched for the presence of an integer called key. If the integer
is found, the value of location indicates its first location in the array; if it is not found the
value of location is 0, indicating an unsuccessful search.
We cannot assume that the entries in the array are in any particular order. (If they were,
the problem would be easier and a more efficient algorithm could be developed.) The input
for this algorithm consists of the array (which is read in by the user or provided, perhaps,
as a file from an external source), along with the number $n$ of elements in the array, and the
value of the integer key.
The algorithm is provided in the pseudocode procedure in Fig. 5.13.
We shall define the complexity function $f(n)$ for this algorithm to be the number of
elements in the array that are examined until the value key is found (for the first time) or
the array is exhausted (that is, the number of times the while loop is executed).
procedure LinearSearch(key, n: integer; a_1, a_2, a_3, ..., a_n: integers)
begin
  i := 1                               {initializes the counter}
  while (i ≤ n and key ≠ a_i) do
    i := i + 1
  if i ≤ n then location := i          {successful search}
  else location := 0                   {unsuccessful search}
end  {location is the subscript of the first array entry that equals key;
      location is 0 if key is not found}
Figure 5.13
What is the best thing that can happen in our search for key? If key $= a_1$, we find that key
is the first entry of the array, and we had to compare key with only one element of the array.
In this case we have $f(n) = 1$, and we say that the best-case complexity for our algorithm
is $O(1)$ (that is, it is constant and independent of the size of the array). Unfortunately, we
cannot expect such a situation to occur very often.
From the best situation we turn now to the worst. We see that we have to examine all
$n$ entries of the array if (1) the first occurrence of key is $a_n$ or (2) key is not found in the
array. In either case we have $f(n) = n$, and the worst-case complexity here is $O(n)$. (The
worst-case complexity will typically be considered throughout the text.)
Finally, we wish to obtain an estimate of the average number of array entries examined.
We shall assume that the $n$ entries of the array are distinct and are all equally likely (with
probability $p$) to contain the value key, and that the probability that key is not in the array
is equal to $q$. Consequently, we have $np + q = 1$ and $p = (1 - q)/n$.
For each $1 \le i \le n$, if key equals $a_i$, then $i$ elements of the array have been examined. If
key is not in the array, then all $n$ array elements are examined. Therefore, the average-case
complexity is determined by the average number of array elements examined, which is
$$f(n) = (1 \cdot p + 2 \cdot p + 3 \cdot p + \cdots + n \cdot p) + n \cdot q = p(1 + 2 + 3 + \cdots + n) + nq = \frac{pn(n + 1)}{2} + nq.$$
If $q = 0$, then key is in the array, $p = 1/n$ and $f(n) = (n + 1)/2 \in O(n)$. For $q = 1/2$, we
have an even chance that key is in the array and $f(n) = (1/(2n))[n(n + 1)/2] + (n/2) =
(n + 1)/4 + (n/2) \in O(n)$. [In general, for all $0 \le q \le 1$, we have $f(n) \in O(n)$.]
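The three complexity measures of Example 5.70 can be observed on a concrete implementation. The Python sketch below is an assumption-laden illustration: `linear_search` returns both the 1-based location and the number of entries examined (the quantity the text calls $f(n)$), and `average_examined` simply evaluates the average-case formula derived above; neither name appears in the text. Exact fractions are used so that the special cases $q = 0$ and $q = 1/2$ can be compared without rounding.

```python
from fractions import Fraction


def linear_search(key, a):
    """Return (location, examined) for a search of the list a.

    location is the 1-based index of the first entry equal to key, or 0 if
    key is absent; examined is the number of array entries inspected.
    """
    examined = 0
    for i, entry in enumerate(a, start=1):
        examined += 1
        if entry == key:
            return i, examined
    return 0, examined


def average_examined(n, q):
    """Evaluate p*n*(n + 1)/2 + n*q with p = (1 - q)/n."""
    p = (1 - q) / n
    return p * n * (n + 1) / 2 + n * q


a = [7, 3, 9, 4]
print(linear_search(7, a))   # best case: key is the first entry
print(linear_search(4, a))   # worst case: key is the last entry
print(linear_search(5, a))   # worst case: key is not present
print(average_examined(10, Fraction(0)), average_examined(10, Fraction(1, 2)))
```

With $q = 0$ the formula can also be confirmed empirically by averaging the `examined` count over every key in an array of distinct entries.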
EXAMPLE 5.71†
The result in Example 5.70 for the average number of array elements examined in the linear
search algorithm may also be calculated using the idea of the random variable. When the
algorithm is applied to the array $a_1, a_2, a_3, \ldots, a_n$ (of $n$ distinct integers), we let the discrete
random variable $X$ count the number of array elements examined in the search for the integer
key. Here the sample space can be considered as $\{1, 2, 3, \ldots, n, n^*\}$, where for $1 \le i \le n$,
we have the case where key is found to be $a_i$, so that the $i$ elements $a_1, a_2, a_3, \ldots, a_i$
have been examined. The entry $n^*$ denotes the situation where all $n$ elements are examined
but key is not found among any of the array elements $a_1, a_2, a_3, \ldots, a_n$.
Once again we assume that each array entry has the same probability $p$ of containing
the value key and that $q$ is the probability that key is not in the array. Then $np + q = 1$ and
we have $\Pr(X = i) = p$, for $1 \le i \le n$, and $\Pr(X = n^*) = q$. Consequently, the average
number of array elements examined during the execution of the linear search algorithm is
$$E(X) = \sum_{i=1}^{n} i \Pr(X = i) + n \Pr(X = n^*) = \sum_{i=1}^{n} ip + nq = p(1 + 2 + 3 + \cdots + n) + nq = \frac{pn(n + 1)}{2} + nq.$$
† This example uses the concept of the discrete random variable which was introduced in the optional material
in Section 3.7. It may be skipped without loss of continuity.
Early in the discussion of the previous section, we mentioned how we might want to
compare two algorithms that both correctly solve a given type of problem. Such a compar-
ison can be accomplished by using the time-complexity functions for the algorithms. We
demonstrate this in the next two examples.
EXAMPLE 5.72
The algorithm implemented in the pseudocode procedure of Fig. 5.14 outputs the value of $a^n$
for the input $a$, $n$, where $a$ is a real number and $n$ is a positive integer. The real variable x is
initialized as 1.0 and then used to store the values $a, a^2, a^3, \ldots, a^n$ during execution of the
for loop. Here we define the time-complexity function $f(n)$ for the algorithm as the number
of multiplications that occur in the for loop. Consequently, we have $f(n) = n \in O(n)$.
procedure Power1(a: real; n: positive integer)
begin
  x := 1.0
  for i := 1 to n do
    x := x * a
end
Figure 5.14
EXAMPLE 5.73
In Fig. 5.15 we have a second pseudocode procedure for evaluating $a^n$ for all $a \in \mathbf{R}$,
$n \in \mathbf{Z}^+$. Recall that $\lfloor i/2 \rfloor$ is the greatest integer in (or the floor of) $i/2$.
procedure Power2(a: real; n: positive integer)
begin
  x := 1.0
  i := n
  while i > 0 do
    begin
      if i ≠ 2 * ⌊i/2⌋ then     {i is odd}
        x := x * a
      i := ⌊i/2⌋
      if i > 0 then
        a := a * a
    end
end
Figure 5.15
For this procedure the real variable x is initialized as 1.0 and then used to store the
appropriate powers of $a$ until it contains the value of $a^n$. The results shown in Fig. 5.16
demonstrate what is happening to x (and a) for the cases where $n = 7$ and 8. The numbers 1,
2, 3, and 4 indicate the first, second, third, and fourth times the statements in the while loop
(in particular, the statement i := ⌊i/2⌋) are executed. If $n = 7$, then because $2^2 < 7 < 2^3$,
we have $2 < \log_2 7 < 3$. Here the while loop is executed three times and
$$3 = \lfloor \log_2 7 \rfloor + 1 < \log_2 7 + 1,$$
where $\lfloor \log_2 7 \rfloor$ denotes the greatest integer in $\log_2 7$, which is 2. Also, when $n = 8$, the
number of times the while loop is executed is
$$4 = \lfloor \log_2 8 \rfloor + 1 = \log_2 8 + 1,$$
since $\log_2 8 = 3$.
        n = 7                               n = 8
        x := 1.0                            x := 1.0
        i := 7                              i := 8
   1    x := x * a   {x = a}           1    i := 4
        i := 3                              a := a * a
        a := a * a
   2    x := x * a   {x = a^3}         2    i := 2
        i := 1                              a := a * a
        a := a * a
   3    x := x * a   {x = a^7}         3    i := 1
        i := 0                              a := a * a
                                       4    x := x * a   {x = a^8}
                                            i := 0
   [x = a^7 = a · a^2 · a^4]           [x = (((a)^2)^2)^2]
Figure 5.16
We shall define the time-complexity function $g(n)$ for (the implementation of) this
exponentiation algorithm as the number of times the while loop is executed. This is
also the number of times the statement i := ⌊i/2⌋ is executed. (Here we assume that
the time interval for the computation of each $\lfloor i/2 \rfloor$ is independent of the magnitude
of $i$.) On the basis of the foregoing two observations, we want to establish that for all
$n \ge 1$, $g(n) \le \log_2 n + 1 \in O(\log_2 n)$. We shall establish this by the Principle of Mathematical
Induction (the alternative form, Theorem 4.2) on the value of $n$.
When $n = 1$, we see in Fig. 5.15 that $i$ is odd, x is assigned the value of $a = a^1$, and
$a^1$ is determined after only $1 = \log_2 1 + 1$ execution of the while loop. So $g(1) = 1 \le
\log_2 1 + 1$.
Now assume that for all $1 \le n \le k$, $g(n) \le \log_2 n + 1$. Then for $n = k + 1$, during the
first pass through the while loop the value of $i$ is changed to $\left\lfloor \frac{k+1}{2} \right\rfloor$. Since
$1 \le \left\lfloor \frac{k+1}{2} \right\rfloor \le k$, by the induction hypothesis we shall execute the while loop
$g\left(\left\lfloor \frac{k+1}{2} \right\rfloor\right)$ more times, where
$g\left(\left\lfloor \frac{k+1}{2} \right\rfloor\right) \le \log_2 \left\lfloor \frac{k+1}{2} \right\rfloor + 1$.
Therefore
$$\begin{aligned}
g(k + 1) &\le 1 + \left[\log_2 \left\lfloor \frac{k+1}{2} \right\rfloor + 1\right] \le 1 + \left[\log_2 \left(\frac{k+1}{2}\right) + 1\right] \\
&= 1 + [\log_2(k + 1) - \log_2 2 + 1] = \log_2(k + 1) + 1.
\end{aligned}$$
For the time-complexity function of Example 5.72, we found that $f(n) \in O(n)$. Here we
have $g(n) \in O(\log_2 n)$. It can be verified that $g$ is dominated by $f$ but $f$ is not dominated
by g. Therefore, for large n, this second algorithm is considered more efficient than the first
algorithm (of Example 5.72). (However, note how much easier the pseudocode in Fig. 5.14
is than that of the procedure in Fig. 5.15.)
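The two exponentiation procedures can be compared directly by counting the work each performs. The following Python sketch transcribes Figs. 5.14 and 5.15 under a few assumptions: the names `power1` and `power2` and the returned counters are introduced here for illustration, and the parameter `a` is rebound locally rather than modified for the caller.

```python
def power1(a, n):
    """Fig. 5.14: compute a**n with n multiplications; return (value, multiplications)."""
    x, mults = 1.0, 0
    for _ in range(n):
        x = x * a
        mults += 1
    return x, mults


def power2(a, n):
    """Fig. 5.15: compute a**n by repeated squaring; return (value, passes of the while loop)."""
    x, i, passes = 1.0, n, 0
    while i > 0:
        passes += 1
        if i != 2 * (i // 2):   # i is odd
            x = x * a
        i = i // 2
        if i > 0:
            a = a * a
    return x, passes


for n in (7, 8, 100):
    print(n, power1(2.0, n), power2(2.0, n))
```

For every positive integer $n$ the number of passes made by `power2` equals $\lfloor \log_2 n \rfloor + 1$, which Python exposes as `n.bit_length()`; this is consistent with the bound $g(n) \le \log_2 n + 1$ proved above.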
In closing this section, we shall summarize what we have learned by making the following
observations.
1) The results we established in Examples 5.69, 5.70, 5.72, and 5.73 are useful when
we are dealing with moderate to large values of n. For small values of n, such con-
siderations about time-complexity functions have little purpose.
2) Suppose that algorithms $A_1$ and $A_2$ have time-complexity functions $f(n)$ and $g(n)$,
respectively, where $f(n) \in O(n)$ and $g(n) \in O(n^2)$. We must be cautious here. We
might expect an algorithm with linear complexity to be “perhaps more efficient” than
one with quadratic complexity. But we really need more information. If $f(n) = 1000n$
and $g(n) = n^2$, then algorithm $A_2$ is fine until the problem size $n$ exceeds 1000. If
the problem size is such that we never exceed 1000, then algorithm $A_2$ is the better
choice. However, as we mentioned in observation 1, as $n$ grows larger, the algorithm
of linear complexity becomes the better alternative.
3) In Fig. 5.17 we have graphed a log-linear plot for the functions associated with some
of the orders given in Table 5.11. [Here we have replaced the (discrete) integer variable
$n$ by the (continuous) real variable $n$.] This should help us to develop some feeling
for their relative growth rates (especially for large values of $n$).
Figure 5.17 [log-linear plot of $f(n)$ against $n$ for several of the orders in Table 5.11]
The data in Table 5.12 provide estimates of the running times of algorithms for certain
orders of complexity. Here we have the problem sizes n = 2, 16, and 64, and we assume
that the computer can perform one operation every $10^{-6}$ second = 1 microsecond (on
the average). The entries in the table then estimate the running times in microseconds.
For example, when the problem size is 16 and the order of complexity is $n \log_2 n$, then
the running time is a very brief $16 \log_2 16 = 16 \cdot 4 = 64$ microseconds; for the order of
complexity $2^n$, the running time is $6.5 \times 10^4$ microseconds = 0.065 seconds. Since both of
these time intervals are so short, it is difficult for a human to observe much of a difference
in execution times. Results appear to be instantaneous in either case.
Table 5.12
                              Order of Complexity
Problem size n | log_2 n | n  | n log_2 n | n^2  | 2^n          | n!
2              | 1       | 2  | 2         | 4    | 4            | 2
16             | 4       | 16 | 64        | 256  | 6.5 × 10^4   | 2.1 × 10^13
64             | 6       | 64 | 384       | 4096 | 1.84 × 10^19 | > 10^89
However, such estimates can grow rather rapidly. For instance, suppose we run a program
for which the input is an array A of n different integers. The results from this program are
generated in two parts:
1) First the program implements an algorithm that determines the subsets of A of
size 1. There are n such subsets.
2) Then a second algorithm is implemented to determine all the subsets of $A$. There are
$2^n$ such subsets.
Let us assume that we have a computer that can determine each subset of $A$ in a
microsecond. For the case where $|A| = 64$, the first part of the output is executed almost
instantaneously, in approximately 64 microseconds. For the second part, however, Table
5.12 indicates that the amount of time needed to determine all the subsets of $A$ will be about
$1.84 \times 10^{19}$ microseconds. We cannot be too content with this result, however, since
$$1.84 \times 10^{19} \text{ microseconds} = 2.14 \times 10^{8} \text{ days} = 5845 \text{ centuries}.$$
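The entries of Table 5.12 and the conversion to days and centuries can be recomputed in a few lines. The Python sketch below assumes, as the text does, one operation per microsecond; the function name `running_times`, the dictionary keys, and the use of 365.25 days per year for the century conversion are assumptions made here for illustration.

```python
import math

MICROSECONDS_PER_DAY = 24 * 60 * 60 * 10 ** 6
DAYS_PER_CENTURY = 36525          # assumes 365.25 days per year


def running_times(n):
    """Operation counts (microseconds at one operation per microsecond) for problem size n."""
    return {
        "log2 n": math.log2(n),
        "n": n,
        "n log2 n": n * math.log2(n),
        "n^2": n ** 2,
        "2^n": 2 ** n,
        "n!": math.factorial(n),
    }


for n in (2, 16, 64):
    row = running_times(n)
    print(n, {name: f"{value:.3g}" for name, value in row.items()})

days = 2 ** 64 / MICROSECONDS_PER_DAY
print(f"{days:.3g} days, roughly {days / DAYS_PER_CENTURY:.0f} centuries")
```

The printed rows can be compared against Table 5.12; the final line recomputes the time needed to list all $2^{64}$ subsets.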
EXERCISES 5.8
1. In each of the following pseudocode program segments, the integer variables i, j, n, and
sum are declared earlier in the program. The value of $n$ (a positive integer) is supplied by the
user prior to execution of the segment. In each case we define the time-complexity function
$f(n)$ to be the number of times the statement sum := sum + 1 is executed. Determine the best
“big-Oh” form for $f$.
a) begin
     sum := 0
     for i := 1 to n do
       for j := 1 to n do
         sum := sum + 1
   end
b) begin
     sum := 0
     for i := 1 to n do
       for j := 1 to n * n do
         sum := sum + 1
   end
c) begin
     sum := 0
     for i := 1 to n do
       for j := i to n do
         sum := sum + 1
   end
d) begin
     sum := 0
     i := n
     while i > 0 do
       begin
         sum := sum + 1
         i := ⌊i/2⌋
       end
   end
e) begin
     sum := 0
     for i := 1 to n do
       begin
         j := n
         while j > 0 do
           begin
             sum := sum + 1
             j := ⌊j/2⌋
           end
       end
   end
2. The following pseudocode procedure implements an algorithm for determining the maximum
value in an array $a_1, a_2, a_3, \ldots, a_n$ of integers. Here $n \ge 2$ and the entries in the array need
not be distinct.
procedure Maximum(n: integer; a_1, a_2, a_3, ..., a_n: integers)
begin
  max := a_1
  for i := 2 to n do
    if a_i > max then
      max := a_i
end
a) If the worst-case complexity function $f(n)$ for this segment is determined by the number
of times the comparison a_i > max is executed, find the appropriate “big-Oh” form for $f$.
b) What can we say about the best-case and average-case complexities for this implementation?
3. a) Write a computer program (or develop an algorithm) to locate the first occurrence of the
maximum value in an array $a_1, a_2, a_3, \ldots, a_n$ of integers. (Here $n \in \mathbf{Z}^+$ and the entries
in the array need not be distinct.)
b) Determine the worst-case complexity function for the implementation developed in part (a).
4. a) Write a computer program (or develop an algorithm) to determine the minimum and maximum
values in an array $a_1, a_2, a_3, \ldots, a_n$ of integers. (Here $n \in \mathbf{Z}^+$ with $n \ge 2$, and the
entries in the array need not be distinct.)
b) Determine the worst-case complexity function for the implementation developed in part (a).
5. The following pseudocode procedure can be used to evaluate the polynomial
$$8 - 10x + 7x^2 - 2x^3 + 3x^4 + 12x^5,$$
when $x$ is replaced by an arbitrary (but fixed) real number $r$. For this particular instance,
$n = 5$ and $a_0 = 8$, $a_1 = -10$, $a_2 = 7$, $a_3 = -2$, $a_4 = 3$, and $a_5 = 12$.
procedure PolynomialEvaluation1(n: nonnegative integer; r, a_0, a_1, a_2, ..., a_n: real)
begin
  product := 1.0
  value := a_0
  for i := 1 to n do
    begin
      product := product * r
      value := value + a_i * product
    end
end
a) How many additions take place in the evaluation of the given polynomial? (Do not include
the $n - 1$ additions needed to increment the loop variable $i$.) How many multiplications?
b) Answer the questions in part (a) for the general polynomial
$$a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots + a_{n-1} x^{n-1} + a_n x^n,$$
where $a_0, a_1, a_2, a_3, \ldots, a_{n-1}, a_n$ are real numbers and $n$ is a positive integer.
6. We first note how the polynomial in the previous exercise can be written in the nested
multiplication method:
$$8 + x(-10 + x(7 + x(-2 + x(3 + 12x)))).$$
Using this representation, the following pseudocode procedure (implementing Horner’s method)
can be used to evaluate the given polynomial.
procedure PolynomialEvaluation2(n: nonnegative integer; r, a_0, a_1, a_2, ..., a_n: real)
begin
  value := a_n
  for j := n - 1 down to 0 do
    value := a_j + r * value
end
Answer the questions in parts (a) and (b) of Exercise 5 for the new procedure given here.
7. Let $a_1, a_2, a_3, \ldots$ be the integer sequence defined recursively by
1) $a_1 = 0$; and
2) For $n > 1$, $a_n = 1 + a_{\lfloor n/2 \rfloor}$.
Prove that $a_n = \lfloor \log_2 n \rfloor$ for all $n \in \mathbf{Z}^+$.
8. Let $a_1, a_2, a_3, \ldots$ be the integer sequence defined recursively by
1) $a_1 = 0$; and
2) For $n > 1$, $a_n = 1 + a_{\lceil n/2 \rceil}$.
Find an explicit formula for $a_n$ and prove that your formula is correct.
9. Suppose the probability that the integer key is in the array $a_1, a_2, a_3, \ldots, a_n$ (of $n$
distinct integers) is 3/4 and that each array element has the same probability of containing
this value. If the linear search algorithm of Example 5.70 is applied to this array and value
of key, what is the average number of array elements that are examined?
10. When the linear search algorithm is applied to the array $a_1, a_2, a_3, \ldots, a_n$ (of $n$ distinct
integers) for the integer key, suppose the probability that key has the value $a_i$ is
$i/[n(n + 1)]$, for $1 \le i \le n$. Under these circumstances, what is the average number of array
elements examined?
11. a) Write a computer program (or develop an algorithm) to determine the location of the first
entry in an array $a_1, a_2, a_3, \ldots, a_n$ of integers that repeats a previous entry in the array.
b) Determine the worst-case complexity for the implementation developed in part (a).
12. a) Write a computer program (or develop an algorithm) to determine the location of the first
entry $a_i$ in an array $a_1, a_2, a_3, \ldots, a_n$ of integers, where $a_i < a_{i-1}$.
b) Determine the worst-case complexity for the implementation developed in part (a).
5.9
Summary and Historical Review
In this chapter we developed the function concept, which is of great importance in all areas
of mathematics. Although we were primarily concerned with finite functions, the definition
applies equally well to infinite sets and includes the functions of trigonometry and calculus.
However, we did emphasize the role of a finite function when we transformed a finite set
into a finite set. In this setting, computer output (that terminates) can be thought of as a
function of computer input, and a compiler can be regarded as a function that transforms a
(source) program into a set of machine-language instructions (object program).
The actual word function, in its Latin form, was introduced in 1694 by Gottfried Wil-
helm Leibniz (1646-1716) to denote a quantity associated with a curve (such as the slope
of the curve or the coordinates of a point of the curve). By 1718, under the direction of
Johann Bernoulli (1667-1748), a function was regarded as an algebraic expression made
up of constants and a variable. Equations or formulas involving constants and variables
came later with Leonhard Euler (1707-1783). His is the definition of “function” generally
found in high school mathematics. Also, in about 1734, we find in the work of Euler and
Alexis Clairaut (1713-1765) the notation f(x), which is still in use today.
Gottfried Wilhelm Leibniz (1646-1716)
Euler’s idea remained intact until the time of Jean Baptiste Joseph Fourier (1768-1830),
who found the need for a more general type of function in his investigation of trigonometric
series. In 1837, Peter Gustav Lejeune Dirichlet (1805-1859) set down a more rigorous
formulation of the concepts of variable, function, and the correspondence between the
independent variable x and the dependent variable y, when y = f(x). Dirichlet’s work
emphasized the relationship between two sets of numbers and did not call for the existence
of a formula or expression connecting the two sets. With the developments in set theory
during the nineteenth and twentieth centuries came the generalization of the function as a
particular type of relation.
Peter Gustav Lejeune Dirichlet (1805-1859)
In addition to his fundamental work on the definition of a function, Dirichlet was also
quite active in applied mathematics and in number theory, where he found need for, and
was the first to formally state, the pigeonhole principle. Consequently, this principle is
sometimes referred to as the Dirichlet drawer principle or the Dirichlet box principle.
The nineteenth and twentieth centuries saw the use of the special function, one-to-one
correspondence, in the study of the infinite. In about 1888, Richard Dedekind (1831-1916)
defined an infinite set as one that can be placed into a one-to-one correspondence with a
proper subset of itself. [Galileo (1564-1642) had observed this for the set $\mathbf{Z}^+$.] Two infinite
sets that could be placed in a one-to-one correspondence with each other were said to have
the same transfinite cardinal number. In a series of articles, Georg Cantor (1845-1918)
developed the idea of levels of infinity and showed that |Z| = |Q| but |Z| < |R|. A set A
with |A| = |Z| is called countable, or denumerable, and we write $|\mathbf{Z}| = \aleph_0$ as Cantor did,
using the Hebrew letter aleph, with the subscripted 0, to denote the first level of infinity. To
show that |Z| < |R|, or that the real numbers were uncountable, Cantor devised a technique
now referred to as the Cantor diagonal method. (More about the theory of countable and
uncountable sets can be found in Appendix 3.)
The Stirling numbers of the second kind (in Section 5.3) are named in honor of James
Stirling (1692-1770), a pioneer in the development of generating functions, a topic we
will investigate later in the text. These numbers appear in his work Methodus Differentialis,
published in London in 1730. Stirling was an associate of Sir Isaac Newton (1642-1727) and
was using the Maclaurin series in his work 25 years before Colin Maclaurin (1698-1746).
However, although his name is not attached to this series, it appears in the approximation
known as Stirling’s formula: $n! \approx (2\pi n)^{1/2} e^{-n} n^n$, which, as justice would have it, was
actually developed by Abraham DeMoivre (1667-1754).
Using the counting principles developed in Section 5.3, the results in Table 5.13 extend
the ideas that were summarized in Table 1.11. Here we count the number of ways it is
possible to distribute m objects into n containers, under the conditions prescribed in the
first three columns of the table. (The cases wherein neither the objects nor the containers
are distinct will be covered in Chapter 9.)
Table 5.13

Objects Are   Containers Are   Some Container(s)   Number of
Distinct      Distinct         May Be Empty        Distributions
-----------   --------------   -----------------   -------------
Yes           Yes              Yes                 $n^m$
Yes           Yes              No                  $n!\,S(m, n)$
Yes           No               Yes                 $S(m, 1) + S(m, 2) + \cdots + S(m, n)$
Yes           No               No                  $S(m, n)$
No            Yes              Yes                 $\binom{n+m-1}{m}$
No            Yes              No                  $\binom{n+(m-n)-1}{m-n} = \binom{m-1}{m-n} = \binom{m-1}{n-1}$
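The six rows of Table 5.13 can be evaluated mechanically. The following Python sketch is only an illustration: the function names `stirling2` and `distributions` are our own, and the Stirling numbers are computed with the recurrence S(m + 1, n) = S(m, n - 1) + n S(m, n) from Section 5.3. We assume m, n are positive integers.

```python
from functools import lru_cache
from math import comb, factorial


@lru_cache(maxsize=None)
def stirling2(m: int, n: int) -> int:
    """S(m, n): ways to place m distinct objects into n identical
    containers with no container left empty."""
    if m == n:
        return 1  # covers S(0, 0) = 1 and S(m, m) = 1
    if n == 0 or n > m:
        return 0
    # Object m either occupies a container alone, or joins one of the n
    # containers already used by the other m - 1 objects.
    return stirling2(m - 1, n - 1) + n * stirling2(m - 1, n)


def distributions(m: int, n: int, objects_distinct: bool,
                  containers_distinct: bool, empty_allowed: bool) -> int:
    """Number of distributions for one row of Table 5.13 (m, n >= 1)."""
    if objects_distinct and containers_distinct:
        return n ** m if empty_allowed else factorial(n) * stirling2(m, n)
    if objects_distinct:
        if empty_allowed:
            return sum(stirling2(m, k) for k in range(1, n + 1))
        return stirling2(m, n)
    if containers_distinct:
        return comb(n + m - 1, m) if empty_allowed else comb(m - 1, n - 1)
    # Neither objects nor containers distinct: not covered by this table.
    raise NotImplementedError("identical objects and containers: see Chapter 9")


# Evaluate every row of the table for m = 4 objects and n = 3 containers.
for row in [(True, True, True), (True, True, False), (True, False, True),
            (True, False, False), (False, True, True), (False, True, False)]:
    print(row, distributions(4, 3, *row))
```

For m = 4 and n = 3 the second row counts the onto functions from a set of size 4 to a set of size 3, which agrees with the value 3! S(4, 3) = 36.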
Finally, the “big-Oh” notation of Section 5.7 was introduced by Paul Gustav Heinrich
Bachmann (1837-1920) in his book Analytische Zahlentheorie, an important work on num-
ber theory, published in 1892. This notation has become prominent in approximation theory,
in such areas as numerical analysis and the analysis of algorithms. In general, the notation
$f \in O(g)$ denotes that we do not know the function f explicitly but do know an upper
bound on its order of magnitude. The “big-Oh” symbol is sometimes referred to as the Lan-
dau symbol, in honor of Edmund Landau (1877-1938), who used this notation throughout
his work.
Further properties of the Stirling numbers of the second kind are given in Chapter 4
of D. I. A. Cohen [3] and in Chapter 6 of the text by R. L. Graham, D. E. Knuth, and
O. Patashnik [7]. The article by D. J. Velleman and G. S. Call [11] provides a very interesting
introduction to the Stirling numbers of the second kind, as well as the Eulerian numbers
introduced in Example 4.21. For more on infinite sets and the work of Georg Cantor,
consult Chapter 8 of H. Eves and C. V. Newsom [6] or Chapter IV of R. L. Wilder [12].
The presentation in the book by J. W. Dauben [5] covers the controversy surrounding set
theory at the turn of the century and shows how certain aspects of Cantor’s personal life
played such an integral part in his understanding and defense of set theory.
More examples that demonstrate how to apply the pigeonhole principle are given in
the articles by K. R. Rebman [9] and A. Soifer and E. Lozansky [10]. Further results and
extensions on problems arising from the principle are covered in the article by D. S. Clark
and J. T. Lewis [2]. During the twentieth century a great deal of research has been de-
voted to generalizations of the pigeonhole principle, culminating in the subject of Ramsey
theory, named for Frank Plumpton Ramsey (1903-1930). An interesting introduction to
Ramsey theory can be found in Chapter 5 of D. I. A. Cohen [3]. The text by R. L. Graham,
B. L. Rothschild, and J. H. Spencer [8] provides further worthwhile information.
Extensive coverage on the topic of relational data bases can be found in the work of
C. J. Date [4]. Finally, the text by S. Baase and A. Van Gelder [1] is an excellent place to
continue the study of the analysis of algorithms.
REFERENCES
1. Baase, Sara, and Van Gelder, Allen. Computer Algorithms: Introduction to Design & Analysis,
3rd ed. Reading, Mass.: Addison-Wesley, 2000.
2. Clark, Dean S., and Lewis, James T. “Herbert and the Hungarian Mathematician: Avoiding
Certain Subsequence Sums.” The College Mathematics Journal 21 (March 1990): pp. 100-104.
3. Cohen, Daniel I. A. Basic Techniques of Combinatorial Theory. New York: Wiley, 1978.
4. Date, C. J. An Introduction to Database Systems, 7th ed. Boston, Mass.: Addison-Wesley,
2002.
5. Dauben, Joseph Warren. Georg Cantor: His Mathematics and Philosophy of the Infinite.
Lawrenceville, N. J.: Princeton University Press, 1990.
6. Eves, Howard, and Newsom, Carroll V. An Introduction to the Foundations and Fundamental
Concepts of Mathematics, rev. ed. New York: Holt, 1965.
7. Graham, Ronald L., Knuth, Donald E., and Patashnik, Oren. Concrete Mathematics, 2nd ed.
Reading, Mass.: Addison-Wesley, 1994.
8. Graham, Ronald L., Rothschild, Bruce L., and Spencer, Joel H. Ramsey Theory, 2nd ed. New
York: Wiley, 1980.
9. Rebman, Kenneth R. “The Pigeonhole Principle (What it is, how it works, and how it applies
to map coloring).” The Two-Year College Mathematics Journal, vol. 10, no. 1 (January 1979):
pp. 3-13.
10. Soifer, Alexander, and Lozansky, Edward, “Pigeons in Every Pigeonhole.” Quantum (January
1990): pp. 25-26, 32.
11. Velleman, Daniel J., and Call, Gregory S. “Permutations and Combination Locks.” Mathemat-
ics Magazine 68 (October 1995): pp. 243-253.
12. Wilder, Raymond L. Introduction to the Foundations of Mathematics, 2nd ed. New York:
Wiley, 1965.
SUPPLEMENTARY EXERCISES

1. Let $A, B \subseteq \mathcal{U}$. Prove that
a) $(A \times B) \cap (B \times A) = (A \cap B) \times (A \cap B)$; and
b) $(A \times B) \cup (B \times A) \subseteq (A \cup B) \times (A \cup B)$.

2. Determine whether each of the following statements is true or false. For each false
statement give a counterexample.
a) If $f: A \to B$ and $(a, b), (a, c) \in f$, then $b = c$.
b) If $f: A \to B$ is a one-to-one correspondence and A, B are finite, then $A = B$.
c) If $f: A \to B$ is one-to-one, then f is invertible.
d) If $f: A \to B$ is invertible, then f is one-to-one.
e) If $f: A \to B$ is one-to-one and $g, h: B \to C$ with $g \circ f = h \circ f$, then $g = h$.
f) If $f: A \to B$ and $A_1, A_2 \subseteq A$, then $f(A_1 \cap A_2) = f(A_1) \cap f(A_2)$.
g) If $f: A \to B$ and $B_1, B_2 \subseteq B$, then $f^{-1}(B_1 \cap B_2) = f^{-1}(B_1) \cap f^{-1}(B_2)$.
3. Let $f: \mathbf{R} \to \mathbf{R}$ where $f(ab) = a f(b) + b f(a)$, for all $a, b \in \mathbf{R}$. (a) What is $f(1)$?
(b) What is $f(0)$? (c) If $n \in \mathbf{Z}^+$, $a \in \mathbf{R}$, prove that $f(a^n) = n a^{n-1} f(a)$.

4. Let $A, B \subseteq \mathbf{N}$ with $1 < |A| < |B|$. If there are 262,144 relations from A to B, determine
all possibilities for $|A|$ and $|B|$.

5. If $\mathcal{U}_1, \mathcal{U}_2$ are universal sets with $A, B \subseteq \mathcal{U}_1$, and $C, D \subseteq \mathcal{U}_2$, prove that
$(A \cap B) \times (C \cap D) = (A \times C) \cap (B \times D)$.

6. Let $A = \{1, 2, 3, 4, 5\}$ and $B = \{1, 2, 3, 4, 5, 6\}$. How many one-to-one functions $f: A \to B$
satisfy (a) $f(1) = 3$? (b) $f(1) = 3$, $f(2) = 6$?

7. Determine all real numbers x for which
$x^2 - |x| = 1/2$.

8. Let $\mathcal{R} \subseteq \mathbf{Z}^+ \times \mathbf{Z}^+$ be the relation given by the following recursive definition.
1) $(1, 1) \in \mathcal{R}$; and
2) For all $(a, b) \in \mathcal{R}$, the three ordered pairs $(a + 1, b)$, $(a + 1, b + 1)$, and $(a + 1, b + 2)$
are also in $\mathcal{R}$.
Prove that $2a \geq b$ for all $(a, b) \in \mathcal{R}$.

9. Let a, b denote fixed real numbers and suppose that $f: \mathbf{R} \to \mathbf{R}$ is defined by
$f(x) = a(x + b) - b$, $x \in \mathbf{R}$. (a) Determine $f^2(x)$ and $f^3(x)$. (b) Conjecture a formula for
$f^n(x)$, where $n \in \mathbf{Z}^+$. Now establish the validity of your conjecture.

10. Let $A_1$, A and B be sets with $\{1, 2, 3, 4, 5\} = A_1 \subset A$, $B = \{s, t, u, v, w, x\}$, and
$f: A_1 \to B$. If f can be extended to A in 216 ways, what is $|A|$?

11. Let $A = \{1, 2, 3, 4, 5\}$ and $B = \{t, u, v, w, x, y, z\}$. (a) If a function $f: A \to B$ is randomly
generated, what is the probability that it is one-to-one? (b) Write a computer program (or
develop an algorithm) to generate random functions $f: A \to B$ and have the program print out
how many functions it generates until it generates one that is one-to-one.

12. Let S be a set of seven positive integers the maximum of which is at most 24. Prove that
the sums of the elements in all the nonempty subsets of S cannot be distinct.

13. In a ten-day period Ms. Rosatone typed 84 letters to different clients. She typed 12 of
these letters on the first day, seven on the second day, and three on the ninth day, and she
finished the last eight on the tenth day. Show that for a period of three consecutive days
Ms. Rosatone typed at least 25 letters.

14. If $\{x_1, x_2, \ldots, x_7\} \subseteq \mathbf{Z}^+$, show that for some $i \neq j$, either $x_i + x_j$ or $x_i - x_j$ is
divisible by 10.

15. Let $n \in \mathbf{Z}^+$, n odd. If $i_1, i_2, \ldots, i_n$ is a permutation of the integers $1, 2, \ldots, n$, prove
that $(1 - i_1)(2 - i_2) \cdots (n - i_n)$ is an even integer. (Which counting principle is at work here?)

16. With both of their parents working, Thomas, Stuart, and Craig must handle ten weekly
chores among themselves. (a) In how many ways can they divide up the work so that everyone
is responsible for at least one chore? (b) In how many ways can the chores be assigned if
Thomas, as the eldest, must mow the lawn (one of the ten weekly chores) and no one is allowed
to be idle?

17. Let $n \in \mathbf{N}$, $n \geq 2$. Show that $S(n, 2) = 2^{n-1} - 1$.

18. Mrs. Blasi has five sons (Michael, Rick, David, Kenneth, and Donald) who enjoy reading
books about sports. With Christmas approaching, she visits a bookstore where she finds 12
different books on sports.
a) In how many ways can she select nine of these books?
b) Having made her purchase, in how many ways can she distribute the books among her sons
so that each of them gets at least one book?
c) Two of the nine books Mrs. Blasi purchased deal with basketball, Donald's favorite sport.
In how many ways can she distribute the books among her sons so that Donald gets at least the
two books on basketball?

19. Let $m, n \in \mathbf{Z}^+$ with $n \geq m$. (a) In how many ways can one distribute n distinct objects
among m different containers with no container left empty? (b) In the expansion of
$(x_1 + x_2 + \cdots + x_m)^n$, what is the sum of all the multinomial coefficients
$\binom{n}{n_1, n_2, \ldots, n_m}$ where $n_1 + n_2 + \cdots + n_m = n$ and $n_i > 0$ for all $1 \leq i \leq m$?

20. If $n \in \mathbf{Z}^+$ with $n \geq 4$, verify that $S(n, n - 2) = \binom{n}{3} + 3\binom{n}{4}$.

21. If $f: A \to A$, prove that for all $m, n \in \mathbf{Z}^+$, $f^m \circ f^n = f^n \circ f^m$. (First let $m = 1$ and
induct on n. Then induct on m. This technique is known as double induction.)

22. Let $f: X \to Y$, and for each $i \in I$, let $A_i \subseteq X$. Prove that
a) $f\left(\bigcup_{i \in I} A_i\right) = \bigcup_{i \in I} f(A_i)$.
b) $f\left(\bigcap_{i \in I} A_i\right) \subseteq \bigcap_{i \in I} f(A_i)$.
c) $f\left(\bigcap_{i \in I} A_i\right) = \bigcap_{i \in I} f(A_i)$, for f one-to-one.

23. Given a nonempty set A, let $f: A \to A$ and $g: A \to A$ where
$f(a) = g(f(f(a)))$ and $g(a) = f(g(f(a)))$
for all a in A. Prove that $f = g$.

24. Let A be a set with $|A| = n$.
a) How many closed binary operations are there on A?
b) A closed ternary (3-ary) operation on A is a function $f: A \times A \times A \to A$. How many
closed ternary operations are there on A?
c) A closed k-ary operation on A is a function $f: A_1 \times A_2 \times \cdots \times A_k \to A$, where $A_i = A$,
for all $1 \leq i \leq k$. How many closed k-ary operations are there on A?
d) A closed k-ary operation for A is called commutative if
$f(a_1, a_2, \ldots, a_k) = f(\pi(a_1), \pi(a_2), \ldots, \pi(a_k))$,
where $a_1, a_2, \ldots, a_k \in A$ (repetitions allowed), and
$\pi(a_1), \pi(a_2), \ldots, \pi(a_k)$ is any rearrangement of $a_1, a_2, \ldots, a_k$. How many of the closed
k-ary operations on A are commutative?

25. a) Let $S = \{2, 16, 128, 1024, 8192, 65536\}$. If four numbers are selected from S, prove that
two of them must have the product 131072.
b) Generalize the result in part (a).

26. If $\mathcal{U}$ is a universe and $A \subseteq \mathcal{U}$, we define the characteristic function of A by
$\chi_A: \mathcal{U} \to \{0, 1\}$, where
$\chi_A(x) = 1$ if $x \in A$, and $\chi_A(x) = 0$ if $x \notin A$.
For sets $A, B \subseteq \mathcal{U}$, prove each of the following:
a) $\chi_{A \cap B} = \chi_A \cdot \chi_B$, where $(\chi_A \cdot \chi_B)(x) = \chi_A(x) \cdot \chi_B(x)$
b) $\chi_{A \cup B} = \chi_A + \chi_B - \chi_{A \cap B}$
c) $\chi_{\overline{A}} = 1 - \chi_A$, where $(1 - \chi_A)(x) = 1(x) - \chi_A(x) = 1 - \chi_A(x)$
(For $\mathcal{U}$ finite, placing the elements of $\mathcal{U}$ in a fixed order results in a one-to-one
correspondence between subsets A of $\mathcal{U}$ and the arrays of 0's and 1's obtained as the images
of $\mathcal{U}$ under $\chi_A$. These arrays can then be used for the computer storage and manipulation of
certain subsets of $\mathcal{U}$.)

27. With $A = \{x, y, z\}$, let $f, g: A \to A$ be given by $f = \{(x, y), (y, z), (z, x)\}$,
$g = \{(x, y), (y, x), (z, z)\}$. Determine each of the following: $f \circ g$, $g \circ f$, $f^{-1}$, $g^{-1}$,
$(g \circ f)^{-1}$, $f^{-1} \circ g^{-1}$, and $g^{-1} \circ f^{-1}$.

28. a) If $f: \mathbf{R} \to \mathbf{R}$ is defined by $f(x) = 5x + 3$, find $f^{-1}(8)$.
b) If $g: \mathbf{R} \to \mathbf{R}$, where $g(x) = |x^2 + 3x + 1|$, find $g^{-1}(1)$.
c) For $h: \mathbf{R} \to \mathbf{R}$, given by
$h(x) = \left| \frac{x}{x + 2} \right|$,
find $h^{-1}(4)$.

29. If $A = \{1, 2, 3, \ldots, 10\}$, how many functions $f: A \to A$ (simultaneously) satisfy
$f^{-1}(\{1, 2, 3\}) = \emptyset$, $f^{-1}(\{4, 5\}) = \{1, 3, 7\}$, and $f^{-1}(\{8, 10\}) = \{8, 10\}$?

30. Let $f: A \to A$ be an invertible function. For $n \in \mathbf{Z}^+$ prove that $(f^n)^{-1} = (f^{-1})^n$.
[This result can be used to define $f^{-n}$ as either $(f^n)^{-1}$ or $(f^{-1})^n$.]

31. In certain programming languages, the functions pred and succ (for predecessor and
successor, respectively) are functions from $\mathbf{Z}$ to $\mathbf{Z}$ where $\text{pred}(x) = \pi(x) = x - 1$ and
$\text{succ}(x) = \sigma(x) = x + 1$.
a) Determine $(\pi \circ \sigma)(x)$, $(\sigma \circ \pi)(x)$.
b) Determine $\pi^2$, $\pi^3$, $\pi^n$ $(n \geq 2)$, $\sigma^2$, $\sigma^3$, $\sigma^n$ $(n \geq 2)$.
c) Determine $\pi^{-2}$, $\pi^{-3}$, $\pi^{-n}$ $(n \geq 2)$, $\sigma^{-2}$, $\sigma^{-3}$, $\sigma^{-n}$ $(n \geq 2)$, where, for example,
$\sigma^{-2} = \sigma^{-1} \circ \sigma^{-1} = (\sigma \circ \sigma)^{-1} = (\sigma^2)^{-1}$. (See Supplementary Exercise 30.)

32. For $n \in \mathbf{Z}^+$, define $\tau: \mathbf{Z}^+ \to \mathbf{Z}^+$ by $\tau(n)$ = the number of positive-integer divisors of n.
a) Let $n = p_1^{e_1} p_2^{e_2} p_3^{e_3} \cdots p_k^{e_k}$, where $p_1, p_2, p_3, \ldots, p_k$ are distinct primes and $e_i$ is a
positive integer for all $1 \leq i \leq k$. What is $\tau(n)$?
b) Determine the three smallest values of $n \in \mathbf{Z}^+$ for which $\tau(n) = k$, where $k = 2, 3, 4, 5, 6$.
c) For all $k \in \mathbf{Z}^+$, $k > 1$, prove that $\tau^{-1}(k)$ is infinite.
d) If $a, b \in \mathbf{Z}^+$ with $\gcd(a, b) = 1$, prove that $\tau(ab) = \tau(a)\tau(b)$.

33. a) How many subsets $A = \{a, b, c, d\} \subseteq \mathbf{Z}^+$, where $a, b, c, d > 1$, satisfy the property
$a \cdot b \cdot c \cdot d = 2 \cdot 3 \cdot 5 \cdot 7 \cdot 11 \cdot 13 \cdot 17 \cdot 19$?
b) How many subsets $A = \{a_1, a_2, \ldots, a_m\} \subseteq \mathbf{Z}^+$, where $a_i > 1$, $1 \leq i \leq m$, satisfy the
property $\prod_{i=1}^{m} a_i = \prod_{j=1}^{n} p_j$, where the $p_j$, $1 \leq j \leq n$, are distinct primes and $n \geq m$?

34. Give an example of a function $f: \mathbf{Z}^+ \to \mathbf{R}$ where $f \in O(1)$ and f is one-to-one. (Hence f
is not constant.)

35. Let $f, g: \mathbf{Z}^+ \to \mathbf{R}$ where
$f(n) = 2$ for n even, $f(n) = 1$ for n odd; and $g(n) = 3$ for n even, $g(n) = 4$ for n odd.
Prove or disprove each of the following: (a) $f \in O(g)$; and (b) $g \in O(f)$.

36. For $f, g: \mathbf{Z}^+ \to \mathbf{R}$ we define $f + g: \mathbf{Z}^+ \to \mathbf{R}$ by $(f + g)(n) = f(n) + g(n)$, for $n \in \mathbf{Z}^+$.
[Note: The plus sign in $f + g$ is for the addition of the functions f and g, while the plus
sign in $f(n) + g(n)$ is for the addition of the real numbers $f(n)$ and $g(n)$.]
a) Let $f_1, g_1: \mathbf{Z}^+ \to \mathbf{R}$ with $f \in O(f_1)$ and $g \in O(g_1)$. If $f_1(n) \geq 0$, $g_1(n) \geq 0$, for all
$n \in \mathbf{Z}^+$, prove that $(f + g) \in O(f_1 + g_1)$.
b) If the conditions $f_1(n) \geq 0$, $g_1(n) \geq 0$, for all $n \in \mathbf{Z}^+$, are not satisfied, as in part (a),
provide a counterexample to show that
$f \in O(f_1)$, $g \in O(g_1)$ $\not\Rightarrow$ $(f + g) \in O(f_1 + g_1)$.

37. Let $a, b \in \mathbf{R}^+$, with $a, b > 1$. Let $f, g: \mathbf{Z}^+ \to \mathbf{R}$ be defined by $f(n) = \log_a n$,
$g(n) = \log_b n$. Prove that $f \in O(g)$ and $g \in O(f)$. [Hence $O(\log_a n) = O(\log_b n)$.]
Languages: Finite
State Machines
In this era of computers and telecommunications, we find ourselves confronted every day
with input-output situations. For example, in purchasing a package of chewing gum from
a vending machine, we input some coins and then press a button to get our expected output,
the package of chewing gum we desire. The first coin that we input sets the machine in
motion. Although we usually don’t care about what happens inside the machine (unless
some kind of breakdown occurs and we suffer a loss), we should realize that somehow the
machine is keeping track of the coins we insert, until the correct total has been inserted.
Only then, and not before, does the vending machine output the desired package of chewing
gum. Consequently, for the vendor to make the expected profit per package of chewing gum,
the machine must internally remember, as each coin is inserted, what sum of money has
been deposited.
A computer is another example of an input-output device. Here the input is generally some
type of information, and the output is the result obtained after processing this information.
How the input is processed depends on the internal workings of the computer; it must
have the ability to remember past information as it works on the information it is currently
processing.
Using the concepts we developed earlier on sets and functions, in this chapter we shall
investigate an abstract model called a finite state machine, or sequential circuit. Such circuits
are one of two basic types of control circuits found in digital computers. (The other type is
a combinational circuit or gating network, which is examined in Chapter 15.) They are also
found in other systems such as our vending machine, as well as in the controls for elevators
and in traffic-light systems.
As the name indicates, a finite state machine has a finite number of internal states where
the machine remembers certain information when it is in a particular state. However, before
getting into this concept we need some set-theoretic material in order to talk about what
constitutes valid input for such a machine.
6.1
Language: The Set Theory of Strings
Sequences of symbols, or characters, play a key role in the processing of information by a
computer. Inasmuch as computer programs are representable in terms of finite sequences
of characters, some algebraic way is needed for handling such finite sequences, or strings.
Throughout this section we use $\Sigma$ to denote a nonempty finite set of symbols, collectively
called an alphabet. For example, we may have $\Sigma = \{0, 1\}$ or $\Sigma = \{a, b, c, d, e\}$.
In any alphabet $\Sigma$, we do not list elements that can be formed from other elements
of $\Sigma$ by juxtaposition (that is, if $a, b \in \Sigma$, then the string ab is the juxtaposition of the
symbols a and b). As a result of this convention, alphabets such as $\Sigma = \{0, 1, 2, 11, 12\}$
and $\Sigma = \{a, b, c, ba, aa\}$ are not considered. (In addition, this convention will help us later
in Definition 6.5, when we talk about the length of a string.)
Using an alphabet $\Sigma$ as the starting place, we can construct strings from the symbols of
$\Sigma$ in a systematic manner by using the following idea.
Definition 6.1 If $\Sigma$ is an alphabet and $n \in \mathbf{Z}^+$, we define the powers of $\Sigma$ recursively as follows:
1) $\Sigma^1 = \Sigma$; and
2) $\Sigma^{n+1} = \{xy \mid x \in \Sigma, y \in \Sigma^n\}$, where xy denotes the juxtaposition of x and y.
EXAMPLE 6.1 Let $\Sigma$ be an alphabet.
If $n = 2$, then $\Sigma^2 = \{xy \mid x, y \in \Sigma\}$. For instance, with $\Sigma = \{0, 1\}$ we find
$\Sigma^2 = \{00, 01, 10, 11\}$.
When $n = 3$, the elements of $\Sigma^3$ have the form uv, where $u \in \Sigma$ and $v \in \Sigma^2$. But since we
know the form of the elements in $\Sigma^2$, we may also regard the strings in $\Sigma^3$ as sequences of the
form uxy, where $u, x, y \in \Sigma$. As an example for this case, suppose that $\Sigma = \{a, b, c, d, e\}$.
Then $\Sigma^3$ would contain $5^3 = 125$ three-symbol strings, among them aaa, acb, ace, cdd,
and eda.
In general, for all $n \in \mathbf{Z}^+$ we find that $|\Sigma^n| = |\Sigma|^n$ because we are dealing with
arrangements (of size n) where we are allowed to repeat any of the $|\Sigma|$ objects.
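The recursive construction in Definition 6.1 translates almost directly into code. The sketch below is an illustration only: the name `alphabet_power` is our own, and we assume an alphabet is given as a Python set of one-character strings.

```python
def alphabet_power(sigma, n):
    """Return the set Sigma^n of Definition 6.1, for an integer n >= 1."""
    if n < 1:
        raise ValueError("Definition 6.1 covers only n >= 1")
    if n == 1:
        return set(sigma)                      # Sigma^1 = Sigma
    shorter = alphabet_power(sigma, n - 1)     # Sigma^(n-1)
    # Sigma^n = { xy | x in Sigma, y in Sigma^(n-1) }
    return {x + y for x in sigma for y in shorter}


binary = {"0", "1"}
letters = set("abcde")
print(sorted(alphabet_power(binary, 2)))
print(len(alphabet_power(letters, 3)), len(letters) ** 3)
```

The last line compares the size of the generated set with $|\Sigma|^n$ for the five-letter alphabet of Example 6.1.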
Now that we have examined $\Sigma^n$ for $n \in \mathbf{Z}^+$, we shall look into one more power of $\Sigma$.
Definition 6.2 For an alphabet $\Sigma$ we define $\Sigma^0 = \{\lambda\}$, where $\lambda$ denotes the empty string, that is,
the string consisting of no symbols taken from $\Sigma$.
The symbol $\lambda$ is never an element in our alphabet $\Sigma$, and we should not mistake it for
the blank (space) that is found in many alphabets.
However, although $\lambda \notin \Sigma$, we do have $\emptyset \subseteq \Sigma$, so we need to be cautious here. We observe
that (1) $\{\lambda\} \not\subseteq \Sigma$ since $\lambda \notin \Sigma$; and (2) $\{\lambda\} \neq \emptyset$ because $|\{\lambda\}| = 1 \neq 0 = |\emptyset|$.
In order to speak collectively about the sets $\Sigma^0, \Sigma^1, \Sigma^2, \ldots$, we introduce the following
notation for unions of such sets.
Definition 6.3 If $\Sigma$ is an alphabet, then
a) $\Sigma^+ = \bigcup_{n=1}^{\infty} \Sigma^n = \bigcup_{n \in \mathbf{Z}^+} \Sigma^n$; and b) $\Sigma^* = \bigcup_{n=0}^{\infty} \Sigma^n$.
We see that the only difference between the sets $\Sigma^*$ and $\Sigma^+$ is the presence of the element
$\lambda$ because $\lambda \in \Sigma^n$ only when $n = 0$. Also $\Sigma^* = \Sigma^+ \cup \Sigma^0$.
In addition to using the term string, we shall also refer to the elements of $\Sigma^+$ or $\Sigma^*$ as
words and sometimes as sentences. For $\Sigma = \{0, 1, 2\}$, we find such words as 0, 01, 102,
and 1112 in both $\Sigma^+$ and $\Sigma^*$.
Finally, we note that even though the sets $\Sigma^+$ and $\Sigma^*$ are infinite, the elements of these
sets are finite strings of symbols.
EXAMPLE 6.2 For $\Sigma = \{0, 1\}$ the set $\Sigma^*$ consists of all finite strings of 0's and 1's together with the
empty string. For n reasonably small, we could actually list all strings in $\Sigma^n$.
If $\Sigma = \{\beta, 0, 1, 2, \ldots, 9, +, -, \times, /, (, )\}$, where $\beta$ denotes the blank (or space), it is
harder to describe $\Sigma^*$ and, for $n > 2$, there are too many strings to list in $\Sigma^n$. Here in $\Sigma^*$
we find familiar arithmetic expressions such as (7 + 5)/(2 × (3 − 10)) as well as gibberish
such as +)((7/× + 3/(.
We are now confronted with a familiar situation. As with statements (Chapter 2), sets
(Chapter 3), and functions (Chapter 5), once again we need to be able to decide when two
objects under study (in this case, strings) are to be considered the same. We investigate
this issue next.
Definition 6.4 If $w_1, w_2 \in \Sigma^+$, then we may write $w_1 = x_1 x_2 \cdots x_m$ and $w_2 = y_1 y_2 \cdots y_n$, for
$m, n \in \mathbf{Z}^+$, and $x_1, x_2, \ldots, x_m, y_1, y_2, \ldots, y_n \in \Sigma$. We say that the strings $w_1$ and $w_2$ are
equal, and we write $w_1 = w_2$, if $m = n$, and $x_i = y_i$ for all $1 \leq i \leq m$.
It follows from this definition that two strings in $\Sigma^+$ are equal only when each is formed
from the same number of symbols from $\Sigma$ and the corresponding symbols in the two strings
match identically.
The number of symbols in a string is also needed to define another property.
Definition 6.5 Let $w = x_1 x_2 \cdots x_n \in \Sigma^+$, where $x_i \in \Sigma$ for each $1 \leq i \leq n$. We define the length of w,
which is denoted by $\|w\|$, as the value n. For the case of $\lambda$, we have $\|\lambda\| = 0$.
As a result of Definition 6.5, we find that for any alphabet $\Sigma$, if $w \in \Sigma^*$ and $\|w\| \geq 1$,
then $w \in \Sigma^+$, and conversely. Also, for all $y \in \Sigma^*$, $\|y\| = 1$ if and only if $y \in \Sigma$. Should
$\Sigma$ contain the symbol $\beta$ (for the blank), it is still the case that $\|\beta\| = 1$.
If we use a particular alphabet, say $\Sigma = \{0, 1, 2\}$, and examine the elements $x = 01$,
$y = 212$, and $z = 01212$ (in $\Sigma^*$), we find that
$\|z\| = \|01212\| = 5 = 2 + 3 = \|01\| + \|212\| = \|x\| + \|y\|$.
In order to continue our study of the properties of strings and alphabets, we need to
extend the idea of juxtaposition a little further.
Definition 6.6 Let $x, y \in \Sigma^+$ with $x = x_1 x_2 \cdots x_m$ and $y = y_1 y_2 \cdots y_n$, so that each $x_i$, for
$1 \leq i \leq m$, and each $y_j$, for $1 \leq j \leq n$, is in $\Sigma$. The concatenation of x and y, which we
write as xy, is the string $x_1 x_2 \cdots x_m y_1 y_2 \cdots y_n$.
The concatenation of x and $\lambda$ is $x\lambda = x_1 x_2 \cdots x_m \lambda = x_1 x_2 \cdots x_m = x$, and the
concatenation of $\lambda$ and x is $\lambda x = \lambda x_1 x_2 \cdots x_m = x_1 x_2 \cdots x_m = x$. Finally, the concatenation
of $\lambda$ and $\lambda$ is $\lambda\lambda = \lambda$.
Here we have defined a closed binary operation on $\Sigma^*$ (and $\Sigma^+$). This operation is
associative but not commutative (unless $|\Sigma| = 1$), and since $x\lambda = \lambda x = x$ for all $x \in \Sigma^*$,
the element $\lambda \in \Sigma^*$ is the identity for the operation of concatenation. The ideas embodied
in the last two definitions (the length of a string and the operation of concatenation) are
interrelated in the result
$\|xy\| = \|x\| + \|y\|$, for all $x, y \in \Sigma^*$,
from which we obtain the special case
$\|x\| = \|x\| + 0 = \|x\| + \|\lambda\| = \|x\lambda\| \; (= \|\lambda x\|)$.
Finally, for each $z \in \Sigma$, we have $\|z\| = \|z\lambda\| = \|\lambda z\| = 1$, whereas $\|zz\| = 2$.
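These properties of concatenation and length can be observed with ordinary Python strings. In the sketch below we assume that Python's empty string `""` plays the role of $\lambda$; the names `LAMBDA`, `concat`, and `length` are our own and serve only as an illustration.

```python
LAMBDA = ""  # assumption: Python's empty string stands in for lambda


def concat(x: str, y: str) -> str:
    """Concatenation xy of Definition 6.6, extended to allow lambda."""
    return x + y


def length(w: str) -> int:
    """The length ||w|| of Definition 6.5, with ||lambda|| = 0."""
    return len(w)


x, y, z = "01", "212", "2"
print(concat(x, y), length(concat(x, y)), length(x) + length(y))
print(concat(x, LAMBDA) == x == concat(LAMBDA, x))               # identity
print(concat(concat(x, y), z) == concat(x, concat(y, z)))        # associative
print(concat(x, y) == concat(y, x))                              # not commutative here
```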
The closed binary operation of concatenation now leads us to another recursive definition.
Earlier we looked at powers of an alphabet &. Now we examine powers of strings.
Definition 6.7 For each $x \in \Sigma^*$, we define the powers of x by $x^0 = \lambda$, $x^1 = x$, $x^2 = xx$, $x^3 = xx^2$, ...,
$x^{n+1} = xx^n$, ..., where $n \in \mathbf{N}$.
This definition is another illustration of how a mathematical entity is given in a recursive
manner: The mathematical entity we presently seek is derived from previously derived
entities. The definition provides a way for us to deal with the n-fold concatenation [an
(n + 1)st power] of a string as the concatenation of the string with its (n − 1)-fold
concatenation (an nth power). In so doing, the definition includes the special case where the string
is just one symbol.
EXAMPLE 6.3 If $\Sigma = \{0, 1\}$ and $x = 01$, then $x^0 = \lambda$, $x^1 = 01$, $x^2 = 0101$, and $x^3 = 010101$. For all
$n > 0$, $x^n$ consists of a string of n 0's and n 1's where the first symbol is 0 and the symbols
alternate. Here $\|x^2\| = 4 = 2\|x\|$, $\|x^3\| = 6 = 3\|x\|$, and, for all $n \in \mathbf{N}$, $\|x^n\| = n\|x\|$.
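Definition 6.7 can be mirrored by a short recursive function. This is a sketch under the assumption that strings are Python `str` values and that `""` represents $\lambda$; the name `power` is our own.

```python
def power(x: str, n: int) -> str:
    """Return x^n as in Definition 6.7: x^0 = lambda, x^(n+1) = x x^n."""
    if n < 0:
        raise ValueError("powers are defined only for n in N")
    if n == 0:
        return ""                  # x^0 = lambda
    return x + power(x, n - 1)     # x^n = x x^(n-1)


x = "01"
for n in range(4):
    # The last two columns illustrate ||x^n|| = n ||x||.
    print(n, repr(power(x, n)), len(power(x, n)), n * len(x))
```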
We are just about ready to tackle the major theme of this section, the concept of a
language. Before we do so, however, we need to inquire about three other ideas. These
ideas involve special subsections of strings.
Definition 6.8 If $x, y \in \Sigma^*$ and $w = xy$, then the string x is called a prefix of w, and if $y \neq \lambda$, then x is
said to be a proper prefix. Similarly, the string y is called a suffix of w; it is a proper suffix
when $x \neq \lambda$.
EXAMPLE 6.4 Let $\Sigma = \{a, b, c\}$, and consider the string $w = abbcc$. Then each of the strings $\lambda$, a, ab,
abb, abbc, and abbcc is a prefix of w, and except for abbcc itself, each is a proper prefix.
On the other hand, each of the strings $\lambda$, c, cc, bcc, bbcc, and abbcc is a suffix of w, where
the first five strings are proper suffixes.
In general, for an alphabet $\Sigma$, if $n \in \mathbf{Z}^+$ and $x_i \in \Sigma$, for all $1 \leq i \leq n$, then each of
$\lambda, x_1, x_1 x_2, x_1 x_2 x_3, \ldots$, and $x_1 x_2 x_3 \cdots x_n$ is a prefix of the string $x = x_1 x_2 x_3 \cdots x_n$. And
$\lambda, x_n, x_{n-1} x_n, x_{n-2} x_{n-1} x_n, \ldots$, and $x_1 x_2 x_3 \cdots x_n$ are all suffixes of x. So x has $n + 1$
prefixes, n of which are proper, and the situation is the same for suffixes.
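The count of $n + 1$ prefixes and suffixes is easy to confirm by listing them with string slices. The helper names below are our own, and `""` again stands for $\lambda$; this is a sketch, not part of the formal development.

```python
def prefixes(w: str) -> list:
    """All prefixes of w, from lambda up to w itself."""
    return [w[:i] for i in range(len(w) + 1)]


def suffixes(w: str) -> list:
    """All suffixes of w, from lambda up to w itself."""
    return [w[i:] for i in range(len(w), -1, -1)]


def proper_prefixes(w: str) -> list:
    """Prefixes x of w = xy with y != lambda, that is, every prefix except w."""
    return prefixes(w)[:-1]


def proper_suffixes(w: str) -> list:
    """Suffixes y of w = xy with x != lambda, that is, every suffix except w."""
    return suffixes(w)[:-1]


w = "abbcc"
print(prefixes(w))
print(suffixes(w))
print(len(prefixes(w)), len(proper_prefixes(w)))
```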
EXAMPLE 6.5 If $\|x\| = 5$, $\|y\| = 4$, and $w = xy$, then w has x as a proper prefix and y as a proper suffix.
In all, w has nine proper prefixes and nine proper suffixes because $\lambda$ is both a proper prefix
and a proper suffix for every string in $\Sigma^+$. Here xy is both a prefix and a suffix, but in
neither case is it proper.
EXAMPLE 6.6 For a given alphabet $\Sigma$, let $w, a, b, c, d \in \Sigma^*$. If $w = ab = cd$, then
1) a is a prefix of c, or c is a prefix of a; and
2) b is a suffix of d, or d is a suffix of b.
Definition 6.9 If $x, y, z \in \Sigma^*$ and $w = xyz$, then y is called a substring of w. When at least one of x and
z is different from $\lambda$ (so that y is different from w), we call y a proper substring.
EXAMPLE 6.7 For $\Sigma = \{0, 1\}$, let $w = 00101110 \in \Sigma^*$. We find the following substrings in w:
1) 1011: This arises in only one way, when $w = xyz$, with $x = 00$, $y = 1011$, and $z = 10$.
2) 10: This example comes about in two ways:
a) $w = xyz$ where $x = 00$, $y = 10$, and $z = 1110$; and
b) $w = xyz$ for $x = 001011$, $y = 10$, and $z = \lambda$.
In case (b) the substring is also a (proper) suffix of w.
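The decompositions $w = xyz$ in Example 6.7 can be found by scanning w for every position at which y occurs. The function name `decompositions` is hypothetical, and `""` represents $\lambda$; treat this as a sketch.

```python
def decompositions(w: str, y: str) -> list:
    """Return every triple (x, y, z) with w = xyz, one per occurrence of y in w."""
    k = len(y)
    return [(w[:i], y, w[i + k:])
            for i in range(len(w) - k + 1)
            if w[i:i + k] == y]


def is_proper_substring(w: str, y: str) -> bool:
    """True when y is a substring of w and y differs from w (Definition 6.9)."""
    return y != w and bool(decompositions(w, y))


w = "00101110"
print(decompositions(w, "1011"))
print(decompositions(w, "10"))
print(is_proper_substring(w, "10"), is_proper_substring(w, w))
```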
Now that we are familiar with the necessary definitions, it is time to think about the
concept of language. When we consider the standard alphabet, including the blank, many
strings such as qxio, the wxxy red atzl, and aeyt! do not represent words or parts of sentences
that appear in the English language, even though they are elements of $\Sigma^*$. Consequently, in
order to consider only those words and expressions that make sense in the English language,
we concentrate on a subset of $\Sigma^*$. This leads us to the following generalization.
Definition 6.10 For a given alphabet $\Sigma$, any subset of $\Sigma^*$ is called a language over $\Sigma$. This includes the
subset $\emptyset$, which we call the empty language.
EXAMPLE 6.8 With $\Sigma = \{0, 1\}$, the sets $A = \{0, 01, 001\}$ and $B = \{0, 01, 001, 0001, \ldots\}$ are examples
of languages over $\Sigma$.
EXAMPLE 6.9 With $\Sigma$ the alphabet of 26 letters, 10 digits, and the special symbols used in a given
implementation of C++, the collection of executable programs for that implementation constitutes
a language. In the same situation, each executable program could be considered a language,
as could a particular set of such programs.
Since languages are sets, we can form the union, intersection, and symmetric difference
of two languages. However, for the work here, an extension of the closed binary operation
defined (in Definition 6.6) for strings is more useful.
Definition 6.11 For an alphabet $\Sigma$ and languages $A, B \subseteq \Sigma^*$, the concatenation of A and B, denoted AB,
is $\{ab \mid a \in A, b \in B\}$.
We might compare concatenation with the cross product. We shall see that just as
$A \times B \neq B \times A$ in general, we also have $AB \neq BA$ in general. For A, B finite we did
have $|A \times B| = |B \times A|$, but here $|AB| \neq |BA|$ is possible for finite languages.
EXAMPLE 6.10 Let $\Sigma = \{x, y, z\}$, and let A, B be the finite languages $A = \{x, xy, z\}$, $B = \{\lambda, y\}$. Then
$AB = \{x, xy, z, xyy, zy\}$ and $BA = \{x, xy, z, yx, yxy, yz\}$, so
1) $|AB| = 5 \neq 6 = |BA|$; and
2) $|AB| = 5 \neq 6 = 3 \cdot 2 = |A||B|$.
The differences arise because there are two ways to represent xy: (1) xy for $x \in A$, $y \in B$
and (2) $xy\lambda$ where $xy \in A$ and $\lambda \in B$. [The concept of uniqueness of representation is
something we cannot take for granted. Although it does not hold here, it is a key to the
success of many mathematical ideas. We saw this, for example, in the Fundamental Theorem
of Arithmetic (Theorem 4.11).]
The preceding example suggests that for finite languages A and B, $|AB| \leq |A||B|$. This
can be shown to be true in general.
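For finite languages, the concatenation of Definition 6.11 is a one-line set comprehension. The sketch below recomputes Example 6.10; the name `concat_languages` is our own, languages are assumed to be Python sets of strings, and `""` stands for $\lambda$.

```python
def concat_languages(A: set, B: set) -> set:
    """The language AB = { ab | a in A, b in B } for finite languages A, B."""
    return {a + b for a in A for b in B}


A = {"x", "xy", "z"}
B = {"", "y"}          # "" plays the role of lambda
AB = concat_languages(A, B)
BA = concat_languages(B, A)
print(sorted(AB), len(AB))
print(sorted(BA), len(BA))
print(len(AB) <= len(A) * len(B))
```

Because a set stores each string once, the two representations of xy collapse to a single element, which is why $|AB|$ falls below $|A||B|$ here.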
The following theorem deals with some of the properties satisfied by the concatenation
of languages.
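THEOREM 6.1 For an alphabet $\Sigma$, let $A, B, C \subseteq \Sigma^*$. Then
a) $A\{\lambda\} = \{\lambda\}A = A$          b) $(AB)C = A(BC)$
c) $A(B \cup C) = AB \cup AC$          d) $(B \cup C)A = BA \cup CA$
e) $A(B \cap C) \subseteq AB \cap AC$          f) $(B \cap C)A \subseteq BA \cap CA$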
Proof: We prove parts (d) and (f) and leave the other parts for the reader.
(d) Since we are trying to prove that two sets are equal, once again we use the idea of
set equality that we first found in Definition 3.2. Starting with x in $\Sigma^*$ we find that
$x \in (B \cup C)A \Rightarrow x = yz$ for $y \in B \cup C$ and $z \in A \Rightarrow$ ($x = yz$ for $y \in B$, $z \in A$) or ($x = yz$
for $y \in C$, $z \in A$) $\Rightarrow x \in BA$ or $x \in CA \Rightarrow x \in BA \cup CA$, so $(B \cup C)A \subseteq BA \cup CA$.
Conversely, it follows that $x \in BA \cup CA \Rightarrow x \in BA$ or $x \in CA \Rightarrow$ ($x = ba_1$ where $b \in B$
and $a_1 \in A$) or ($x = ca_2$ where $c \in C$ and $a_2 \in A$). Assume $x = ba_1$ for $b \in B$, $a_1 \in A$.
Since $B \subseteq B \cup C$, we have $x = ba_1$, where $b \in B \cup C$ and $a_1 \in A$. Then $x \in (B \cup C)A$, so
$BA \cup CA \subseteq (B \cup C)A$. (The argument is similar if $x = ca_2$.) With both inclusions
established, it follows that $(B \cup C)A = BA \cup CA$.
(f) For $x \in \Sigma^*$, we see that $x \in (B \cap C)A \Rightarrow x = yz$ where $y \in B \cap C$ and $z \in A \Rightarrow$
($x = yz$ for $y \in B$ and $z \in A$) and ($x = yz$ for $y \in C$ and $z \in A$) $\Rightarrow x \in BA$ and $x \in CA \Rightarrow$
$x \in BA \cap CA$, so $(B \cap C)A \subseteq BA \cap CA$.
With $\Sigma = \{x, y, z\}$, let $B = \{x, xx, y\}$, $C = \{y, xy\}$, and $A = \{y, yy\}$. Then $xyy \in
BA \cap CA$, but $xyy \notin (B \cap C)A$. Consequently, $(B \cap C)A \subset BA \cap CA$ for these
particular languages.
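The counterexample at the end of the proof, along with part (d), can be checked directly for these three finite languages. This is only a spot check on one instance, not a proof; the helper name `cat` is our own.

```python
def cat(A: set, B: set) -> set:
    """Concatenation of two finite languages (Definition 6.11)."""
    return {a + b for a in A for b in B}


B = {"x", "xx", "y"}
C = {"y", "xy"}
A = {"y", "yy"}

left = cat(B & C, A)                 # (B ∩ C)A
right = cat(B, A) & cat(C, A)        # BA ∩ CA
print(sorted(left))
print(sorted(right))
print(left < right)                  # proper containment for these languages

# Part (d): concatenation on the right distributes over union.
print(cat(B | C, A) == cat(B, A) | cat(C, A))
```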
Comparable to the concepts of $\Sigma^n$, $\Sigma^*$, $\Sigma^+$, the following definitions are given for an
arbitrary language $A \subseteq \Sigma^*$.
Definition 6.12 For a given language $A \subseteq \Sigma^*$ we can construct other languages as follows:
a) $A^0 = \{\lambda\}$, $A^1 = A$, and for all $n \in \mathbf{Z}^+$, $A^{n+1} = \{ab \mid a \in A, b \in A^n\}$.
b) $A^+ = \bigcup_{n \in \mathbf{Z}^+} A^n$, the positive closure of A.
c) $A^* = A^+ \cup \{\lambda\}$. The language $A^*$ is called the Kleene closure of A, in honor of the
American logician Stephen Cole Kleene (1909-1994).
EXAMPLE 6.11 If $\Sigma = \{x, y, z\}$ and $A = \{x\}$, then (1) $A^0 = \{\lambda\}$; (2) $A^n = \{x^n\}$, for each $n \in \mathbf{N}$;
(3) $A^+ = \{x^n \mid n \geq 1\}$; and (4) $A^* = \{x^n \mid n \geq 0\}$.
EXAMPLE 6.12 Let $\Sigma = \{x, y\}$.
a) If $A = \{xx, xy, yx, yy\} = \Sigma^2$, then $A^*$ is the language of all strings w in $\Sigma^*$ where
the length of w is even.
b) With A as in part (a) and $B = \{x, y\}$, the language $BA^*$ contains all the strings in $\Sigma^*$
of odd length. In this case we also find that $BA^* = A^*B$ and that $\Sigma^* = A^* \cup BA^*$.
c) The language $\{x\}\{x, y\}^*$ (the concatenation of the languages $\{x\}$ and $\{x, y\}^*$) contains
every string in $\Sigma^*$ for which x is a prefix. The language $\{x\}\{x, y\}^+$ (the concatenation
of the languages $\{x\}$ and $\{x, y\}^+$) contains every string in $\Sigma^*$ for which x is a proper
prefix.
The language containing all strings in $\Sigma^*$ for which yy is a suffix can be defined
by $\{x, y\}^*\{yy\}$.
Every string in the language $\{x, y\}^*\{xxy\}\{x, y\}^*$ has xxy as a substring.
d) Each string in the language $\{x\}^*\{y\}^*$ consists of a finite number (possibly zero) of
x's followed by a finite number (also possibly zero) of y's. And although $\{x\}^*\{y\}^* \subseteq
\{x, y\}^*$, the string $w = xyx$ is in $\{x, y\}^*$ but not in $\{x\}^*\{y\}^*$. Hence $\{x\}^*\{y\}^* \subset \{x, y\}^*$.
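Since $A^*$ is infinite whenever A contains a nonempty string, a program can only inspect the part of a closure up to some length bound. The sketch below does this for parts (a), (b), and (d) of Example 6.12. The bound `N = 6` and the helper names `star_up_to`, `cat_up_to`, and `all_strings` are our own choices, and agreement up to length 6 illustrates the claims rather than proving them.

```python
from itertools import product


def star_up_to(A: set, max_len: int) -> set:
    """Strings of the Kleene closure A* having length <= max_len (A finite)."""
    result = {""}                      # lambda is always in A*
    frontier = {""}
    while frontier:
        # Extend the newest strings by one more factor from A.
        frontier = {w + a for w in frontier for a in A
                    if len(w + a) <= max_len} - result
        result |= frontier
    return result


def cat_up_to(A: set, B: set, max_len: int) -> set:
    """Strings of the concatenation AB having length <= max_len."""
    return {a + b for a in A for b in B if len(a + b) <= max_len}


def all_strings(sigma: set, max_len: int) -> set:
    """Every string over sigma of length 0, 1, ..., max_len."""
    return {"".join(t) for n in range(max_len + 1)
            for t in product(sorted(sigma), repeat=n)}


N = 6
universe = all_strings({"x", "y"}, N)
A = {"xx", "xy", "yx", "yy"}
B = {"x", "y"}
A_star = star_up_to(A, N)
BA_star = cat_up_to(B, A_star, N)

print(A_star == {w for w in universe if len(w) % 2 == 0})    # part (a)
print(BA_star == {w for w in universe if len(w) % 2 == 1})   # part (b)
print((A_star | BA_star) == universe)
xs_then_ys = cat_up_to(star_up_to({"x"}, N), star_up_to({"y"}, N), N)
print("xyx" in universe, "xyx" in xs_then_ys)                # part (d)
```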
EXAMPLE 6.13 In the algebra of real numbers, if $a, b \in \mathbf{R}$ and $a, b \geq 0$, then $a^2 = b^2 \Rightarrow a = b$. However,
in the case of languages, if $\Sigma = \{x, y\}$, $A = \{\lambda, x, x^3, x^4, \ldots\} = \{x^n \mid n \geq 0\} - \{x^2\}$ and
$B = \{x^n \mid n \geq 0\}$, then $A^2 = B^2 \; (= B)$, but $A \neq B$. (Note: We never have $\lambda \in \Sigma$, but it is
possible to have $\lambda \in A \subseteq \Sigma^*$.)
We continue this section with a lemma and a second theorem that deal with the properties
of languages.
LEMMA 6.1 Let $\Sigma$ be an alphabet, with languages $A, B \subseteq \Sigma^*$. If $A \subseteq B$, then for all $n \in \mathbf{Z}^+$, $A^n \subseteq B^n$.
Proof: Since $A^1 = A \subseteq B = B^1$, it follows that the result is true in the case for $n = 1$.
Assuming the truth for $n = k$, we have $A \subseteq B \Rightarrow A^k \subseteq B^k$. Now consider a string x from
$A^{k+1}$. From part (a) of Definition 6.12 we know that $x = x_1 x_k$, where $x_1 \in A$, $x_k \in A^k$. If $A \subseteq
B$ then $A^k \subseteq B^k$ (by the induction hypothesis), and we have $x_1 \in B$, $x_k \in B^k$. Consequently,
$x = x_1 x_k \in BB^k = B^{k+1}$ and $A^{k+1} \subseteq B^{k+1}$. By the Principle of Mathematical Induction, it
now follows that if $A \subseteq B$, then for all $n \in \mathbf{Z}^+$, $A^n \subseteq B^n$.
Note: Lemma 6.1 does not establish that $A^+ \subseteq B^+$ or that $A^* \subseteq B^*$. These results are part of our next theorem.
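THEOREM 6.2 For an alphabet $\Sigma$ and languages $A, B \subseteq \Sigma^*$,
a) $A \subseteq AB^*$
b) $A \subseteq B^*A$
c) $A \subseteq B \Rightarrow A^+ \subseteq B^+$
d) $A \subseteq B \Rightarrow A^* \subseteq B^*$
e) $AA^* = A^*A = A^+$
f) $A^*A^* = A^* = (A^*)^* = (A^*)^+ = (A^+)^*$
g) $(A \cup B)^* = (A^* \cup B^*)^* = (A^*B^*)^*$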
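Proof: We provide the proofs for parts (c) and (g).
(c) Let $A \subseteq B$ and $x \in A^+$. Then $x \in A^+ \Rightarrow x \in A^n$, for some $n \in \mathbf{Z}^+$. From Lemma 6.1 it then follows that $x \in B^n \subseteq B^+$, and we have shown that $A^+ \subseteq B^+$.
(g) [$(A \cup B)^* = (A^* \cup B^*)^*$]. We know that $A \subseteq A^*$, $B \subseteq B^* \Rightarrow (A \cup B) \subseteq (A^* \cup B^*) \Rightarrow (A \cup B)^* \subseteq (A^* \cup B^*)^*$ [by part (d)]. Conversely, we also see that $A, B \subseteq A \cup B \Rightarrow A^*, B^* \subseteq (A \cup B)^*$ [by part (d)] $\Rightarrow (A^* \cup B^*) \subseteq (A \cup B)^* \Rightarrow (A^* \cup B^*)^* \subseteq (A \cup B)^*$ [by parts (d) and (f)]. From both inclusions it follows that $(A \cup B)^* = (A^* \cup B^*)^*$.
[$(A^* \cup B^*)^* = (A^*B^*)^*$]. First we find that $A^*, B^* \subseteq A^*B^*$ [by parts (a) and (b)] $\Rightarrow (A^* \cup B^*) \subseteq A^*B^* \Rightarrow (A^* \cup B^*)^* \subseteq (A^*B^*)^*$ [by part (d)]. Conversely, if $xy \in A^*B^*$ where $x \in A^*$ and $y \in B^*$, then $x, y \in A^* \cup B^*$, so $xy \in (A^* \cup B^*)^*$, and $A^*B^* \subseteq (A^* \cup B^*)^*$. Using parts (d) and (f) again, $(A^*B^*)^* \subseteq (A^* \cup B^*)^*$, and so the result follows.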
As we close this first section we further examine the idea of a recursively defined set
(given in Section 4.2), as demonstrated in the following three examples.
EXAMPLE 6.14 For the alphabet $\Sigma = \{0, 1\}$ consider the language $A \subseteq \Sigma^*$ where each word in $A$ contains exactly one occurrence of the symbol 0. Then $A$ is an infinite set, and among the words in $A$ one finds 0, 01, 10, 01111, 11110111, and 11111111110. There are also infinitely many words in $\Sigma^*$ that are not in $A$, such as 1, 11, 00, 000, 010, and 011111111110. We can define this language $A$ recursively as follows:
1) Our base step tells us that $0 \in A$; and
2) For the recursive process we want to include in $A$ the words $1x$ and $x1$, for each word $x \in A$.
Using this definition, the following discussion shows us that the word 1011 is in $A$.
From part (1) of our definition, we know that $0 \in A$. Then by applying part (2) of our definition three times we find:
i) $01 \in A$, because $0 \in A$;
ii) $011 \in A$, because $01 \in A$; and
iii) Since $011 \in A$, we have $1011 \in A$.
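The recursive definition of Example 6.14 can be run mechanically. The sketch below is only an illustration: `words_up_to` applies rules (1) and (2) until no new word of bounded length appears, and `derivation` (a hypothetical helper) lists one chain of rule applications for a given word, appending 1's on the right before prepending 1's on the left, as in the discussion of 1011.

```python
from itertools import product

def words_up_to(max_len):
    """Words obtained from the base word 0 by the rules x -> 1x and x -> x1."""
    A = {"0"}                 # base step: 0 is in A
    frontier = {"0"}
    while frontier:
        new = set()
        for x in frontier:
            for w in ("1" + x, x + "1"):      # recursive step
                if len(w) <= max_len and w not in A:
                    new.add(w)
        A |= new
        frontier = new
    return A

def derivation(word):
    """One derivation of `word` from 0: right-hand 1's first, then left-hand 1's."""
    if set(word) - {"0", "1"} or word.count("0") != 1:
        raise ValueError("word must contain exactly one 0 and otherwise only 1's")
    left = word.index("0")              # number of 1's before the 0
    right = len(word) - left - 1        # number of 1's after the 0
    steps = ["0"]
    for _ in range(right):
        steps.append(steps[-1] + "1")
    for _ in range(left):
        steps.append("1" + steps[-1])
    return steps

# Direct description of A: binary words with exactly one 0.
direct = {"".join(p) for n in range(1, 7) for p in product("01", repeat=n)
          if p.count("0") == 1}
print(words_up_to(6) == direct)
print(derivation("1011"))
```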
EXAMPLE 6.15 For $\Sigma = \{(, )\}$, the alphabet containing the left and right parentheses, we want to consider the language $A \subseteq \Sigma^*$ consisting of those nonempty strings of parentheses that are grammatically correct for algebraic expressions. Hence we find, for example, the three strings (()), ((()())), and ()()() in this language, but we do not find strings such as (()(), ())((), or )(())). We see that if a string $x (\ne \lambda)$ is to be in $A$, then
i) we must have the same number of left parentheses in $x$ as there are right parentheses; and
ii) the number of left parentheses must (always) be greater than or equal to the number of right parentheses, as we examine each of the parentheses in $x$, reading them consecutively from left to right.
The language $A$ may be given recursively as follows:
1) () is in $A$; and
2) For all $x, y \in A$ we have (i) $xy \in A$ and (ii) $(x) \in A$.
[As we mentioned prior to Example 4.22, we also have an implicit restriction here: no string of parentheses is in $A$ unless it can be derived through steps (1) and (2) above.]
Using this recursive definition, the following shows us how to establish that the string (()()) in $\Sigma^*$ is in the language $A$.
| Steps | Reasons |
| 1) () is in $A$. | Part (1) of the recursive definition |
| 2) ()() is in $A$. | Step (1) and part (2i) of the definition |
| 3) (()()) is in $A$. | Step (2) and part (2ii) of the definition |
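Conditions (i) and (ii) translate directly into a single left-to-right scan, and the recursive definition can be run as a closure computation. The following sketch (function names are illustrative) compares the two descriptions of $A$ for all strings of length at most 8.

```python
from itertools import product

def is_well_formed(s):
    """Check conditions (i) and (ii) of Example 6.15 in one left-to-right pass."""
    if not s:
        return False                  # A contains only nonempty strings
    depth = 0                         # (number of lefts) - (number of rights) so far
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        else:
            return False              # not a string over the alphabet {(, )}
        if depth < 0:                 # condition (ii) fails at this prefix
            return False
    return depth == 0                 # condition (i)

def generate(max_len):
    """Close {"()"} under the rules xy and (x), keeping strings of length <= max_len."""
    A = {"()"}
    changed = True
    while changed:
        new = {"(" + x + ")" for x in A if len(x) + 2 <= max_len}
        new |= {x + y for x in A for y in A if len(x) + len(y) <= max_len}
        changed = not new <= A
        A |= new
    return A

brute = {"".join(p) for n in range(1, 9) for p in product("()", repeat=n)
         if is_well_formed("".join(p))}
print(generate(8) == brute, len(brute))
print(is_well_formed("(()())"), is_well_formed("())(()"))
```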
EXAMPLE 6.16 Given an alphabet $\Sigma$, consider the string $x = x_1x_2x_3 \cdots x_{n-1}x_n$ in $\Sigma^*$ where $x_i \in \Sigma$ for each $1 \le i \le n$ and $n \in \mathbf{Z}^+$. The reversal of $x$, denoted $x^R$, is the string obtained from $x$ by reading the symbols (in $x$) from right to left; that is, $x^R = x_nx_{n-1} \cdots x_3x_2x_1$. For example, if $\Sigma = \{0, 1\}$ and $x = 01101$, then $x^R = 10110$ and for $w = 101101$ we find $w^R = 101101 = w$. In general, we can define the reversal of a string (from $\Sigma^*$) recursively as follows:
1) $\lambda^R = \lambda$; and
2) For each $n \in \mathbf{N}$, if $x \in \Sigma^{n+1}$, then we can write $x = zy$ where $z \in \Sigma$ and $y \in \Sigma^n$, and here we define $x^R = (zy)^R = (y^R)z$.
Using this recursive definition we shall now prove that if $\Sigma$ is an alphabet and $x_1, x_2 \in \Sigma^*$, then $(x_1x_2)^R = x_2^R x_1^R$.
Proof: Here the proof is by mathematical induction on the value of $\|x_1\|$. If $\|x_1\| = 0$, then $x_1 = \lambda$ and $(x_1x_2)^R = (\lambda x_2)^R = x_2^R = x_2^R\lambda = x_2^R\lambda^R = x_2^R x_1^R$ because $\lambda^R = \lambda$ from part (1) of the recursive definition. Consequently, the result is true in this first case and this establishes the basis step. For the inductive step we shall assume the result is true for all $y, x_2 \in \Sigma^*$ where $\|y\| = k$ for some $k \in \mathbf{N}$. Now consider what happens for $x_1, x_2 \in \Sigma^*$ where $x_1 = zy_1$, with $\|z\| = 1$ and $\|y_1\| = k$. Here we find that $(x_1x_2)^R = (zy_1x_2)^R = (y_1x_2)^R z$ [from part (2) of the recursive definition] $= x_2^R y_1^R z$ (from the induction hypothesis) $= x_2^R (zy_1)^R$ [again by part (2) of the recursive definition] $= x_2^R x_1^R$. Therefore the result is true for all $x_1, x_2 \in \Sigma^*$ by the Principle of Mathematical Induction.
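The recursive definition of the reversal carries over almost verbatim to code. This short sketch, with the empty Python string standing in for $\lambda$, implements parts (1) and (2) and then spot-checks the identity $(x_1x_2)^R = x_2^R x_1^R$ on all binary strings of length at most 4; the exhaustive check is an illustration, not a substitute for the induction proof.

```python
from itertools import product

def reverse(x):
    """Reversal x^R, following the two-part recursive definition of Example 6.16."""
    if x == "":                 # part (1): lambda^R = lambda
        return ""
    z, y = x[0], x[1:]          # part (2): x = zy with z a single symbol
    return reverse(y) + z       # x^R = (y^R) z

strings = ["".join(p) for n in range(5) for p in product("01", repeat=n)]
identity_holds = all(reverse(a + b) == reverse(b) + reverse(a)
                     for a in strings for b in strings)

print(reverse("01101"), reverse("101101"), identity_holds)
```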
EXERCISES 6.1
1. Let $\Sigma = \{a, b, c, d, e\}$. (a) What is $|\Sigma^2|$? $|\Sigma^3|$? (b) How many strings in $\Sigma^*$ have length at most 5?
2. For $\Sigma = \{w, x, y, z\}$ determine the number of strings in $\Sigma^*$ of length 5 (a) that start with $w$; (b) with precisely two $w$'s; (c) with no $w$'s; (d) with an even number of $w$'s.
3. If $x \in \Sigma^*$ and $\|x^3\| = 36$, what is $\|x\|$?
4. Let $\Sigma = \{\beta, x, y, z\}$ where $\beta$ denotes a blank, so $x\beta \ne x$, $\beta\beta \ne \beta$, and $x\beta y \ne xy$ but $x\lambda y = xy$. Compute each of the following:
a) $\|\lambda\|$ b) $\|\lambda\lambda\|$ c) $\|\beta\|$ d) $\|\beta\beta\|$ e) $\|\beta^3\|$ f) $\|x\beta\beta y\|$ g) $\|\beta\lambda\|$ h) $\|\lambda^{10}\|$
5. Let $\Sigma = \{v, w, x, y, z\}$ and $A = \bigcup_{n=1}^{6} \Sigma^n$. How many strings in $A$ have $xy$ as a proper prefix?
6. Let $\Sigma$ be an alphabet. Let $x_i \in \Sigma$ for $1 \le i \le 100$ (where $x_i \ne x_j$ for all $1 \le i < j \le 100$). How many nonempty substrings are there for the string $s = x_1x_2 \cdots x_{100}$?
7. For the alphabet $\Sigma = \{0, 1\}$, let $A, B, C \subseteq \Sigma^*$ be the following languages:
$A = \{0, 1, 00, 11, 000, 111, 0000, 1111\}$,
$B = \{w \in \Sigma^* \mid 2 \le \|w\|\}$,
$C = \{w \in \Sigma^* \mid 2 \ge \|w\|\}$.
Determine the following subsets (languages) of $\Sigma^*$.
a) $A \cap B$ b) $A - B$ c) $A \,\Delta\, B$ d) $A \cap C$ e) $B \cup C$ f) $\overline{(A \cap C)}$
8. Let $A = \{10, 11\}$, $B = \{00, 1\}$ be languages for the alphabet $\Sigma = \{0, 1\}$. Determine each of the following: (a) $AB$; (b) $BA$; (c) $A^3$; (d) $B^2$.
9. If $A, B, C$, and $D$ are languages over $\Sigma$, prove that (a) $(A \subseteq B \wedge C \subseteq D) \Rightarrow AC \subseteq BD$; and (b) $A\emptyset = \emptyset A = \emptyset$.
10. For $\Sigma = \{x, y, z\}$, let $A, B \subseteq \Sigma^*$ be given by $A = \{xy\}$ and $B = \{\lambda, x\}$. Determine (a) $AB$; (b) $BA$; (c) $B^2$; (d) $B^+$; (e) $A^*$.
11. Given an alphabet $\Sigma$, is there a language $A \subseteq \Sigma^*$ where $A^* = A$?
12. For $\Sigma = \{0, 1\}$ determine whether the string 00010 is in each of the following languages (taken from $\Sigma^*$).
a) $\{0, 1\}^*$ b) $\{000, 101\}\{10, 11\}$ c) $\{00\}\{0\}^*\{10\}$ d) $\{000\}^*\{1\}^*\{0\}$ e) $\{00\}^*\{10\}^*$ f) $\{0\}^*\{1\}^*\{0\}^*$
13. For $\Sigma = \{0, 1\}$ describe the strings in $A^*$ for each of the following languages $A \subseteq \Sigma^*$.
a) $\{01\}$ b) $\{000\}$ c) $\{0, 010\}$ d) $\{1, 10\}$
14. For $\Sigma = \{0, 1\}$ determine all possible languages $A, B \subseteq \Sigma^*$ where $AB = \{01, 000, 0101, 0111, 01000, 010111\}$.
15. Given a nonempty language $A \subseteq \Sigma^*$, prove that if $A^2 = A$, then $\lambda \in A$.
16. For a given alphabet $\Sigma$, let $a \in \Sigma$, with $a$ fixed. Define the functions $p_a, s_a, r\colon \Sigma^* \to \Sigma^*$ and the function $d\colon \Sigma^+ \to \Sigma^*$ as follows:
i) The prefix (by $a$) function: $p_a(x) = ax$, $x \in \Sigma^*$.
ii) The suffix (by $a$) function: $s_a(x) = xa$, $x \in \Sigma^*$.
iii) The reversal function: $r(\lambda) = \lambda$; for $x \in \Sigma^+$, if $x = x_1x_2 \cdots x_{n-1}x_n$, where $x_i \in \Sigma$ for all $1 \le i \le n$, then $r(x) = x_nx_{n-1} \cdots x_2x_1 = x^R$ (as defined in Example 6.16).
iv) The front deletion function: for $x \in \Sigma^+$, if $x = x_1x_2x_3 \cdots x_n$, then $d(x) = x_2x_3 \cdots x_n$.
a) Which of these four functions is (or are) one-to-one?
b) Determine which of these four functions is (or are) onto. If a function is not onto, determine its range.
c) Are any of these four functions invertible? If so, determine their inverse functions.
d) Suppose that $\Sigma = \{a, e, i, o, u\}$. How many words $x$ in $\Sigma^4$ satisfy $r(x) = x$? How many in $\Sigma^5$? How many in $\Sigma^n$, where $n \in \mathbf{N}$?
e) For $x \in \Sigma^*$, determine $(d \circ p_a)(x)$ and $(r \circ d \circ r \circ s_a)(x)$.
f) If $\Sigma = \{a, e, i, o, u\}$ and $B = \{ae, ai, ao, oo, eio, eiouu\} \subseteq \Sigma^*$, find $r^{-1}(B)$, $p_a^{-1}(B)$, $s_a^{-1}(B)$, and $|d^{-1}(B)|$.
17. If $A (\ne \emptyset)$ is a language and $A^2 = A$, prove that $A = A^*$.
18. Provide the proofs for the remaining parts of Theorems 6.1 and 6.2.
19. Prove that for all finite languages $A, B \subseteq \Sigma^*$, $|AB| \le |A||B|$.
20. For $\Sigma = \{x, y\}$, use finite languages from $\Sigma^*$ (as in Example 6.12), together with set operations, to describe the set of strings in $\Sigma^*$ that (a) contain exactly one occurrence of $x$; (b) contain exactly two occurrences of $x$; (c) begin with $x$; (d) end in $yxy$; (e) begin with $x$ or end in $yxy$ or both; (f) begin with $x$ or end in $yxy$ but not both.
21. For $\Sigma = \{0, 1\}$, let $A \subseteq \Sigma^*$ be the language defined recursively as follows:
1) The symbols 0, 1 are both in $A$ (this is the base for our definition); and
2) For each word $x$ in $A$, the word $0x1$ is also in $A$ (this constitutes the recursive process).
a) Find four different words, two of length 3 and two of length 5, in $A$.
b) Use the given recursive definition to show that 0001111 is in $A$.
c) Explain why 00001111 is not in $A$.
22. Provide a recursive definition for each of the following languages $A \subseteq \Sigma^*$ where $\Sigma = \{0, 1\}$.
a) $x \in A$ if (and only if) the number of 0's in $x$ is even.
b) $x \in A$ if (and only if) all of the 1's in $x$ precede all of the 0's.
23. Use the recursive definition given in Example 6.15 to verify that each of the following strings is in the language $A$ of that example.
a) (C(O) b) (OO c) (0)
24. For an alphabet $\Sigma$ a string $x$ in $\Sigma^*$ is called a palindrome if $x = x^R$, that is, $x$ is equal to its reversal. If $A \subseteq \Sigma^*$ where $A = \{x \in \Sigma^* \mid x = x^R\}$, how can we define the language $A$ recursively?
25. For $\Sigma = \{0, 1\}$, let $A \subseteq \Sigma^*$, where $A = \{00, 1\}$. How many strings in $A^*$ have length 3? length 4? length 5? length 6?
26. For $\Sigma = \{0, 1\}$, let $A \subseteq \Sigma^*$, where $A = \{00, 111\}$. How many strings in $A^*$ have length 19?
27. For $\Sigma = \{0, 1\}$, let $A, B \subseteq \Sigma^*$, where $A$ is the language of all strings in $\Sigma^*$ of even length, while $B$ is the language of all strings in $\Sigma^*$ of odd length. Give a recursive definition for each of the languages $A$, $B$.
28. Let $\Sigma = \{a, b, c\}$. Determine the smallest number of words one must select from $\Sigma^4$ to guarantee that at least two of the words start and end with the same letter.
6.2
Finite State Machines: A First Encounter
We return now to the vending machine mentioned at the start of this chapter and analyze it
in the following circumstance.
At a metropolitan office, a vending machine dispenses two flavors of chewing gum (each
flavor in a package of five pieces): peppermint (P) and spearmint (S). The cost of a package
of either flavor is 20¢. The machine accepts nickels, dimes, and quarters and returns the
necessary change. One day Mary Jo decides she’d like a package of peppermint-flavored
chewing gum. She goes to the vending machine, inserts two nickels and a dime, in that order,
and presses the white button, denoted W. Out comes her package of peppermint-flavored
chewing gum. (To get a package of spearmint-flavored chewing gum one presses the black
button, denoted B.)
What Mary Jo has done, in making her purchase, can be represented as shown in Table 6.1, where $t_0$ is the initial time, when she inserts her first nickel, and $t_1, t_2, t_3, t_4$ are later moments in time, with $t_1 < t_2 < t_3 < t_4$.
Table 6.1
|        | $t_0$       | $t_1$          | $t_2$           | $t_3$            | $t_4$      |
| State  | (1) $s_0$   | (4) $s_1$ (5¢) | (7) $s_2$ (10¢) | (10) $s_3$ (20¢) | (13) $s_0$ |
| Input  | (2) 5¢      | (5) 5¢         | (8) 10¢         | (11) W           |            |
| Output | (3) Nothing | (6) Nothing    | (9) Nothing     | (12) P           |            |
The numbers (1), (2), ..., (12), (13) in this table indicate the order of events in the purchase of Mary Jo's package of peppermint chewing gum. For each input at time $t_i$, $0 \le i \le 3$, there is at that time a corresponding output and then a change in state. The new state at time $t_{i+1}$ depends on both the input and the (present) state at time $t_i$.
The machine is in a state of readiness at state $s_0$. It waits for a customer to start inserting coins that will total 20¢ or more and then press a button to get a package of chewing gum. If at any time the total of the coins inserted exceeds 20¢, the machine provides the needed change (before the customer presses the button to get the package of chewing gum).
At time $t_0$ Mary Jo provides the machine with her first input, 5¢. She receives nothing at this time, but at the later time $t_1$ the machine is in state $s_1$, where it remembers her total of 5¢ and waits for her second input (of 5¢ at time $t_1$). The machine again (at time $t_1$) provides no output, but at the next time, $t_2$, it is in state $s_2$, remembering a total of 10¢ = 5¢ (remembered at state $s_1$) + 5¢ (inserted at time $t_1$). Providing her dime (at time $t_2$) as
the next input to the machine, Mary Jo does not receive a package of chewing gum at this time because the machine doesn't "know" which flavor Mary Jo prefers, but it does "know" now ($t_3$) that she has inserted the necessary total of 20¢ = 10¢ (remembered at state $s_2$) + 10¢ (inserted at time $t_2$). At last Mary Jo presses the white button, and at time $t_3$ the machine dispenses the output (her package of peppermint chewing gum) and then returns, at time $t_4$, to the starting state $s_0$, just in time for Mary Jo's friend Rizzo to deposit a quarter, receive her nickel change, press the black button, and obtain the package of spearmint chewing gum she desires. The purchase made by Rizzo is analyzed in Table 6.2.
Table 6.2
|        | $t_0$         | $t_1$           | $t_2$     |
| State  | (1) $s_0$     | (4) $s_3$ (20¢) | (7) $s_0$ |
| Input  | (2) 25¢       | (5) B           |           |
| Output | (3) 5¢ change | (6) S           |           |
What has happened in the case of this vending machine can be abstracted to help in the
analysis of certain aspects of digital computers and telephone communication systems.
The major features of such a machine are as follows:
1) The machine can be in only one of finitely many states at a given time. These states
are called the internal states of the machine, and at a given time the total memory
available to the machine is the knowledge of which internal state it is in at that
moment.
2) The machine will accept as input only a finite number of symbols, which collectively are referred to as the input alphabet $\mathcal{I}$. In the vending machine example, the input alphabet is {nickel, dime, quarter, W, B}, each item of which is recognized by each internal state.
3) An output and a next state are determined by each combination of inputs and internal states. The finite set of all possible outputs constitutes the output alphabet $\mathcal{O}$ for the machine.
4) We assume that the sequential processings of the machine are synchronized by sepa-
rate and distinct clock pulses and that the machine operates in a deterministic manner,
where the output is completely determined by the total input provided and the starting
state of the machine.
These observations lead us to the following definition.
Definition 6.13 A finite state machine is a five-tuple $M = (S, \mathcal{I}, \mathcal{O}, \nu, \omega)$, where $S$ = the set of internal states for $M$; $\mathcal{I}$ = the input alphabet for $M$; $\mathcal{O}$ = the output alphabet for $M$; $\nu\colon S \times \mathcal{I} \to S$ is the next state function; and $\omega\colon S \times \mathcal{I} \to \mathcal{O}$ is the output function.
Using the notation of this definition, if the machine is in state $s$ at time $t_i$ and we input $x$ at this time, then the output at time $t_i$ is $\omega(s, x)$. This output is followed by a transition of the machine at time $t_{i+1}$ to the next internal state given by $\nu(s, x)$.
We assume that when a finite state machine receives its first input, we are at time $t_0 = 0$ and the machine is in a designated starting state denoted by $s_0$. Our development will
concentrate primarily on the output and state transitions that take place sequentially, with little or no reference to the sequence of clock pulses at times $t_0, t_1, t_2, \ldots$.
Since the sets $S$, $\mathcal{I}$, and $\mathcal{O}$ are finite, it is possible to represent $\nu$ and $\omega$, for a given finite state machine, by means of a table that lists $\nu(s, x)$ and $\omega(s, x)$ for all $s \in S$ and all $x \in \mathcal{I}$. Such a table is referred to as the state table or transition table for the given machine. A second representation of the machine is made by means of a state diagram.
We demonstrate the state table and state diagram in the following examples.
EXAMPLE 6.17 Consider the finite state machine $M = (S, \mathcal{I}, \mathcal{O}, \nu, \omega)$, where $S = \{s_0, s_1, s_2\}$, $\mathcal{I} = \mathcal{O} = \{0, 1\}$, and $\nu$, $\omega$ are given by the state table in Table 6.3. The first column of the table lists the (present) states for the machine. The entries in the second row are the elements of the input alphabet $\mathcal{I}$, listed once under $\nu$ and then again under $\omega$. The six numbers in the last two columns (and last three rows) are elements of the output alphabet $\mathcal{O}$.
Table 6.3
|       | $\nu$ |       | $\omega$ |   |
|       | 0     | 1     | 0        | 1 |
| $s_0$ | $s_0$ | $s_1$ | 0        | 0 |
| $s_1$ | $s_2$ | $s_1$ | 0        | 0 |
| $s_2$ | $s_0$ | $s_1$ | 0        | 1 |
To calculate $\nu(s_1, 1)$, for example, we find $s_1$ in the column of present states and proceed horizontally over from $s_1$ until we are below the entry 1 in the section of the table for $\nu$. This entry gives $\nu(s_1, 1) = s_1$. In the same way we find $\omega(s_1, 1) = 0$.
With $s_0$ designated as the starting state, if the input provided to $M$ is the string 1010, then the output is 0010, as demonstrated in Table 6.4. Here the machine is left in state $s_2$, so that if we had another input string, we would provide the first character of that string, here 0, at state $s_2$ unless the machine is reset to start once again at $s_0$.
Table 6.4
| State  | $s_0$                | $\nu(s_0, 1) = s_1$  | $\nu(s_1, 0) = s_2$  | $\nu(s_2, 1) = s_1$  | $\nu(s_1, 0) = s_2$ |
| Input  | 1                    | 0                    | 1                    | 0                    | 0                   |
| Output | $\omega(s_0, 1) = 0$ | $\omega(s_1, 0) = 0$ | $\omega(s_2, 1) = 1$ | $\omega(s_1, 0) = 0$ |                     |
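The table lookups of Example 6.17 are easy to automate. The sketch below stores $\nu$ and $\omega$ from Table 6.3 as Python dictionaries (the names `NU`, `OMEGA`, and `run` are illustrative) and feeds an input string through the machine one symbol at a time, recording each output before making the transition.

```python
# Next state function nu and output function omega, keyed by (state, input).
NU = {
    ("s0", "0"): "s0", ("s0", "1"): "s1",
    ("s1", "0"): "s2", ("s1", "1"): "s1",
    ("s2", "0"): "s0", ("s2", "1"): "s1",
}
OMEGA = {
    ("s0", "0"): "0", ("s0", "1"): "0",
    ("s1", "0"): "0", ("s1", "1"): "0",
    ("s2", "0"): "0", ("s2", "1"): "1",
}

def run(nu, omega, start, inputs):
    """Return (output string, final state) for the given input string."""
    state = start
    outputs = []
    for x in inputs:
        outputs.append(omega[(state, x)])   # output produced at the current time
        state = nu[(state, x)]              # followed by the state transition
    return "".join(outputs), state

print(run(NU, OMEGA, "s0", "1010"))       # the run shown in Table 6.4
print(run(NU, OMEGA, "s0", "00110101"))
```

The same `run` function works for any machine whose state table is entered in this dictionary form.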
Since we are primarily interested in the output, not in the sequence of transition states, the same machine can be represented by means of a state diagram. Here we can obtain the output string without actually listing the transition states. In such a diagram each internal state $s$ is represented by a circle with $s$ inside of it. For states $s_i$ and $s_j$, if $\nu(s_i, x) = s_j$ for $x \in \mathcal{I}$, and $\omega(s_i, x) = y$ for $y \in \mathcal{O}$, we represent this in the state diagram by drawing a directed edge (or arc) from the circle for $s_i$ to the circle for $s_j$ and labeling the arc with the input $x$ and output $y$ as shown in Fig. 6.1.
With these conventions, the state diagram for the machine $M$ of Table 6.3 is shown in Fig. 6.2. Although the table is more compact, the diagram enables us to follow an input string through each transition state it determines, picking up each of the corresponding
Figure 6.1
output symbols before each transition. Here if the input string is 00110101, then starting at state $s_0$, the first input of 0 yields an output of 0 and returns us to $s_0$. The next input of 0 yields the same result, but for the third input, 1, the output is 0 and we are now in state $s_1$. Continuing in this manner, we arrive at the output string 00000101 and finish in state $s_1$. (We note that the input string 00110101 is an element of $\mathcal{I}^*$, the Kleene closure of $\mathcal{I}$, and that the output string is in $\mathcal{O}^*$, the Kleene closure of $\mathcal{O}$.)
Starting at $s_0$, what is the output string for the input string 1100101101?
Figure 6.2
EXAMPLE 6.18 For the vending machine described earlier in this section, we have the state table, Table 6.5, with
1) $S = \{s_0, s_1, s_2, s_3, s_4\}$, where at state $s_k$, for each $0 \le k \le 4$, the machine remembers retaining $5k$ cents.
2) $\mathcal{I}$ = {5¢, 10¢, 25¢, B, W}, where B denotes the black button one presses for a package of spearmint-flavored chewing gum and W the white button for a package of peppermint-flavored chewing gum.
3) $\mathcal{O}$ = {n (nothing), P (peppermint chewing gum), S (spearmint chewing gum), 5¢, 10¢, 15¢, 20¢, 25¢}.
Table 6.5
|       | $\nu$ |       |       |       |       | $\omega$ |     |     |   |   |
|       | 5¢    | 10¢   | 25¢   | B     | W     | 5¢       | 10¢ | 25¢ | B | W |
| $s_0$ | $s_1$ | $s_2$ | $s_4$ | $s_0$ | $s_0$ | n        | n   | 5¢  | n | n |
| $s_1$ | $s_2$ | $s_3$ | $s_4$ | $s_1$ | $s_1$ | n        | n   | 10¢ | n | n |
| $s_2$ | $s_3$ | $s_4$ | $s_4$ | $s_2$ | $s_2$ | n        | n   | 15¢ | n | n |
| $s_3$ | $s_4$ | $s_4$ | $s_4$ | $s_3$ | $s_3$ | n        | 5¢  | 20¢ | n | n |
| $s_4$ | $s_4$ | $s_4$ | $s_4$ | $s_0$ | $s_0$ | 5¢       | 10¢ | 25¢ | S | P |
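Table 6.5 follows a simple rule: a coin raises the remembered total to at most 20¢ and any excess is returned as change, while a button dispenses gum only from $s_4$. The sketch below encodes that rule instead of the 50 table entries; the state $s_k$ is represented by the integer `k`, and the names `nu`, `omega`, and `purchase` are illustrative. This is one reading of the table, checked here against a few of its entries and against the two purchases described in this section.

```python
COINS = {"5¢": 1, "10¢": 2, "25¢": 5}   # coin values in units of 5 cents
FLAVOR = {"B": "S", "W": "P"}           # black button: spearmint, white: peppermint
FULL = 4                                # state s4 remembers 20 cents

def nu(k, x):
    """Next state (as an index k) from state s_k on input x."""
    if x in COINS:
        return min(FULL, k + COINS[x])  # never remember more than 20 cents
    return 0 if k == FULL else k        # a button resets the machine only at s4

def omega(k, x):
    """Output from state s_k on input x: change, gum, or n (nothing)."""
    if x in COINS:
        excess = k + COINS[x] - FULL
        return f"{5 * excess}¢" if excess > 0 else "n"
    return FLAVOR[x] if k == FULL else "n"

def purchase(inputs, k=0):
    """Process a list of inputs starting at s_k; return (outputs, final index)."""
    outputs = []
    for x in inputs:
        outputs.append(omega(k, x))
        k = nu(k, x)
    return outputs, k

print(purchase(["5¢", "5¢", "10¢", "W"]))   # Mary Jo's purchase
print(purchase(["25¢", "B"]))               # Rizzo's purchase
```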
As we observed in the discussion just prior to Example 6.18, for a general finite state machine $M = (S, \mathcal{I}, \mathcal{O}, \nu, \omega)$, the input can be realized as an element of $\mathcal{I}^*$, with the output
from $\mathcal{O}^*$. Consequently, it is to our advantage to extend the domains of $\nu$ and $\omega$ from $S \times \mathcal{I}$ to $S \times \mathcal{I}^*$. For $\omega$ we enlarge the codomain to $\mathcal{O}^*$, recalling, should the need arise, that both $\mathcal{I}^*$ and $\mathcal{O}^*$ contain the empty string, $\lambda$. With these extensions, if $x_1x_2 \cdots x_k \in \mathcal{I}^*$, for $k \in \mathbf{Z}^+$, then starting at any state $s_1 \in S$, we have
$\nu(s_1, x_1) = s_2$ †
$\nu(s_1, x_1x_2) = \nu(\nu(s_1, x_1), x_2) = \nu(s_2, x_2) = s_3$
$\nu(s_1, x_1x_2x_3) = \nu(\nu(\nu(s_1, x_1), x_2), x_3) = \nu(s_3, x_3) = s_4$
$\cdots$
$\nu(s_1, x_1x_2 \cdots x_k) = \nu(s_k, x_k) = s_{k+1}$, and
$\omega(s_1, x_1) = y_1$
$\omega(s_1, x_1x_2) = \omega(s_1, x_1)\,\omega(\nu(s_1, x_1), x_2) = \omega(s_1, x_1)\,\omega(s_2, x_2) = y_1y_2$
$\omega(s_1, x_1x_2x_3) = \omega(s_1, x_1)\,\omega(s_2, x_2)\,\omega(s_3, x_3) = y_1y_2y_3$
$\cdots$
$\omega(s_1, x_1x_2 \cdots x_k) = \omega(s_1, x_1)\,\omega(s_2, x_2) \cdots \omega(s_k, x_k) = y_1y_2 \cdots y_k \in \mathcal{O}^*$
Also, $\nu(s_1, \lambda) = s_1$ for all $s_1 \in S$.
(We shall use these extensions again in Chapter 7.)
We close this section with an example that is relevant in computer science.
EXAMPLE 6.19 Let $x = x_5x_4x_3x_2x_1 = 00111$ and $y = y_5y_4y_3y_2y_1 = 01101$ be binary numbers where $x_1$ and $y_1$ are the least significant bits. The leading 0's in $x$ and $y$ are there to make the strings for $x$ and $y$ of equal length and to guarantee enough places to complete the sum. A serial binary adder is a finite state machine that we can use to obtain $x + y$. The diagram in Fig. 6.3 illustrates this, where $z = z_5z_4z_3z_2z_1$ has the least significant bit $z_1$.
Figure 6.3 (the inputs $x = x_5x_4x_3x_2x_1$ and $y = y_5y_4y_3y_2y_1$ enter the serial binary adder, which produces the output $z = z_5z_4z_3z_2z_1$)
In the addition $z = x + y$, we have
    x =   0 0 1 1 1
  + y = + 0 1 1 0 1
    z =   1 0 1 0 0
Here the rightmost column is the first addition and the third column from the right is the third addition.
We note that for the first addition $x_1 = y_1 = 1$ and $z_1 = 0$, whereas for the third addition we have $x_3 = y_3 = 1$ and $z_3 = 1$ because of a carry from the addition of $x_2$ and $y_2$ (and the carry from $x_1 + y_1$). Consequently, each output depends on the sum of two inputs and the ability to remember a carry of 0 or 1, which is crucial when the carry is 1.
† In $\nu(s_1, x_1) = s_2$, the state $s_2$ is determined by $s_1$ and $x_1$. It is not simply the second in a predetermined list of states.
The serial binary adder is modeled by a finite state machine $M = (S, \mathcal{I}, \mathcal{O}, \nu, \omega)$ as follows. The set $S = \{s_0, s_1\}$, where $s_i$ indicates a carry of $i$; $\mathcal{I} = \{00, 01, 10, 11\}$, so there is a pair of inputs, depending on whether we are seeking 0 + 0, 0 + 1, 1 + 0, or 1 + 1, respectively; and $\mathcal{O} = \{0, 1\}$. The functions $\nu$, $\omega$ are given in the state table (Table 6.6) and the state diagram (Fig. 6.4).
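Table 6.6
|       | $\nu$ |       |       |       | $\omega$ |    |    |    |
|       | 00    | 01    | 10    | 11    | 00       | 01 | 10 | 11 |
| $s_0$ | $s_0$ | $s_0$ | $s_0$ | $s_1$ | 0        | 1  | 1  | 0  |
| $s_1$ | $s_0$ | $s_1$ | $s_1$ | $s_1$ | 1        | 0  | 0  | 1  |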
In Table 6.6 we find, for example, that $\nu(s_1, 01) = s_1$ and $\omega(s_1, 01) = 0$, because $s_1$ indicates a carry of 1 from the addition of the previous bits. The 01 input indicates that we are adding 0 and 1 (and carrying a 1). Hence the sum is 10 and $\omega(s_1, 01) = 0$ for the 0 in 10. The carry is again remembered in $s_1 = \nu(s_1, 01)$.
From the state diagram (Fig. 6.4) we see that the starting state must be $s_0$ because there is no carry prior to the addition of the least significant bits.
Figure 6.4
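The serial binary adder can be sketched in a few lines. In the code below the state is the remembered carry (0 for $s_0$, 1 for $s_1$), `step` plays the role of one row lookup in Table 6.6, and `serial_add` feeds the bit pairs to the machine starting from the least significant end. The function names are illustrative.

```python
def step(carry, pair):
    """One transition: return (output bit, next carry) for a two-bit input pair."""
    total = int(pair[0]) + int(pair[1]) + carry
    return str(total % 2), total // 2      # omega gives the sum bit, nu the new carry

def serial_add(x, y):
    """Add two equal-length binary strings, least significant bit first."""
    if len(x) != len(y):
        raise ValueError("pad the shorter string with leading 0's")
    carry = 0                              # start in s0: no carry before the first bits
    bits = []
    for a, b in zip(reversed(x), reversed(y)):
        out, carry = step(carry, a + b)
        bits.append(out)
    return "".join(reversed(bits))         # a final carry, if any, is discarded

print(serial_add("00111", "01101"))
print(step(1, "01"))
```

As in the example, the leading 0's matter: without them a final carry of 1 would be lost.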
The state diagrams in Figs. 6.2 and 6.4 are examples of labeled directed graphs. We
shall see more about graph theory throughout the text, for it has applications not only in
computer science and electrical engineering but also in coding theory (prefix codes) and
optimization (transport networks).
EXERCISES 6.2
1. Using the finite state machine of Example 6.17, find the output for each of the following input strings $x \in \mathcal{I}^*$, and determine the last internal state in the transition process. (Assume that we always start at $s_0$.)
a) $x = 1010101$ b) $x = 1001001$ c) $x = 101001000$
2. For the finite state machine of Example 6.17, an input string $x$, starting at state $s_0$, produces the output string 00101. Determine $x$.
3. Let $M = (S, \mathcal{I}, \mathcal{O}, \nu, \omega)$ be a finite state machine where $S = \{s_0, s_1, s_2, s_3\}$, $\mathcal{I} = \{a, b, c\}$, $\mathcal{O} = \{0, 1\}$, and $\nu$, $\omega$ are determined by Table 6.7.
Table 6.7
|       | $\nu$ |       |       | $\omega$ |     |     |
|       | $a$   | $b$   | $c$   | $a$      | $b$ | $c$ |
| $s_0$ | $s_0$ | $s_3$ | $s_2$ | 0        | 1   | 1   |
| $s_1$ | $s_1$ | $s_1$ | $s_3$ | 0        | 0   | 1   |
| $s_2$ | $s_1$ | $s_1$ | $s_3$ | 1        | 1   | 0   |
| $s_3$ | $s_2$ | $s_3$ | $s_0$ | 1        | 0   | 1   |
a) Starting at $s_0$, what is the output for the input string $abbccc$?
b) Draw the state diagram for this finite state machine.
4. Give the state table and the state diagram for the vending machine of Example 6.18 if the cost of a package of chewing gum (peppermint or spearmint) is increased to 25¢.
5. A finite state machine $M = (S, \mathcal{I}, \mathcal{O}, \nu, \omega)$ has $\mathcal{I} = \mathcal{O} = \{0, 1\}$ and is determined by the state diagram shown in Fig. 6.5.
Figure 6.5
a) Determine the output string for the input string 110111, starting at $s_0$. What is the last transition state?
b) Answer part (a) for the same string but with $s_1$ as the starting state. What about $s_2$ and $s_3$ as starting states?
c) Find the state table for this machine.
d) In which state should we start so that the input string 10010 produces the output 10000?
e) Determine an input string $x \in \mathcal{I}^*$ of minimal length, such that $\nu(s_4, x) = s_1$. Is $x$ unique?
6. Machine $M$ has $\mathcal{I} = \{0, 1\} = \mathcal{O}$ and is determined by the state diagram shown in Fig. 6.6.
Figure 6.6
a) Describe in words what this finite state machine does.
b) What must state $s_1$ remember?
c) Find two languages $A, B \subseteq \mathcal{I}^*$ such that for every $x \in AB$, $\omega(s_0, x)$ has 1 as a suffix.
7. a) If $S$, $\mathcal{I}$, and $\mathcal{O}$ are finite sets, with $|S| = 3$, $|\mathcal{I}| = 5$, and $|\mathcal{O}| = 2$, determine (i) $|S \times \mathcal{I}|$; (ii) the number of functions $\nu\colon S \times \mathcal{I} \to S$; and (iii) the number of functions $\omega\colon S \times \mathcal{I} \to \mathcal{O}$.
b) For $S$, $\mathcal{I}$, and $\mathcal{O}$ in part (a), how many finite state machines do they determine?
8. Let $M = (S, \mathcal{I}, \mathcal{O}, \nu, \omega)$ be a finite state machine with $\mathcal{I} = \mathcal{O} = \{0, 1\}$ and $S$, $\nu$, and $\omega$ determined by the state diagram shown in Fig. 6.7.
Figure 6.7
a) Find the output for the input string $x = 0110111011$.
b) Give the transition table for this finite state machine.
c) Starting in state $s_0$, if the output for an input string $x$ is 0000001, determine all possibilities for $x$.
d) Describe in words what this finite state machine does.
9. a) Find the state table for the finite state machine in Fig. 6.8, where $\mathcal{I} = \mathcal{O} = \{0, 1\}$.
Figure 6.8
b) Let $x \in \mathcal{I}^*$ with $\|x\| = 4$. If 1 is a suffix of $\omega(s_0, x)$, what are the possibilities for the string $x$?
c) Let $A \subseteq \{0, 1\}^*$ be the language where $\omega(s_0, x)$ has 1 as a suffix for all $x$ in $A$. Determine $A$.
d) Find the language $A \subseteq \{0, 1\}^*$ where $\omega(s_0, x)$ has 111 as a suffix for all $x$ in $A$.
6.3
Finite State Machines: A Second Encounter
Having seen some examples of finite state machines, we turn to the study of some additional
machines that are relevant to the design of computer hardware. One important type of
machine is the sequence recognizer.
EXAMPLE 6.20 Here, $\mathcal{I} = \mathcal{O} = \{0, 1\}$, and we want to construct a machine that recognizes each occurrence of the sequence 111 as it is encountered in an input string $x \in \mathcal{I}^*$. For example, if $x = 1110101111$, then the corresponding output should be 0010000011, where a 1 in the $i$th position of the output indicates that a 1 can be found in positions $i$, $i - 1$, and $i - 2$ of $x$. Here overlapping of sequences of 111 can occur, so some characters in the input string can be thought of as characters in more than one triple of 1's.
Letting $s_0$ denote the starting state, we realize that we must have a state to remember 1 (the possible start of 111) and a state to remember 11. In addition, any time our input symbol is 0, we go back to $s_0$ and start the search for three successive 1's over again.
In Fig. 6.9, $s_1$ remembers a single 1, and $s_2$ remembers the string 11. If $s_2$ is reached, then a third "1" indicates the occurrence of the triple in the input string, and the output 1 recognizes this occurrence. But this third "1" also means that we have the first two 1's of another possible triple coming up in the string (as happens in 11101011 "1" 1). So after recognizing the occurrence of 111 with an output of 1, we return to state $s_2$ to remember the two inputs of 1 "1".
Figure 6.9
If we are concerned with recognizing all strings that end in 111, then for each $x \in \mathcal{I}^*$, the machine will recognize such a sequence with final output 1. This machine is then a recognizer of the language $A = \{0, 1\}^*\{111\}$.
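The three-state recognizer described in Example 6.20 can be sketched as follows. The state is stored as the number of consecutive 1's remembered (0, 1, or 2, standing for $s_0$, $s_1$, $s_2$); this encoding and the function name are illustrative choices based on the prose description of Fig. 6.9.

```python
def recognize_111(x):
    """Output string of the overlapping 111 recognizer for a binary input x."""
    ones = 0                       # index of the current state: s0, s1, or s2
    out = []
    for ch in x:
        if ch == "1":
            out.append("1" if ones == 2 else "0")   # a third 1 completes a triple
            ones = min(2, ones + 1)                 # from s2 we stay in s2
        else:
            out.append("0")
            ones = 0                                # any 0 sends us back to s0
    return "".join(out)

def ends_in_111(x):
    """Membership in {0,1}*{111}: the final output symbol is 1."""
    return recognize_111(x).endswith("1")

print(recognize_111("1110101111"))
print(ends_in_111("0100111"), ends_in_111("1110"))
```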
Another finite state machine that recognizes the same triple 111 is shown in Fig. 6.10.
Figure 6.10
The finite state machines represented by the state diagrams in Figs. 6.9 and 6.10 perform the same task and are said to be equivalent. The state diagram in Fig. 6.10 has one more state than that in Fig. 6.9, but at this stage we are not overly concerned with getting a finite state machine with a minimal number of states. In Chapter 7 we shall develop a technique to take a given finite state machine $M$ and find one that is equivalent to it and has the smallest number of internal states needed.
The next example is a bit more selective.
EXAMPLE 6.21 Now we want not only to recognize the occurrence of 111 but to recognize only those occurrences that end in a position that is a multiple of three. Consequently, with $\mathcal{I} = \mathcal{O} = \{0, 1\}$, if $x \in \mathcal{I}^*$, where $x = 1110111$, then we want $\omega(s_0, x) = 0010000$, not 0010001. In addition, for $x \in \mathcal{I}^*$, where $x = 111100111$, the output $\omega(s_0, x)$ is to be 001000001, not 001100001, for here, because of length considerations, overlapping of sequences of 111 is not allowed.
Again we start at $s_0$ (Fig. 6.11), but now $s_1$ must remember a first 1 only if it occurs in $x$ in position 1, 4, 7, .... If the input at $s_0$ is 0, we cannot simply return to $s_0$ as in Example 6.20. We must remember that this 0 is the first of three symbols of no interest. Hence from $s_0$ we go to $s_3$ and then to $s_4$, processing any triple of the form $0yz$ where 0 occurs in $x$ in position $3k + 1$, $k \ge 0$. The same type of situation happens at $s_1$ if the input is 0. Finally, at $s_2$ the sequence 111 is recognized with an output of 1, if it occurs. The machine then returns to $s_0$ to input the next symbol of the input string.
Figure 6.11
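Since Fig. 6.11 is not reproduced here, the transition table in the following sketch is an assumption reconstructed from the prose of Example 6.21: $s_1$ and $s_2$ track a block that has started with 1 and 11, while $s_3$ and $s_4$ consume the rest of a block that can no longer be 111. The dictionary `NU` and the function names are illustrative.

```python
# Assumed next state function, read off the description of Example 6.21.
NU = {
    ("s0", "1"): "s1", ("s0", "0"): "s3",   # first symbol of a block of three
    ("s1", "1"): "s2", ("s1", "0"): "s4",   # second symbol
    ("s2", "1"): "s0", ("s2", "0"): "s0",   # third symbol: block finished
    ("s3", "0"): "s4", ("s3", "1"): "s4",   # skip the second symbol of a dead block
    ("s4", "0"): "s0", ("s4", "1"): "s0",   # skip the third symbol of a dead block
}

def omega(state, x):
    """The only output of 1 occurs when a third 1 arrives at s2."""
    return "1" if (state, x) == ("s2", "1") else "0"

def recognize_blocks(x, start="s0"):
    """Output string of the block-aligned 111 recognizer."""
    state, out = start, []
    for ch in x:
        out.append(omega(state, ch))
        state = NU[(state, ch)]
    return "".join(out)

print(recognize_blocks("1110111"))
print(recognize_blocks("111100111"))
```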
EXAMPLE 6.22 Figure 6.12 shows the state diagrams for finite state machines that will recognize the occurrence of the sequence 0101 in an input string $x \in \mathcal{I}^*$, where $\mathcal{I} = \mathcal{O} = \{0, 1\}$. The machine in Fig. 6.12(a) recognizes with an output of 1 each occurrence of 0101 in an input string, regardless of where it occurs. In Fig. 6.12(b) the machine recognizes with an output of 1 only those prefixes of $x$ whose length is a multiple of four and that end in 0101. (Hence no overlapping is allowed here.) Consequently, for $x = 01010100101$, $\omega(s_0, x) = 00010100001$ for (a), whereas for (b), $\omega(s_0, x) = 00010000000$.
Now that we have examined some finite state machines that serve as sequence recognizers, it is only fair to consider a set of sequences that cannot be recognized by a finite state machine. This example gives us another opportunity to apply the pigeonhole principle.
Figure 6.12 [parts (a) and (b)]
EXAMPLE 6.23 Let $\mathcal{I} = \mathcal{O} = \{0, 1\}$. Can we construct a finite state machine that recognizes precisely those strings in the language $A = \{01, 0011, 000111, \ldots\} = \{0^i1^i \mid i \in \mathbf{Z}^+\}$? If we can, then if $s_0$ denotes the starting state, we shall expect $\omega(s_0, 01) = 01$, $\omega(s_0, 0011) = 0011$, and, in general, $\omega(s_0, 0^i1^i) = 0^i1^i$, for all $i \in \mathbf{Z}^+$. [Note: Here, for example, we want $\omega(s_0, 0011) = 0011$, where the first 1 in the output is for recognition of the substring 01 and the second 1 is for recognition of the string 0011.]
Suppose that there is a finite state machine $M = (S, \mathcal{I}, \mathcal{O}, \nu, \omega)$ that can recognize precisely those strings in $A$. Let $s_0 \in S$, where $s_0$ is the starting state, and let $|S| = n \ge 1$. Now consider the string $0^{n+1}1^{n+1}$ in the language $A$. If our machine $M$ is to operate correctly, then we want $\omega(s_0, 0^{n+1}1^{n+1}) = 0^{n+1}1^{n+1}$. Therefore, we see in Table 6.8 how this finite state machine will process the $n + 1$ 0's, starting at the state $s_0$, then continuing at the $n$ states $s_1 = \nu(s_0, 0)$, $s_2 = \nu(s_1, 0)$, ..., and $s_n = \nu(s_{n-1}, 0)$. Since $|S| = n$, by applying the pigeonhole principle to the $n + 1$ states $s_0, s_1, s_2, \ldots, s_{n-1}, s_n$, we realize that there are two states $s_i$ and $s_j$ where $i < j$ but $s_i = s_j$.
Table 6.8
| State  | $s_0$ | $s_1$ | $s_2$ | ... | $s_{n-1}$ | $s_n$ | $s_{n+1}$ | ... | $s_{2n}$ | $s_{2n+1}$ |
| Input  | 0     | 0     | 0     | ... | 0         | 0     | 1         | ... | 1        | 1          |
| Output | 0     | 0     | 0     | ... | 0         | 0     | 1         | ... | 1        | 1          |
Now in Table 6.9 we see how the removal of the $j - i$ columns for states $s_{i+1}, \ldots, s_j$
results in Table 6.10. This table shows us that the finite state machine $M$ recognizes
the string $x = 0^{(n+1)-(j-i)}1^{n+1}$, where $n + 1 - (j - i) < n + 1$. Unfortunately $x \notin A$, so
$M$ recognizes a string that it is not supposed to recognize. This demonstrates that we cannot
construct a finite state machine that recognizes precisely those strings in the language
$A = \{0^i 1^i \mid i \in \mathbf{Z}^+\}$.
Table 6.9
State  | s_0 | s_1 | s_2 | ... | s_i | s_{i+1} | ... | s_j | s_{j+1} | ... | s_n | s_{n+1} | ... | s_{2n} | s_{2n+1}
Input  |  0  |  0  |  0  | ... |  0  |    0    | ... |  0  |    0    | ... |  0  |    1    | ... |   1    |    1
Output |  0  |  0  |  0  | ... |  0  |    0    | ... |  0  |    0    | ... |  0  |    1    | ... |   1    |    1
Table 6.10
State  | s_0 | s_1 | s_2 | ... | s_i | s_{j+1} | ... | s_n | s_{n+1} | ... | s_{2n} | s_{2n+1}
Input  |  0  |  0  |  0  | ... |  0  |    0    | ... |  0  |    1    | ... |   1    |    1
Output |  0  |  0  |  0  | ... |  0  |    0    | ... |  0  |    1    | ... |   1    |    1
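The pigeonhole step of this argument can be illustrated with a small sketch. The four-state transition table below is hypothetical (it is not a machine from the text), and only the transitions on input 0 are listed because only those matter here. The code finds indices $i < j$ with $s_i = s_j$ and confirms that deleting $j - i$ zeros leaves the machine in the same state after the block of zeros, which is why the remaining $n + 1$ ones are processed identically.

```python
def run_state(nu, state, x):
    """Return the state reached from `state` after reading the string x."""
    for symbol in x:
        state = nu[state][symbol]
    return state


def first_repeat(nu, s0, symbol="0"):
    """Follow `symbol` from s0 and return (i, j), i < j, with s_i == s_j."""
    seen = {}  # state -> index of its first visit
    state, index = s0, 0
    while state not in seen:
        seen[state] = index
        state = nu[state][symbol]
        index += 1
    return seen[state], index


# Hypothetical machine with n = 4 states (assumption, for illustration only).
NU = {"s0": {"0": "s1"}, "s1": {"0": "s2"}, "s2": {"0": "s3"}, "s3": {"0": "s1"}}
n = len(NU)

i, j = first_repeat(NU, "s0")
full = run_state(NU, "s0", "0" * (n + 1))
shortened = run_state(NU, "s0", "0" * (n + 1 - (j - i)))
print(i, j, full, shortened)
```

Because a repeat must occur within the first $n + 1$ states visited, `j` never exceeds `n`, so the shortened block of zeros is still nonempty.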
A class of finite state machines that is important in the design of digital devices consists
of the $k$-unit delay machines, where $k \in \mathbf{Z}^+$. For $k = 1$, we want to construct a machine $M$
such that if $x = x_1 x_2 \cdots x_{m-1} x_m$, then for starting state $s_0$, $\omega(s_0, x) = 0 x_1 x_2 \cdots x_{m-1}$, so
that the output is the input delayed one time unit (clock pulse). [The use of 0 as the first
symbol in $\omega(s_0, x)$ is conventional.]
EXAMPLE 6.24
Let $\mathcal{I} = \mathcal{O} = \{0, 1\}$. With starting state $s_0$, $\omega(s_0, x) = 0$ for $x = 0$ or $1$ because the first
output is 0; the states $s_1$ and $s_2$ (in Fig. 6.13) remember a prior input of 0 and 1, respectively.
In the figure, we label, for example, the arc from $s_1$ to $s_2$ with 1, 0 because with an input of
1 we need to go to $s_2$, where inputs of 1 at time $t_i$ are remembered so that they can become
outputs of 1 at time $t_{i+1}$. The 0 in the label 1, 0 is the output because starting in $s_1$ indicates
that the prior input was 0, which becomes the present output. The labels on the other arcs
are obtained by the same type of reasoning.
Figure 6.13
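A minimal sketch of the one-unit delay as a table-driven simulation follows. The state diagram of Fig. 6.13 is not reproduced here, so the two dictionaries are reconstructed from the description above ($s_1$ remembers a prior 0, $s_2$ a prior 1, and $s_0$ outputs 0); the names `NU`, `OMEGA`, and `run` are illustrative.

```python
# nu: next-state function; omega: output function (reconstructed from the prose).
NU = {
    "s0": {"0": "s1", "1": "s2"},
    "s1": {"0": "s1", "1": "s2"},
    "s2": {"0": "s1", "1": "s2"},
}
OMEGA = {
    "s0": {"0": "0", "1": "0"},  # first output is 0 by convention
    "s1": {"0": "0", "1": "0"},  # prior input was 0
    "s2": {"0": "1", "1": "1"},  # prior input was 1
}


def run(x, state="s0"):
    """Return the output string omega(state, x)."""
    out = []
    for symbol in x:
        out.append(OMEGA[state][symbol])
        state = NU[state][symbol]
    return "".join(out)


print(run("1011"))  # the input delayed by one symbol, with a leading 0
```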
EXAMPLE 6.25
Observing the structure of a one-unit delay, we extend our ideas to the two-unit delay
machine shown in Fig. 6.14. If $x \in \mathcal{I}^*$, let $x = x_1 x_2 \cdots x_m$ where $m > 2$; if $s_0$ is the starting
state, then $\omega(s_0, x) = 00 x_1 \cdots x_{m-2}$. For states $s_0$, $s_1$, $s_2$ the output is 0 for all possible
inputs. States $s_3$, $s_4$, $s_5$, and $s_6$ must remember the two prior inputs 00, 01, 10, and 11,
respectively. To get the other arcs in the diagram, we shall consider one such arc and then
use similar reasoning for the others. For the arc from $s_5$ to $s_3$ in Fig. 6.14(a), let the input
be 0. Since the prior input to $s_5$ from $s_2$ is 0, we must go to the state that remembers the
two prior inputs 00. This is state $s_3$. Going back two states from $s_5$ to $s_2$ to $s_0$, we see that
the input is 1 (from $s_0$ to $s_2$). This then becomes the output (delayed two units) for the arc
from $s_5$ to $s_3$. The complete machine is shown in part (b) of Fig. 6.14.
Figure 6.14
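The reasoning used for the arc from $s_5$ to $s_3$ can be applied to every arc and then checked mechanically. The sketch below is a reconstruction, not a copy of Fig. 6.14(b): the table is derived from the statement that $s_3$, $s_4$, $s_5$, $s_6$ remember 00, 01, 10, 11, and that $s_1$, $s_2$ are reached from $s_0$ on 0 and 1. It then compares the machine with the specification $\omega(s_0, x) = 00x_1 \cdots x_{m-2}$ on all inputs of length at most 8.

```python
from itertools import product

# Next-state table reconstructed from the prose (assumption: the figure is
# not available, so every arc is derived by the reasoning in Example 6.25).
NU = {
    "s0": {"0": "s1", "1": "s2"},
    "s1": {"0": "s3", "1": "s4"},  # one prior input: 0
    "s2": {"0": "s5", "1": "s6"},  # one prior input: 1
    "s3": {"0": "s3", "1": "s4"},  # remembers 00
    "s4": {"0": "s5", "1": "s6"},  # remembers 01
    "s5": {"0": "s3", "1": "s4"},  # remembers 10
    "s6": {"0": "s5", "1": "s6"},  # remembers 11
}
# The output depends only on the current state: the older remembered input.
OUT = {"s0": "0", "s1": "0", "s2": "0", "s3": "0", "s4": "0", "s5": "1", "s6": "1"}


def run(x, state="s0"):
    """Return the output string produced while reading x."""
    out = []
    for symbol in x:
        out.append(OUT[state])
        state = NU[state][symbol]
    return "".join(out)


def two_unit_delay(x):
    """Specification: the input shifted right by two places, padded with 00."""
    return ("00" + x)[: len(x)]


ok = all(
    run("".join(bits)) == two_unit_delay("".join(bits))
    for m in range(1, 9)
    for bits in product("01", repeat=m)
)
print(run("110100"), ok)
```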
We turn now to some additional properties that arise in the study of finite state machines.
The machine in Fig. 6.15 will be used for examples of the terms defined.
Definition 6.14 Let $M = (S, \mathcal{I}, \mathcal{O}, \nu, \omega)$ be a finite state machine.
a) For $s_i, s_j \in S$, $s_j$ is said to be reachable from $s_i$ if $s_i = s_j$ or if there is an input string
$x \in \mathcal{I}^+$ such that $\nu(s_i, x) = s_j$. (In Fig. 6.15, state $s_3$ is reachable from $s_0$, $s_1$, $s_2$, and
$s_3$ but not from $s_4$, $s_5$, $s_6$, or $s_7$. No state is reachable from $s_3$ except $s_3$ itself.)
b) A state $s \in S$ is said to be transient if $\nu(s, x) = s$ for $x \in \mathcal{I}^*$ implies $x = \lambda$; that is, there
is no $x \in \mathcal{I}^+$ with $\nu(s, x) = s$. (For the machine in Fig. 6.15, $s_0$ is the only transient
state.)
Figure 6.15
c) A state $s \in S$ is called a sink, or sink state, if $\nu(s, x) = s$, for all $x \in \mathcal{I}^*$. ($s_3$ is the only
sink in Fig. 6.15.)
d) Let $S_1 \subseteq S$, $\mathcal{I}_1 \subseteq \mathcal{I}$. If $\nu_1 = \nu|_{S_1 \times \mathcal{I}_1}: S_1 \times \mathcal{I}_1 \to S$ (that is, the restriction of $\nu$ to
$S_1 \times \mathcal{I}_1 \subseteq S \times \mathcal{I}$) has its range within $S_1$, then with $\omega_1 = \omega|_{S_1 \times \mathcal{I}_1}$, $M_1 = (S_1, \mathcal{I}_1, \mathcal{O}, \nu_1, \omega_1)$
is called a submachine of $M$. (With $S_1 = \{s_4, s_5, s_6, s_7\}$, and $\mathcal{I}_1 = \{0, 1\}$, we
get a submachine $M_1$ of the machine $M$ in Fig. 6.15.)
e) A machine is said to be strongly connected if for any states $s_i, s_j \in S$, $s_j$ is reachable
from $s_i$. (The machine in Fig. 6.15 is not strongly connected, but the submachine $M_1$
in part (d) has this property.)
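The terms of Definition 6.14 translate directly into small predicates on a next-state table. The sketch below uses a hypothetical four-state machine (Fig. 6.15 is not reproduced here, so this is not that machine), and all function names are illustrative. Outputs play no role in these definitions, so only $\nu$ is modelled.

```python
def reachable_from(nu, s):
    """States reachable from s; s itself is included, as in part (a)."""
    seen, stack = {s}, [s]
    while stack:
        cur = stack.pop()
        for nxt in nu[cur].values():
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def is_transient(nu, s):
    """True when no nonempty input string leads from s back to s."""
    return all(s not in reachable_from(nu, t) for t in nu[s].values())


def is_sink(nu, s):
    """True when every input symbol leaves the machine in s."""
    return all(t == s for t in nu[s].values())


def restrict(nu, subset):
    """Next-state table of the submachine on `subset` (all input symbols
    kept), or None when some transition leaves `subset`."""
    if any(t not in subset for s in subset for t in nu[s].values()):
        return None
    return {s: dict(nu[s]) for s in subset}


def is_strongly_connected(nu):
    """True when every state is reachable from every state."""
    states = set(nu)
    return all(reachable_from(nu, s) == states for s in nu)


# Hypothetical machine (assumption, for illustration only).
NU = {
    "s0": {"0": "s1", "1": "s2"},
    "s1": {"0": "s2", "1": "s1"},
    "s2": {"0": "s1", "1": "s3"},
    "s3": {"0": "s3", "1": "s3"},
}

print([s for s in NU if is_transient(NU, s)])  # transient states
print([s for s in NU if is_sink(NU, s)])       # sink states
print(is_strongly_connected(NU))
```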
We close this section with a concept that uses a tree diagram.
Definition 6.15 For a finite state machine $M$, let $s_i$, $s_j$ be two distinct states in $S$. An input string $x \in \mathcal{I}^+$ is
called a transfer (or transition) sequence from $s_i$ to $s_j$ if
a) $\nu(s_i, x) = s_j$, and
b) $y \in \mathcal{I}^+$ with $\nu(s_i, y) = s_j \Rightarrow \|y\| \geq \|x\|$.
There can be more than one such (shortest) sequence for two states $s_i$, $s_j$.
EXAMPLE 6.26
Find a transfer sequence from state $s_0$ to state $s_2$ for the finite state machine $M$ given by the
state table in Table 6.11, where $\mathcal{I} = \mathcal{O} = \{0, 1\}$.
Table 6.11
       |    ν     |   ω
       |  0    1  |  0   1
s_0    | s_6  s_1 |  0   1
s_1    | s_5  s_0 |  0   1
s_2    |  ?   s_2 |  0   1
s_3    | s_4  s_0 |  0   1
s_4    | s_2   ?  |  0   1
s_5    | s_3  s_5 |  1   1
s_6    | s_3  s_6 |  1   1
Figure 6.16
In constructing the tree diagram of Fig. 6.16, we start at state $s_0$ and find those states
that can be reached from $s_0$ by using strings of length 1. Here we find $s_1$ and $s_6$. Then we
do the same thing with $s_1$ and $s_6$, finding, as a result, those states reachable from $s_0$ with
input strings of length 2. Continuing to expand the tree from left to right, we get to a vertex
labeled with the desired state, $s_2$. Each time we reach a vertex labeled with a state used
previously, we terminate that part of the expansion because we cannot reach any new states.
After we arrive at the state we want, we backtrack to $s_0$ and use the state table to label the
branches, as shown in Fig. 6.16. Hence, for $x = 0000$, $\nu(s_0, x) = s_2$ with $\omega(s_0, x) = 0100$.
(Here $x$ is unique.)
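The tree expansion described above is a breadth-first search, and a short sketch makes that explicit. The table below follows Table 6.11 as far as it can be read; two next-state entries are illegible in the source and are stored as `None` (the search simply skips them, and they are not needed to reach $s_2$ from $s_0$). The function names are illustrative.

```python
from collections import deque

# (nu(s, 0), nu(s, 1)) and (omega(s, 0), omega(s, 1)) for each state.
# None marks an entry that cannot be read in the source table.
NU = {
    "s0": ("s6", "s1"), "s1": ("s5", "s0"), "s2": (None, "s2"),
    "s3": ("s4", "s0"), "s4": ("s2", None), "s5": ("s3", "s5"),
    "s6": ("s3", "s6"),
}
OMEGA = {
    "s0": "01", "s1": "01", "s2": "01", "s3": "01",
    "s4": "01", "s5": "11", "s6": "11",
}


def transfer_sequence(start, goal):
    """Shortest input string leading from start to goal (start != goal)."""
    queue = deque([(start, "")])
    seen = {start}
    while queue:
        state, x = queue.popleft()
        for symbol in "01":
            nxt = NU[state][int(symbol)]
            if nxt is None:          # illegible entry: skip
                continue
            if nxt == goal:
                return x + symbol
            if nxt not in seen:      # prune states already in the tree
                seen.add(nxt)
                queue.append((nxt, x + symbol))
    return None


def output(start, x):
    """Output string omega(start, x)."""
    out, state = [], start
    for symbol in x:
        out.append(OMEGA[state][int(symbol)])
        state = NU[state][int(symbol)]
    return "".join(out)


x = transfer_sequence("s0", "s2")
print(x, output("s0", x))
```

Breadth-first order guarantees that the first string found is a shortest one, which is exactly condition (b) of Definition 6.15.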
EXERCISES 6.3
1. Let $\mathcal{I} = \mathcal{O} = \{0, 1\}$. (a) Construct a state diagram for a finite state machine that recognizes each occurrence of 0000 in
a string $x \in \mathcal{I}^*$. (Here overlapping is allowed.) (b) Construct a state diagram for a finite state machine that recognizes each
string $x \in \mathcal{I}^*$ that ends in 0000 and has length $4k$, $k \in \mathbf{Z}^+$. (Here overlapping is not permitted.)
2. Answer Exercise 1 for each of the sequences 0110 and 1010.
3. Construct a state diagram for a finite state machine with $\mathcal{I} = \mathcal{O} = \{0, 1\}$ that recognizes all strings in the language
$\{0, 1\}^*\{00\} \cup \{0, 1\}^*\{11\}$.
4. For $\mathcal{I} = \mathcal{O} = \{0, 1\}$ a string $x \in \mathcal{I}^*$ is said to have even parity if it contains an even number of 1's. Construct a state diagram
for a finite state machine that recognizes all nonempty strings of even parity.
5. Table 6.12 defines $\nu$ and $\omega$ for a finite state machine $M$ where $\mathcal{I} = \mathcal{O} = \{0, 1\}$.
a) Draw the state diagram for $M$.
b) Determine the output for the following input sequences, starting at $s_0$ in each case: (i) $x = 111$; (ii) $x = 1010$;
(iii) $x = 00011$.
c) Describe in words what machine $M$ does.
d) How is this machine related to that shown in Fig. 6.13?
Table 6.12
       |    ν     |   ω
       |  0    1  |  0   1
s_0    | s_0  s_1 |  ?   ?
s_1    | s_0  s_1 |  ?   ?
6. Show that it is not possible to construct a finite state machine that recognizes precisely those sequences in the
language $A = \{0^i 1^j \mid i, j \in \mathbf{Z}^+, i > j\}$. (Here the alphabet for $A$ is $\Sigma = \{0, 1\}$.)
7. For each of the machines in Table 6.13, determine the transient states, sink states, submachines (where $\mathcal{I}_1 = \{0, 1\}$), and
strongly connected submachines (where $\mathcal{I}_1 = \{0, 1\}$).
8. Determine a transfer sequence from state $s_1$ to state $s_5$ in
finite state machine (c) of Exercise 7. Is your sequence unique?
Table 6.13
v w® v @ y @
0 1 0 1 0 ] 0 1 0 1 0 1
SO S4 S| 0 0 50 SO S| 1 0 SO S| 52 0 1
5] S48 0 ] S] So. SY 0 ] S] So 82 1 1
S2 S3 S55 0 0 S2 S| S53 0 0 S52 S52 53 1 1
$3 S285 1 0 $3 So Sg 0 O $3 So S4 0 0
54 S4 S54 1 1 S4 54 S4 1 ] S54 S5 S55 1 0
S5 S2 S53 0 1 S5 S3 S4 ] 0
So | S& S&S | O O
{a) (b) (c)
6.4 Summary and Historical Review
In this chapter we have been introduced to the theory of languages and to a discrete structure
called a finite state machine. Using our prior development of elementary set theory and finite
functions, we were able to combine some abstract notions and to model digital devices
such as sequence recognizers and delays. Comparable coverage of this material appears in
Chapter 1 of L. L. Dornhoff and F. E. Hohn [3] and in Chapter 2 of D. F. Stanat and D. F.
McAllister [15].
The finite state machine we developed is based on the model put forth in 1955 by G. H.
Mealy in [11] and is consequently referred to as the “Mealy machine.” The model is based
on earlier concepts found in the work of D. A. Huffman [8] and E. F. Moore [13]. For further
reading on the pioneering work dealing with various aspects and applications of the finite
state machine, consult the material edited by E. F. Moore [14]. Additional information
on the actual synthesis of such machines and on related hardware considerations, along
with an extensive coverage of many related ideas, can be found in Chapters 9-15 of
Z. Kohavi [9].
For more on languages and their relation to finite state machines, one should look into
the UMAP module by W. J. Barnier [1], Chapter 8 of J. L. Gersting [4], and Chapters 7
and 8 of A. Gill [5]. A comprehensive coverage of these (and related) topics is given in the
texts by J. G. Brookshear [2], J. E. Hopcroft and J. D. Ullman [7], H. R. Lewis and C. H.
Papadimitriou [10], M. Minsky [12], and D. Wood [16].
[Photographs: David Hilbert (1862-1943); Alan Mathison Turing (1912-1954), reproduced courtesy of The Granger Collection, New York]
One may be surprised to learn that the basic ideas of automata theory were developed to
solve rather theoretical questions in the foundations of mathematics — as posed in 1900 by
the German mathematician David Hilbert (1862-1943). In 1935 the English mathematician
and logician Alan Mathison Turing (1912-1954) became interested in Hilbert’s decision
problem, which asked if there could be a general method one could apply to a given state-
ment in order to determine if that statement were true. Turing’s approach to the solution of
this problem led him to develop what is now known as a Turing machine, the most general
model for a computing machine. By using this model, he was able to establish very pro-
found theoretical results about how computers should have to operate— before any such
machines were actually built. During World War II Turing worked for the Foreign Office at
Bletchley Park, where he did extensive work on the cryptanalysis of Nazi ciphers. His ef-
forts contributed to the breaking of the mechanical cipher machine Enigma, a breakthrough
that helped to bring about the defeat of the Third Reich. Following the war (and up to the
time of his death), Turing’s interest in the ability of machines to think led him to play a
major role in the development of actual (not just theoretical) computers. For more on the
life of this interesting scholar one should look into the biography by A. Hodges [6].
REFERENCES
1. Barnier, William J. “Finite-State Machines as Recognizers” (UMAP Module 671). The UMAP
Journal 7, no. 3 (1986): pp. 209-232.
2. Brookshear, J. Glenn. Theory of Computation: Formal Languages, Automata, and Complexity.
Reading, Mass.: Benjamin/Cummings, 1989.
3. Dornhoff, Larry L., and Hohn, Franz E. Applied Modern Algebra. New York: Macmillan, 1978.
4. Gersting, Judith L. Mathematical Structures for Computer Science, 5th ed. New York: Freeman,
2003.
5. Gill, Arthur. Applied Algebra for the Computer Sciences, Prentice-Hall Series in Automatic
Computation. Englewood Cliffs, N.J.: Prentice-Hall, 1976.
6. Hodges, Andrew. Alan Turing: The Enigma. New York: Simon and Schuster, 1983.
7. Hopcroft, John E., and Ullman, Jeffrey D. Introduction to Automata Theory, Languages, and
Computation. Reading, Mass.: Addison-Wesley, 1979.
8. Huffman, D. A. “The Synthesis of Sequential Switching Circuits.” Journal of the Franklin
Institute 257 (March 1954): pp. 161-190, (April 1954): pp. 275-303. Reprinted in Moore
[14].
9. Kohavi, Zvi. Switching and Finite Automata Theory, 2nd ed. New York: McGraw-Hill, 1978.
10. Lewis, Harry R., and Papadimitriou, Christos H. Elements of the Theory of Computation, 2nd
ed. Englewood Cliffs, N.J.: Prentice-Hall, 1997.
11. Mealy, G. H. “A Method for Synthesizing Sequential Circuits.” Bell System Technical Journal
34 (September 1955): pp. 1045-1079.
12. Minsky, Marvin. Computation: Finite and Infinite Machines. Englewood Cliffs, N.J.: Prentice-
Hall, 1967.
13. Moore, E. F. “Gedanken-experiments on Sequential Machines.” Automata Studies, Annals of
Mathematical Studies, no. 34: pp. 129-153. Princeton, N.J.: Princeton University Press, 1956.
14. Moore, E. F., ed., Sequential Machines: Selected Papers. Reading, Mass.: Addison-Wesley,
1964.
15. Stanat, Donald F., and McAllister, David F. Discrete Mathematics in Computer Science. En-
glewood Cliffs, N.J.: Prentice-Hall, 1977.
16. Wood, Derick. Theory of Computation. New York: Wiley, 1987.
SUPPLEMENTARY EXERCISES
4. For © = {0,1} consider the languages A, B,C C £*
where A = {01,11}, B = {01, 11, 111}, and C = {01, 11,
1111}. (a) How are A* and B* related? (b) How about A* and
c*?
1, Let ©; ={w,x, y} and X= {x, y,z} be alphabets.
5. Let M be the finite state machine shown in Fig. 6.17. For
If A, ={x'y/|i, fEZ*, jf >i > 1), Ao ={w'y/ |i, fj EZ, states s,,s,, where 0 <i, j <2, let ©,, denote the set of all
i> j> 1}, Az ={wixiy'2/|i, fe Zt, j>i> 1}, and Ay = nonempty output strings that M can produce as it goes from state
{z/(wz)'w |i, j €Z*,i > 1, j => 2}, determine whether each
s, to state s,. Ifi = 2, 7 = 0, for example, Cry = {O}{1, OO}*.
of the following statements is true or false.
Find Co2, ©22, O11, Ooo, and Oro.
a) A, is a language over ©).
b) A: is a language over Xp.
c) A; is a language over X; U Xo.
d) A, is a language over £11 Xp.
e) Aq is a language over ©; A Xp.
f) A, U A: is a language over X).
2. For languages A, B C &*, does A* C B* > ACB?
3. Give anexample ofa language A over an alphabet ©, where
(A*)* # (At. Figure 6.17
6. Let M be the finite state machine in Fig. 6.18. 10. With # = © = {0, 1}, let M be the finite state machine given
in Table 6.15. Here sy is the starting state. Let A C $* where
x € Aifand only if the last symbol in (sp, x) is 1. [There may
be more than one 1 in the output string @(sy, x).] Construct a
finite state machine wherein the last symbol of the output string
is 1 forall ye $* — A.
Table 6.15
y @
0 ] 0 1
SO S| S52 ] 0
Sy 82 S| 0 l
Figure 6.18 52 S2 S53 0 1
S53 S| SO 1 0
a) Find the state table for this machine.
b) Explain what this machine does.
11. Let # = © = {0, 1} for the two finite state machines M,
c) How many distinct input strings x are there such that and M2, given in Tables 6.16 and 6.17, respectively. The start-
|x|] = 8 and v(sy,x) = so? How many are there with ing state for M) is so, whereas s3 is the starting state for M).
I|x|| = 12?
7, Let M =(S, $, 6, v, w) be a finite state machine with Table 6.16 Table 6.17
|\S| =n, and letOe §.
vy @) v2 @2
a) Show that for the input string 0000... , the output is
eventually periodic. O 1 {0
oS
—
—
—
OQ
b) What is the maximum number of 0’s we can input before SQ So Sy 1 S3 S3 S4 1 1
OOS
the periodic output starts? Ss; | Ss} So | O 54 | So S311 O
c) What is the length of the maximum period that can 52 S52 So 0
eS
occur?
8. For § = C = {0, 1}, let M be the finite state machine given
in Table 6.14. If the starting state for M is not s,, find an in- We connect these machines as shown in Fig. 6.19. Here
put string x (of smallest length) such that v(s;, x) = 5), for all
each output symbol from M, becomes an input symbol for
i = 2,3, 4. (Hence x gets the machine M to state s; regardless M),. For example, if we input 0 to Mj), then @)(59, 0) = 1 and
of the starting state.) v1 (59, O) = sy. As a result, we then input 1 (= @) (so, 0)) to M2
to get @2(53, 1) = 1 and v2(s3, 1) = 54.
Table 6.14
v 7)
—> MM, My, -—>
0 1 0 1
S] S4 53 0 0 Figure 6.19
S2 S2 S4 0 ]
We construct a machine M = (S, #, ©, v, w) that represents
53 5} $2 1 0 this connection of M, and M> as follows:
S4 S| S4 1 1
J = € = {0, 1}.
9, Let § = © = {0, 1}. Construct a state diagram for a finite S = S$, X So, where S, is the set of internal
state machine that reverses (from 0 to 1 or from 1 to 0) the states for M,, fori = 1, 2.
symbols appearing in the 4th, in the 8th, in the 12th, ..., posi-
viSX$—+>S, where
tions of an input string x € $+. For example, if sp is the starting
state, then w(sp, 0000) = 0001, (sy, 000111) = 000011, and v((s, t), x) = (vi(s, x), v(t, a 6s, x), fors € 5), 1 € Ss,
w(sy, 000000111) = 000100101. andx € §.
w: SX $—+€, where suggests the use of a matrix or two-dimensional array for stor-
w((s, t), x) = @o(t,
a (s,x)), fors
€ $,,t € S,,andx € §&. ing v, w. : Use this observation
_ to write a Eee
program (or develop
an algorithm) that will simulate the machine in Table 6.18.
a) Find a state table for machine M. Table 6.18
b) Determine the output string for the input string 1101.
After this string is processed, in which state do we find v @
(i) machine M,? (ii) machine M2? 0 I 0 1
12. Although the state diagram seems more convenient than
the state table when we are dealing with a finite state machine 51 $2 S] 0 0
M =(S, §, ©, v, w), as the input strings get longer and the sizes $2 53 5] 0 0
of S, J, and © increase, the state table proves useful when sim- $3 $3 S} 1 1
ulating the machine on a computer. The block form of the table
Relations: The Second Time Around
In Chapter 5 we introduced the concept of a (binary) relation. Returning to relations in this
chapter, we shall emphasize the study of relations on a set $A$, that is, subsets of $A \times A$.
Within the theory of languages and finite state machines from Chapter 6, we find many
examples of relations on a set A, where A represents a set of strings from a given alphabet
or a set of internal states from a finite state machine. Various properties of relations are
developed, along with ways to represent finite relations for computer manipulation. Directed
graphs reappear as a way to represent such relations. Finally, two types of relations on a set
A are especially important: equivalence relations and partial orders. Equivalence relations,
in particular, arise in many areas of mathematics. For the present we shall use an equivalence
relation on the set of internal states in a finite state machine M in order to find a machine
M,, with as few internal states as possible, that performs whatever tasks M is capable of
performing. The procedure is known as the minimization process.
7.1 Relations Revisited: Properties of Relations
We start by recalling some fundamental ideas considered earlier.
Definition 7.1 For sets $A$, $B$, any subset of $A \times B$ is called a (binary) relation from $A$ to $B$. Any subset
of $A \times A$ is called a (binary) relation on $A$.
As mentioned in the sentence following Definition 5.2, our primary concern is with
binary relations. Consequently, for us the word “relation” will once again mean binary
relation, unless something otherwise is specified.
EXAMPLE 7.1
a) Define the relation $\mathcal{R}$ on the set $\mathbf{Z}$ by $a \mathrel{\mathcal{R}} b$, or $(a, b) \in \mathcal{R}$, if $a \leq b$. This subset of
$\mathbf{Z} \times \mathbf{Z}$ is the ordinary “less than or equal to” relation on the set $\mathbf{Z}$, and it can also be
defined on $\mathbf{Q}$ or $\mathbf{R}$, but not on $\mathbf{C}$.
b) Let $n \in \mathbf{Z}^+$. For $x, y \in \mathbf{Z}$, the modulo $n$ relation $\mathcal{R}$ is defined by $x \mathrel{\mathcal{R}} y$ if $x - y$ is a
multiple of $n$. With $n = 7$, we find, for instance, that $9 \mathrel{\mathcal{R}} 2$, $-3 \mathrel{\mathcal{R}} 11$, $(14, 0) \in \mathcal{R}$, but
$3 \not\mathrel{\mathcal{R}} 7$ (that is, 3 is not related to 7).
c) For the universe $\mathcal{U} = \{1, 2, 3, 4, 5, 6, 7\}$ consider the (fixed) set $C \subseteq \mathcal{U}$ where
$C = \{1, 2, 3, 6\}$. Define the relation $\mathcal{R}$ on $\mathcal{P}(\mathcal{U})$ by $A \mathrel{\mathcal{R}} B$ when $A \cap C = B \cap C$.
Then the sets $\{1, 2, 4, 5\}$ and $\{1, 2, 5, 7\}$ are related since $\{1, 2, 4, 5\} \cap C = \{1, 2\} =
\{1, 2, 5, 7\} \cap C$. Likewise we find that $X = \{4, 5\}$ and $Y = \{7\}$ are so related because
$X \cap C = \emptyset = Y \cap C$. However, the sets $S = \{1, 2, 3, 4, 5\}$ and $T = \{1, 2, 3, 6, 7\}$ are
not related (that is, $S \not\mathrel{\mathcal{R}} T$) since $S \cap C = \{1, 2, 3\} \neq \{1, 2, 3, 6\} = T \cap C$.
EXAMPLE 7.2
Let $\Sigma$ be an alphabet, with language $A \subseteq \Sigma^*$. For $x, y \in A$, define $x \mathrel{\mathcal{R}} y$ if $x$ is a prefix
of $y$. Other relations can be defined on $A$ by replacing “prefix” with either “suffix” or “substring.”
EXAMPLE 7.3
Consider a finite state machine $M = (S, \mathcal{I}, \mathcal{O}, \nu, \omega)$.
a) For $s_1, s_2 \in S$, define $s_1 \mathrel{\mathcal{R}} s_2$ if $\nu(s_1, x) = s_2$, for some $x \in \mathcal{I}$. Relation $\mathcal{R}$ establishes
the first level of reachability.
b) The relation for the second level of reachability can also be given for $S$. Here $s_1 \mathrel{\mathcal{R}} s_2$ if
$\nu(s_1, x_1 x_2) = s_2$, for some $x_1 x_2 \in \mathcal{I}^2$. This can be extended to higher levels if the need
arises. For the general reachability relation we have $\nu(s_1, y) = s_2$, for some $y \in \mathcal{I}^*$.
c) Given $s_1, s_2 \in S$ the relation of 1-equivalence, which is denoted by $s_1 \mathrel{E_1} s_2$ and is
read “$s_1$ is 1-equivalent to $s_2$”, is defined when $\omega(s_1, x) = \omega(s_2, x)$ for all $x \in \mathcal{I}$.
Consequently, $s_1 \mathrel{E_1} s_2$ indicates that if machine $M$ starts in either state $s_1$ or $s_2$, the
output is the same for each element of $\mathcal{I}$. This idea can be extended to states being
$k$-equivalent, where we write $s_1 \mathrel{E_k} s_2$ if $\omega(s_1, y) = \omega(s_2, y)$, for all $y \in \mathcal{I}^k$. Here the
same output string is obtained for each input string in $\mathcal{I}^k$ if we start at either $s_1$ or $s_2$.
If two states are $k$-equivalent for all $k \in \mathbf{Z}^+$, then they are called equivalent. We
shall look further into this idea later in the chapter.
We now start to examine some of the properties a relation can satisfy.
Definition 7.2 A relation $\mathcal{R}$ on a set $A$ is called reflexive if for all $x \in A$, $(x, x) \in \mathcal{R}$.
To say that a relation $\mathcal{R}$ is reflexive simply means that each element $x$ of $A$ is related
to itself. All the relations in Examples 7.1 and 7.2 are reflexive. The general reachability
relation in Example 7.3(b) and all of the relations mentioned in part (c) of that example
are also reflexive. [What goes wrong with the relations for the first and second levels of
reachability given in parts (a) and (b) of Example 7.3?]
EXAMPLE 7.4
For $A = \{1, 2, 3, 4\}$, a relation $\mathcal{R} \subseteq A \times A$ will be reflexive if and only if $\mathcal{R} \supseteq \{(1, 1),
(2, 2), (3, 3), (4, 4)\}$. Consequently, $\mathcal{R}_1 = \{(1, 1), (2, 2), (3, 3)\}$ is not a reflexive relation
on $A$, whereas $\mathcal{R}_2 = \{(x, y) \mid x, y \in A, x \leq y\}$ is reflexive on $A$.
EXAMPLE 7.5
Given a finite set $A$ with $|A| = n$, we have $|A \times A| = n^2$, so there are $2^{n^2}$ relations on $A$.
How many of these are reflexive?
If $A = \{a_1, a_2, \ldots, a_n\}$, a relation $\mathcal{R}$ on $A$ is reflexive if and only if $\{(a_i, a_i) \mid 1 \leq i \leq
n\} \subseteq \mathcal{R}$. Considering the other $n^2 - n$ ordered pairs in $A \times A$ [those of the form $(a_i, a_j)$,
where $i \neq j$ for $1 \leq i, j \leq n$] as we construct a reflexive relation $\mathcal{R}$ on $A$, we either include
or exclude each of these ordered pairs, so by the rule of product there are $2^{n^2 - n}$ reflexive
relations on $A$.
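For small $n$ the count $2^{n^2 - n}$ can be confirmed by exhaustive enumeration. The sketch below is only a brute-force check (it examines all $2^{n^2}$ relations, so it is practical for $n \leq 3$ or so); the function name is illustrative.

```python
from itertools import product


def count_reflexive(n):
    """Count reflexive relations on {0, ..., n-1} by checking every subset
    of A x A."""
    A = range(n)
    pairs = [(a, b) for a in A for b in A]
    count = 0
    for bits in product((False, True), repeat=len(pairs)):
        R = {p for p, keep in zip(pairs, bits) if keep}
        if all((a, a) in R for a in A):
            count += 1
    return count


for n in (1, 2, 3):
    print(n, count_reflexive(n), 2 ** (n * n - n))
```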
Definition 7.3 Relation $\mathcal{R}$ on set $A$ is called symmetric if $(x, y) \in \mathcal{R} \Rightarrow (y, x) \in \mathcal{R}$, for all $x, y \in A$.
EXAMPLE 7.6
With $A = \{1, 2, 3\}$, we have:
a) $\mathcal{R}_1 = \{(1, 2), (2, 1), (1, 3), (3, 1)\}$, a symmetric, but not reflexive, relation on $A$;
b) $\mathcal{R}_2 = \{(1, 1), (2, 2), (3, 3), (2, 3)\}$, a reflexive, but not symmetric, relation on $A$;
c) $\mathcal{R}_3 = \{(1, 1), (2, 2), (3, 3)\}$ and $\mathcal{R}_4 = \{(1, 1), (2, 2), (3, 3), (2, 3), (3, 2)\}$, two
relations on $A$ that are both reflexive and symmetric; and
d) $\mathcal{R}_5 = \{(1, 1), (2, 3), (3, 3)\}$, a relation on $A$ that is neither reflexive nor symmetric.
To count the symmetric relations on $A = \{a_1, a_2, \ldots, a_n\}$, we write $A \times A$ as
$A_1 \cup A_2$, where $A_1 = \{(a_i, a_i) \mid 1 \leq i \leq n\}$ and $A_2 = \{(a_i, a_j) \mid 1 \leq i, j \leq n, i \neq j\}$, so that
every ordered pair in $A \times A$ is in exactly one of $A_1$, $A_2$. For $A_2$, $|A_2| = |A \times A| - |A_1| =
n^2 - n = n(n - 1)$, an even integer. The set $A_2$ contains $(1/2)(n^2 - n)$ subsets $S_{ij}$ of the
form $\{(a_i, a_j), (a_j, a_i)\}$ where $1 \leq i < j \leq n$. In constructing a symmetric relation $\mathcal{R}$ on $A$,
for each ordered pair in $A_1$ we have our usual choice of exclusion or inclusion. For each of
the $(1/2)(n^2 - n)$ subsets $S_{ij}$ ($1 \leq i < j \leq n$) taken from $A_2$ we have the same two choices.
So by the rule of product there are $2^n \cdot 2^{(1/2)(n^2 - n)} = 2^{(1/2)(n^2 + n)}$ symmetric relations on $A$.
In counting those relations on $A$ that are both reflexive and symmetric, we have only
one choice for each ordered pair in $A_1$. So we have $2^{(1/2)(n^2 - n)}$ relations on $A$ that are both
reflexive and symmetric.
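The construction in this argument (one independent choice per diagonal pair in $A_1$ and one per two-element subset $S_{ij}$ of $A_2$) can be carried out literally. The following sketch generates relations that way and checks the two counts for $n = 3$; the generator name is illustrative.

```python
from itertools import combinations, product


def symmetric_relations(n):
    """Generate the symmetric relations on {1, ..., n}: choose freely among
    the diagonal pairs (A_1) and among the blocks S_ij = {(i, j), (j, i)}."""
    A = range(1, n + 1)
    diagonal = [(a, a) for a in A]
    blocks = [((a, b), (b, a)) for a, b in combinations(A, 2)]
    for diag_bits in product((0, 1), repeat=len(diagonal)):
        for block_bits in product((0, 1), repeat=len(blocks)):
            R = {p for p, keep in zip(diagonal, diag_bits) if keep}
            for block, keep in zip(blocks, block_bits):
                if keep:
                    R.update(block)  # both (i, j) and (j, i) go in together
            yield frozenset(R)


rels = set(symmetric_relations(3))
reflexive_too = [R for R in rels if all((a, a) in R for a in range(1, 4))]
print(len(rels), len(reflexive_too))
```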
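Definition 7.4 For a set $A$, a relation $\mathcal{R}$ on $A$ is called transitive if, for all $x, y, z \in A$, $(x, y), (y, z) \in \mathcal{R}
\Rightarrow (x, z) \in \mathcal{R}$. (So if $x$ “is related to” $y$, and $y$ “is related to” $z$, we want $x$ “related to” $z$,
with $y$ playing the role of “intermediary.”)
EXAMPLE 7.7
All the relations in Examples 7.1 and 7.2 are transitive, as are the relations in
Example 7.3(c).
EXAMPLE 7.8
Define the relation $\mathcal{R}$ on the set $\mathbf{Z}^+$ by $a \mathrel{\mathcal{R}} b$ if $a$ (exactly) divides $b$, that is, $b = ca$ for
some $c \in \mathbf{Z}^+$. Now if $x \mathrel{\mathcal{R}} y$ and $y \mathrel{\mathcal{R}} z$, do we have $x \mathrel{\mathcal{R}} z$? We know that $x \mathrel{\mathcal{R}} y \Rightarrow y = sx$
for some $s \in \mathbf{Z}^+$ and $y \mathrel{\mathcal{R}} z \Rightarrow z = ty$ where $t \in \mathbf{Z}^+$. Consequently, $z = ty = t(sx) = (ts)x$
for $ts \in \mathbf{Z}^+$, so $x \mathrel{\mathcal{R}} z$ and $\mathcal{R}$ is transitive. In addition, $\mathcal{R}$ is reflexive, but not symmetric,
because, for example, $2 \mathrel{\mathcal{R}} 6$ but $6 \not\mathrel{\mathcal{R}} 2$.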
EXAMPLE 7.9
Consider the relation $\mathcal{R}$ on the set $\mathbf{Z}$ where we define $a \mathrel{\mathcal{R}} b$ when $ab \geq 0$. For all integers
$x$ we have $xx = x^2 \geq 0$, so $x \mathrel{\mathcal{R}} x$ and $\mathcal{R}$ is reflexive. Also, if $x, y \in \mathbf{Z}$ and $x \mathrel{\mathcal{R}} y$, then
$x \mathrel{\mathcal{R}} y \Rightarrow xy \geq 0 \Rightarrow yx \geq 0 \Rightarrow y \mathrel{\mathcal{R}} x$,
so the relation $\mathcal{R}$ is symmetric as well. However, here we find that $(3, 0), (0, -7) \in \mathcal{R}$,
since $(3)(0) \geq 0$ and $(0)(-7) \geq 0$, but $(3, -7) \notin \mathcal{R}$ because $(3)(-7) < 0$. Consequently,
this relation is not transitive.
EXAMPLE 7.10
If $A = \{1, 2, 3, 4\}$, then $\mathcal{R}_1 = \{(1, 1), (2, 3), (3, 4), (2, 4)\}$ is a transitive relation on $A$,
whereas $\mathcal{R}_2 = \{(1, 3), (3, 2)\}$ is not transitive because $(1, 3), (3, 2) \in \mathcal{R}_2$ but $(1, 2) \notin \mathcal{R}_2$.
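For a finite relation stored as a set of ordered pairs, the three properties defined so far become one-line checks. The sketch below applies them to the relations of Example 7.10 and to the relation of Example 7.9 restricted to the integers from $-7$ to $7$ (the restriction is an assumption made only so that the relation is finite). The function names are illustrative.

```python
def is_reflexive(R, A):
    """Every element of A is related to itself."""
    return all((x, x) in R for x in A)


def is_symmetric(R):
    """(x, y) in R implies (y, x) in R."""
    return all((y, x) in R for (x, y) in R)


def is_transitive(R):
    """(x, y) and (y, z) in R imply (x, z) in R."""
    return all((x, w) in R for (x, y) in R for (z, w) in R if y == z)


# Example 7.10
R1 = {(1, 1), (2, 3), (3, 4), (2, 4)}
R2 = {(1, 3), (3, 2)}

# Example 7.9, restricted to a finite window of Z (assumption).
A = range(-7, 8)
R9 = {(a, b) for a in A for b in A if a * b >= 0}

print(is_transitive(R1), is_transitive(R2))
print(is_reflexive(R9, A), is_symmetric(R9), is_transitive(R9))
```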
At this point the reader is probably ready to start counting the number of transitive
relations on a finite set. But this is not possible here. For unlike the cases dealing with the
reflexive and symmetric properties, there is no known general formula for the total number
of transitive relations on a finite set. However, at a later point in this chapter we shall have
the necessary ideas to count the relations $\mathcal{R}$ on a finite set, where $\mathcal{R}$ is (simultaneously)
reflexive, symmetric, and transitive.
For now we consider one last property for relations.
Definition 7.5 Given a relation $\mathcal{R}$ on a set $A$, $\mathcal{R}$ is called antisymmetric if for all $a, b \in A$, ($a \mathrel{\mathcal{R}} b$ and
$b \mathrel{\mathcal{R}} a$) $\Rightarrow a = b$. (Here the only way we can have both $a$ “related to” $b$ and $b$ “related to”
$a$ is if $a$ and $b$ are one and the same element from $A$.)
EXAMPLE 7.11
For a given universe $\mathcal{U}$, define the relation $\mathcal{R}$ on $\mathcal{P}(\mathcal{U})$ by $(A, B) \in \mathcal{R}$ if $A \subseteq B$, for
$A, B \subseteq \mathcal{U}$. So $\mathcal{R}$ is the subset relation of Chapter 3 and if $A \mathrel{\mathcal{R}} B$ and $B \mathrel{\mathcal{R}} A$, then we have
$A \subseteq B$ and $B \subseteq A$, which gives us $A = B$. Consequently, this relation is antisymmetric, as
well as reflexive and transitive, but it is not symmetric.
Before we are led astray into thinking that “not symmetric” is synonymous with “anti-
symmetric’, let us consider the following.
EXAMPLE 7.12
For $A = \{1, 2, 3\}$, the relation $\mathcal{R}$ on $A$ given by $\mathcal{R} = \{(1, 2), (2, 1), (2, 3)\}$ is not symmetric
because $(3, 2) \notin \mathcal{R}$, and it is not antisymmetric because $(1, 2), (2, 1) \in \mathcal{R}$ but $1 \neq 2$. The
relation $\mathcal{R}_1 = \{(1, 1), (2, 2)\}$ is both symmetric and antisymmetric.
How many relations on $A$ are antisymmetric? Writing
$A \times A = \{(1, 1), (2, 2), (3, 3)\} \cup \{(1, 2), (2, 1), (1, 3), (3, 1), (2, 3), (3, 2)\}$,
we make two observations as we try to construct an antisymmetric relation $\mathcal{R}$ on $A$.
1) Each element $(x, x) \in A \times A$ can be either included or excluded with no concern
about whether or not $\mathcal{R}$ is antisymmetric.
2) For an element of the form $(x, y)$, $x \neq y$, we must consider both $(x, y)$ and $(y, x)$
and we note that for $\mathcal{R}$ to remain antisymmetric we have three alternatives: (a) place
$(x, y)$ in $\mathcal{R}$; (b) place $(y, x)$ in $\mathcal{R}$; or (c) place neither $(x, y)$ nor $(y, x)$ in $\mathcal{R}$. [What
happens if we place both $(x, y)$ and $(y, x)$ in $\mathcal{R}$?]
So by the rule of product, the number of antisymmetric relations on $A$ is $(2^3)(3^3) =
(2^3)(3^{(3^2 - 3)/2})$. If $|A| = n > 0$, then there are $(2^n)(3^{(n^2 - n)/2})$ antisymmetric relations on $A$.
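The count $(2^n)(3^{(n^2 - n)/2})$ can be verified for small $n$ by testing every relation, and the same predicate confirms the claims of Example 7.12. This is a brute-force sketch with illustrative function names; it is practical only for $n \leq 3$ or so.

```python
from itertools import product


def is_antisymmetric(R):
    """(x, y) and (y, x) both in R force x == y."""
    return all(x == y for (x, y) in R if (y, x) in R)


def is_symmetric(R):
    return all((y, x) in R for (x, y) in R)


def count_antisymmetric(n):
    """Count antisymmetric relations on {1, ..., n} by exhaustive search."""
    A = range(1, n + 1)
    pairs = [(a, b) for a in A for b in A]
    total = 0
    for bits in product((0, 1), repeat=len(pairs)):
        R = {p for p, keep in zip(pairs, bits) if keep}
        total += is_antisymmetric(R)
    return total


R = {(1, 2), (2, 1), (2, 3)}   # neither symmetric nor antisymmetric
R1 = {(1, 1), (2, 2)}          # both symmetric and antisymmetric
print(is_symmetric(R), is_antisymmetric(R))
print(is_symmetric(R1), is_antisymmetric(R1))
print(count_antisymmetric(3), (2 ** 3) * (3 ** 3))
```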
For our next example we return to the concept of function dominance, which we first
defined in Section 5.7.
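EXAMPLE 7.13
Let $\mathcal{F}$ denote the set of all functions with domain $\mathbf{Z}^+$ and codomain $\mathbf{R}$; that is, $\mathcal{F} =
\{f \mid f: \mathbf{Z}^+ \to \mathbf{R}\}$. For $f, g \in \mathcal{F}$, define the relation $\mathcal{R}$ on $\mathcal{F}$ by $f \mathrel{\mathcal{R}} g$ if $f$ is dominated by $g$
(or $f \in O(g)$). Then $\mathcal{R}$ is reflexive and transitive.
If $f, g: \mathbf{Z}^+ \to \mathbf{R}$ are defined by $f(n) = n$ and $g(n) = n + 5$, then $f \mathrel{\mathcal{R}} g$ and $g \mathrel{\mathcal{R}} f$ but
$f \neq g$, so $\mathcal{R}$ is not antisymmetric. In addition, if $h: \mathbf{Z}^+ \to \mathbf{R}$ is given by $h(n) = n^2$, then
$(f, h), (g, h) \in \mathcal{R}$, but neither $(h, f)$ nor $(h, g)$ is in $\mathcal{R}$. Consequently, the relation $\mathcal{R}$ is
also not symmetric.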
At this point we have seen the four major properties that arise in the study of relations.
Before closing this section we define two more notions, each of which involves three of
these four properties.
Definition 7.6 A relation $\mathcal{R}$ on a set $A$ is called a partial order, or a partial ordering relation, if $\mathcal{R}$ is
reflexive, antisymmetric, and transitive.
EXAMPLE 7.14
The relation in Example 7.1(a) is a partial order, but the relation in part (b) of that example
is not because it is not antisymmetric. All the relations of Example 7.2 are partial orders, as
is the subset relation of Example 7.11.
Our next example provides us with the opportunity to relate this new idea of a partial
order with results we studied in Chapters 1 and 4.
EXAMPLE 7.15
We start with the set $A = \{1, 2, 3, 4, 6, 12\}$, the set of positive integer divisors of 12,
and define the relation $\mathcal{R}$ on $A$ by $x \mathrel{\mathcal{R}} y$ if $x$ (exactly) divides $y$. As in Example 7.8 we
find that $\mathcal{R}$ is reflexive and transitive. In addition, if $x, y \in A$ and we have both $x \mathrel{\mathcal{R}} y$ and
$y \mathrel{\mathcal{R}} x$, then
$x \mathrel{\mathcal{R}} y \Rightarrow y = ax$, for some $a \in \mathbf{Z}^+$, and
$y \mathrel{\mathcal{R}} x \Rightarrow x = by$, for some $b \in \mathbf{Z}^+$.
Consequently, it follows that $y = ax = a(by) = (ab)y$, and since $y \neq 0$, we have $ab = 1$.
Because $a, b \in \mathbf{Z}^+$, $ab = 1 \Rightarrow a = b = 1$, so $y = x$ and $\mathcal{R}$ is antisymmetric; hence it
defines a partial order for the set $A$.
Now suppose we wish to know how many ordered pairs occur in this relation $\mathcal{R}$. We
may simply list the ordered pairs from $A \times A$ that comprise $\mathcal{R}$:
$\mathcal{R} = \{(1, 1), (1, 2), (1, 3), (1, 4), (1, 6), (1, 12), (2, 2), (2, 4), (2, 6),
(2, 12), (3, 3), (3, 6), (3, 12), (4, 4), (4, 12), (6, 6), (6, 12), (12, 12)\}$
In this way we learn that there are 18 ordered pairs in the relation. But if we then wanted to
consider the same type of partial order for the set of positive integer divisors of 1800, we
should definitely be discouraged by this method of simply listing all the ordered pairs. So
let us examine the relation $\mathcal{R}$ a little closer. By the Fundamental Theorem of Arithmetic we
may write $12 = 2^2 \cdot 3$ and then realize that if $(c, d) \in \mathcal{R}$, then
$c = 2^m \cdot 3^n$ and $d = 2^p \cdot 3^q$,
where $m, n, p, q \in \mathbf{N}$ with $0 \leq m \leq p \leq 2$ and $0 \leq n \leq q \leq 1$.
When we consider the fact that $0 \leq m \leq p \leq 2$, we find that each possibility for $m$, $p$
is simply a selection of size 2 from a set of size 3 (namely, the set $\{0, 1, 2\}$) where
repetitions are allowed. (In any such selection, if there is a smaller nonnegative integer,
then it is assigned to $m$.) In Chapter 1 we learned that such a selection can be made in
$\binom{3+2-1}{2} = \binom{4}{2} = 6$ ways. And, in like manner, $n$ and $q$ can be selected in $\binom{2+2-1}{2} = \binom{3}{2} =
3$ ways. So by the rule of product there should be $(6)(3) = 18$ ordered pairs in $\mathcal{R}$, as we
found earlier by actually listing all of them.
Now suppose we examine a similar situation, the set of positive integer divisors of
$1800 = 2^3 \cdot 3^2 \cdot 5^2$. Here we are dealing with $(3 + 1)(2 + 1)(2 + 1) = (4)(3)(3) = 36$ divisors,
and a typical ordered pair for this partial order (given by division) looks like $(2^r \cdot 3^s \cdot 5^t,
2^u \cdot 3^v \cdot 5^w)$, where $r, s, t, u, v, w \in \mathbf{N}$ with $0 \leq r \leq u \leq 3$, $0 \leq s \leq v \leq 2$, and $0 \leq t \leq
w \leq 2$. So the number of ordered pairs in the relation is
$\binom{4+2-1}{2}\binom{3+2-1}{2}\binom{3+2-1}{2} = \binom{5}{2}\binom{4}{2}\binom{4}{2} = (10)(6)(6) = 360$,
and we definitely should not want to have to list all of the ordered pairs in the relation in
order to obtain this result.
In general, for n ∈ Z⁺ with n > 1, use the Fundamental Theorem of Arithmetic to write
$n = p_1^{e_1} p_2^{e_2} p_3^{e_3} \cdots p_k^{e_k}$, where k ∈ Z⁺, $p_1 < p_2 < p_3 < \cdots < p_k$, and $p_i$ is prime and
$e_i$ ∈ Z⁺ for each 1 ≤ i ≤ k. Then n has $\prod_{i=1}^{k} (e_i + 1)$ positive integer divisors. And when we
consider the same type of partial order for this set (of positive integer divisors of n), we
find that the number of ordered pairs in the relation is
$$\prod_{i=1}^{k} \binom{(e_i + 1) + 2 - 1}{2} = \prod_{i=1}^{k} \binom{e_i + 2}{2}.$$
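The counting argument above can be checked numerically. The following is a minimal Python sketch, not part of the original text; the function names (`prime_exponents`, `pairs_by_formula`, `pairs_by_listing`) are illustrative choices. It compares the product formula with a brute-force listing of the ordered pairs for n = 12 and n = 1800.

```python
from math import comb

def prime_exponents(n):
    """Return the exponents e_1, ..., e_k in the prime factorization of n."""
    exps, p = [], 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            exps.append(e)
        p += 1
    if n > 1:              # one prime factor larger than sqrt(n) may remain
        exps.append(1)
    return exps

def pairs_by_formula(n):
    """Product of C(e_i + 2, 2) over the prime exponents of n."""
    total = 1
    for e in prime_exponents(n):
        total *= comb(e + 2, 2)
    return total

def pairs_by_listing(n):
    """Brute force: count pairs (c, d) of divisors of n with c dividing d."""
    divisors = [d for d in range(1, n + 1) if n % d == 0]
    return sum(1 for c in divisors for d in divisors if d % c == 0)

print(pairs_by_formula(12), pairs_by_listing(12))      # 18 18
print(pairs_by_formula(1800), pairs_by_listing(1800))  # 360 360
```

The brute-force count is only practical for small n; the formula needs nothing beyond the exponents in the factorization.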
In closing this section we introduce the equivalence relation—a concept that is very
important in the study of mathematics.
Definition 7.7 An equivalence relation ℛ on a set A is a relation that is reflexive, symmetric, and transitive.
EXAMPLE 7.16
a) The relation in Example 7.1(b) and all the relations in Example 7.3(c) are equivalence
relations.
b) If A = {1, 2, 3}, then
ℛ₁ = {(1, 1), (2, 2), (3, 3)},
ℛ₂ = {(1, 1), (2, 2), (2, 3), (3, 2), (3, 3)},
ℛ₃ = {(1, 1), (1, 3), (2, 2), (3, 1), (3, 3)}, and
ℛ₄ = {(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (3, 3)} = A × A
are all equivalence relations on A.
c) For a given finite set A, A × A is the largest equivalence relation on A, and if A =
{a₁, a₂, ..., aₙ}, then the equality relation ℛ = {(aᵢ, aᵢ) | 1 ≤ i ≤ n} is the smallest
equivalence relation on A.
d) Let A = {1, 2, 3, 4, 5, 6, 7}, B = {x, y, z}, and f: A → B be the onto function
f = {(1, x), (2, z), (3, x), (4, y), (5, z), (6, y), (7, x)}.
Define the relation ℛ on A by a ℛ b if f(a) = f(b). Then, for instance, we find
here that 1 ℛ 1, 1 ℛ 3, 2 ℛ 5, 3 ℛ 1, and 4 ℛ 6.
For each a ∈ A, f(a) = f(a) because f is a function, so a ℛ a, and ℛ is reflexive.
Now suppose that a, b ∈ A and a ℛ b. Then a ℛ b ⇒ f(a) = f(b) ⇒ f(b) =
f(a) ⇒ b ℛ a, so ℛ is symmetric. Finally, if a, b, c ∈ A with a ℛ b and b ℛ c,
then f(a) = f(b) and f(b) = f(c). Consequently, f(a) = f(c), and we see that
(a ℛ b ∧ b ℛ c) ⇒ a ℛ c. So ℛ is transitive. Since ℛ is reflexive, symmetric, and
transitive, it is an equivalence relation.
Here ℛ = {(1, 1), (1, 3), (1, 7), (2, 2), (2, 5), (3, 1), (3, 3), (3, 7), (4, 4), (4, 6),
(5, 2), (5, 5), (6, 4), (6, 6), (7, 1), (7, 3), (7, 7)}.
e) If ℛ is a relation on a set A, then ℛ is both an equivalence relation and a partial order
on A if and only if ℛ is the equality relation on A.
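Part (d) of the example lends itself to a quick computational check. The sketch below is an illustration only; the helper names `is_reflexive`, `is_symmetric`, and `is_transitive` are assumptions, not notation from the text. It builds ℛ from the function f and tests the three defining properties directly on the set of ordered pairs.

```python
# The onto function f: A -> B of Example 7.16(d), stored as a dictionary.
f = {1: "x", 2: "z", 3: "x", 4: "y", 5: "z", 6: "y", 7: "x"}
A = set(f)

# a R b exactly when f(a) = f(b)
R = {(a, b) for a in A for b in A if f[a] == f[b]}

def is_reflexive(R, A):
    return all((a, a) in R for a in A)

def is_symmetric(R):
    return all((b, a) in R for (a, b) in R)

def is_transitive(R):
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)

print(len(R))                                                # 17
print(is_reflexive(R, A), is_symmetric(R), is_transitive(R)) # True True True
```

The 17 pairs agree with the listing given in part (d): the elements mapped to x contribute 9 pairs, and those mapped to y and to z contribute 4 pairs each.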
EXERCISES 7.1
1. If A = {1, 2, 3, 4}, give an example of a relation ℛ on A that is
a) reflexive and symmetric, but not transitive
b) reflexive and transitive, but not symmetric
c) symmetric and transitive, but not reflexive
2. For relation (b) in Example 7.1, determine five values of x for which (x, 5) ∈ ℛ.
3. For the relation ℛ in Example 7.13, let f: Z⁺ → R where f(n) = n.
a) Find three elements f₁, f₂, f₃ ∈ ℱ such that fᵢ ℛ f and f ℛ fᵢ, for all 1 ≤ i ≤ 3.
b) Find three elements g₁, g₂, g₃ ∈ ℱ such that gᵢ ℛ f but (f, gᵢ) ∉ ℛ, for all 1 ≤ i ≤ 3.
4. a) Rephrase the definitions for the reflexive, symmetric, transitive, and antisymmetric
properties of a relation ℛ (on a set A), using quantifiers.
b) Use the results of part (a) to specify when a relation ℛ (on a set A) is (i) not reflexive;
(ii) not symmetric; (iii) not transitive; and (iv) not antisymmetric.
5. For each of the following relations, determine whether the relation is reflexive, symmetric,
antisymmetric, or transitive.
a) ℛ ⊆ Z⁺ × Z⁺ where a ℛ b if a | b (read "a divides b," as defined in Section 4.3).
b) ℛ is the relation on Z where a ℛ b if a | b.
c) For a given universe U and a fixed subset C of U, define ℛ on P(U) as follows: For
A, B ⊆ U we have A ℛ B if A ∩ C = B ∩ C.
d) On the set A of all lines in R², define the relation ℛ for two lines ℓ₁, ℓ₂ by ℓ₁ ℛ ℓ₂ if
ℓ₁ is perpendicular to ℓ₂.
e) ℛ is the relation on Z where x ℛ y if x + y is odd.
f) ℛ is the relation on Z where x ℛ y if x − y is even.
g) Let T be the set of all triangles in R². Define ℛ on T by t₁ ℛ t₂ if t₁ and t₂ have an
angle of the same measure.
h) ℛ is the relation on Z × Z where (a, b) ℛ (c, d) if a ≤ c. [Note: ℛ ⊆ (Z × Z) × (Z × Z).]
6. Which relations in Exercise 5 are partial orders? Which are equivalence relations?
7. Let ℛ₁, ℛ₂ be relations on a set A. (a) Prove or disprove that ℛ₁, ℛ₂ reflexive ⇒ ℛ₁ ∩ ℛ₂
reflexive. (b) Answer part (a) when each occurrence of "reflexive" is replaced by
(i) symmetric; (ii) antisymmetric; and (iii) transitive.
8. Answer Exercise 7, replacing each occurrence of ∩ by ∪.
9. For each of the following statements about relations on a set A, where |A| = n, determine
whether the statement is true or false. If it is false, give a counterexample.
a) If ℛ is a relation on A and |ℛ| ≥ n, then ℛ is reflexive.
b) If ℛ₁, ℛ₂ are relations on A and ℛ₂ ⊇ ℛ₁, then ℛ₁ reflexive (symmetric, antisymmetric,
transitive) ⇒ ℛ₂ reflexive (symmetric, antisymmetric, transitive).
c) If ℛ₁, ℛ₂ are relations on A and ℛ₂ ⊇ ℛ₁, then ℛ₂ reflexive (symmetric, antisymmetric,
transitive) ⇒ ℛ₁ reflexive (symmetric, antisymmetric, transitive).
d) If ℛ is an equivalence relation on A, then n ≤ |ℛ| ≤ n².
10. If A = {w, x, y, z}, determine the number of relations on A that are (a) reflexive;
(b) symmetric; (c) reflexive and symmetric; (d) reflexive and contain (x, y); (e) symmetric
and contain (x, y); (f) antisymmetric; (g) antisymmetric and contain (x, y); (h) symmetric
and antisymmetric; and (i) reflexive, symmetric, and antisymmetric.
11. Let n ∈ Z⁺ with n > 1, and let A be the set of positive integer divisors of n. Define the
relation ℛ on A by x ℛ y if x (exactly) divides y. Determine how many ordered pairs are in
the relation ℛ when n is (a) 10; (b) 20; (c) 40; (d) 200; (e) 210; and (f) 13860.
12. Suppose that p₁, p₂, p₃ are distinct primes and that n, k ∈ Z⁺ with $n = p_1^5 p_2^3 p_3^k$. Let A
be the set of positive integer divisors of n and define the relation ℛ on A by x ℛ y if
x (exactly) divides y. If there are 5880 ordered pairs in ℛ, determine k and |A|.
13. What is wrong with the following argument?
Let A be a set with ℛ a relation on A. If ℛ is symmetric and transitive, then ℛ is reflexive.
Proof: Let (x, y) ∈ ℛ. By the symmetric property, (y, x) ∈ ℛ. Then with (x, y), (y, x) ∈ ℛ,
it follows by the transitive property that (x, x) ∈ ℛ. Consequently, ℛ is reflexive.
14. Let A be a set with |A| = n, and let ℛ be a relation on A that is antisymmetric. What is
the maximum value for |ℛ|? How many antisymmetric relations can have this size?
15. Let A be a set with |A| = n, and let ℛ be an equivalence relation on A with |ℛ| = r.
Why is r − n always even?
16. A relation ℛ on a set A is called irreflexive if for all a ∈ A, (a, a) ∉ ℛ.
a) Give an example of a relation ℛ on Z where ℛ is irreflexive and transitive but not
symmetric.
b) Let ℛ be a nonempty relation on a set A. Prove that if ℛ satisfies any two of the
following properties (irreflexive, symmetric, and transitive), then it cannot satisfy the third.
c) If |A| = n ≥ 1, how many different relations on A are irreflexive? How many are neither
reflexive nor irreflexive?
17. Let A = {1, 2, 3, 4, 5, 6, 7}. How many symmetric relations on A contain exactly
(a) four ordered pairs? (b) five ordered pairs? (c) seven ordered pairs? (d) eight ordered pairs?
18. a) Let f: A → B, where |A| = 25, B = {x, y, z}, and |f⁻¹(x)| = 10, |f⁻¹(y)| = 10,
|f⁻¹(z)| = 5. If we define the relation ℛ on A by a ℛ b if a, b ∈ A and f(a) = f(b),
how many ordered pairs are there in the relation ℛ?
b) For n, n₁, n₂, n₃, n₄ ∈ Z⁺, let f: A → B, where |A| = n, B = {w, x, y, z},
|f⁻¹(w)| = n₁, |f⁻¹(x)| = n₂, |f⁻¹(y)| = n₃, |f⁻¹(z)| = n₄, and n₁ + n₂ + n₃ + n₄ = n.
If we define the relation ℛ on A by a ℛ b if a, b ∈ A and f(a) = f(b), how many ordered
pairs are there in the relation ℛ?
7.2 Computer Recognition: Zero-One Matrices and Directed Graphs
Since our interest in relations is focused on those for finite sets, we are concerned with ways
of representing such relations so that the properties of Section 7.1 can be easily verified. For
this reason we now develop the necessary tools: relation composition, zero-one matrices,
and directed graphs.
In a manner analogous to the composition of functions, relations can be combined in the
following circumstances.
Definition 7.8 If A, B, and C are sets with ℛ₁ ⊆ A × B and ℛ₂ ⊆ B × C, then the composite relation
ℛ₁ ∘ ℛ₂ is a relation from A to C defined by ℛ₁ ∘ ℛ₂ = {(x, z) | x ∈ A, z ∈ C, and there
exists y ∈ B with (x, y) ∈ ℛ₁, (y, z) ∈ ℛ₂}.
Beware! The composition of two relations is written in an order opposite to that for
function composition. We shall see why in Example 7.21.
EXAMPLE 7.17
Let A = {1, 2, 3, 4}, B = {w, x, y, z}, and C = {5, 6, 7}. Consider ℛ₁ = {(1, x), (2, x),
(3, y), (3, z)}, a relation from A to B, and ℛ₂ = {(w, 5), (x, 6)}, a relation from B to
C. Then ℛ₁ ∘ ℛ₂ = {(1, 6), (2, 6)} is a relation from A to C. If ℛ₃ = {(w, 5), (w, 6)} is
another relation from B to C, then ℛ₁ ∘ ℛ₃ = ∅.
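As an illustration of Definition 7.8, here is a small Python sketch (the function name `compose` is an assumption made for this example) that forms the composite of two relations stored as sets of ordered pairs and reproduces the two composites just computed.

```python
def compose(R1, R2):
    """R1 o R2 = {(x, z) : (x, y) in R1 and (y, z) in R2 for some y}."""
    return {(x, z) for (x, y) in R1 for (w, z) in R2 if y == w}

R1 = {(1, "x"), (2, "x"), (3, "y"), (3, "z")}   # relation from A to B
R2 = {("w", 5), ("x", 6)}                       # relation from B to C
R3 = {("w", 5), ("w", 6)}                       # another relation from B to C

print(sorted(compose(R1, R2)))  # [(1, 6), (2, 6)]
print(compose(R1, R3))          # set()
```

Note that the arguments appear in the same left-to-right order as in ℛ₁ ∘ ℛ₂, which is opposite to the usual convention for function composition.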
EXAMPLE 7.18
Let A be the set of employees at a computing center, while B denotes a set of high-level
programming languages, and C is a set of projects {p₁, p₂, ..., p₈} for which managers
must make work assignments using the people in A. Consider ℛ₁ ⊆ A × B, where an
ordered pair of the form (L. Alldredge, Java) indicates that employee L. Alldredge is proficient
in Java (and perhaps other programming languages). The relation ℛ₂ ⊆ B × C consists of
ordered pairs such as (Java, p₂), indicating that Java is considered an essential language
needed by anyone who works on project p₂. In the composite relation ℛ₁ ∘ ℛ₂ we find
(L. Alldredge, p₂). If no other ordered pair in ℛ₂ has p₂ as its second component, we know
that if L. Alldredge was assigned to p₂ it was solely on the basis of his proficiency in Java.
(Here ℛ₁ ∘ ℛ₂ has been used to set up a matching process between employees and projects
on the basis of employee knowledge of specific programming languages.)
Comparable to the associative law for function composition, the following result holds
for relations.
THEOREM 7.1 Let A, B, C, and D be sets with ℛ₁ ⊆ A × B, ℛ₂ ⊆ B × C, and ℛ₃ ⊆ C × D. Then
ℛ₁ ∘ (ℛ₂ ∘ ℛ₃) = (ℛ₁ ∘ ℛ₂) ∘ ℛ₃.
Proof: Since both ℛ₁ ∘ (ℛ₂ ∘ ℛ₃) and (ℛ₁ ∘ ℛ₂) ∘ ℛ₃ are relations from A to D, there
is some reason to believe they are equal. If (a, d) ∈ ℛ₁ ∘ (ℛ₂ ∘ ℛ₃), then there is an
element b ∈ B with (a, b) ∈ ℛ₁ and (b, d) ∈ (ℛ₂ ∘ ℛ₃). Also, (b, d) ∈ (ℛ₂ ∘ ℛ₃) ⇒
(b, c) ∈ ℛ₂ and (c, d) ∈ ℛ₃ for some c ∈ C. Then (a, b) ∈ ℛ₁ and (b, c) ∈ ℛ₂ ⇒
(a, c) ∈ ℛ₁ ∘ ℛ₂. Finally, (a, c) ∈ ℛ₁ ∘ ℛ₂ and (c, d) ∈ ℛ₃ ⇒ (a, d) ∈ (ℛ₁ ∘ ℛ₂) ∘ ℛ₃,
and ℛ₁ ∘ (ℛ₂ ∘ ℛ₃) ⊆ (ℛ₁ ∘ ℛ₂) ∘ ℛ₃. The opposite inclusion follows by similar
reasoning.
As a result of this theorem no ambiguity arises when we write ℛ₁ ∘ ℛ₂ ∘ ℛ₃ for either
of the relations in Theorem 7.1. In addition, we can now define the powers of a relation ℛ
on a set.
Definition 7.9 Given a set A and a relation ℛ on A, we define the powers of ℛ recursively by (a) ℛ¹ = ℛ;
and (b) for n ∈ Z⁺, $\mathcal{R}^{n+1} = \mathcal{R} \circ \mathcal{R}^{n}$.
Note that for n ∈ Z⁺, $\mathcal{R}^{n}$ is a relation on A.
EXAMPLE 7.19
If A = {1, 2, 3, 4} and ℛ = {(1, 2), (1, 3), (2, 4), (3, 2)}, then ℛ² = {(1, 4), (1, 2), (3, 4)},
ℛ³ = {(1, 4)}, and for n ≥ 4, $\mathcal{R}^{n}$ = ∅.
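The recursive definition of powers translates directly into a loop. The following Python sketch is offered only as an illustration (the names `compose` and `power` are not from the text); it recomputes the powers listed in Example 7.19.

```python
def compose(R1, R2):
    """R1 o R2 = {(x, z) : (x, y) in R1 and (y, z) in R2 for some y}."""
    return {(x, z) for (x, y) in R1 for (w, z) in R2 if y == w}

def power(R, n):
    """R^1 = R and R^(n+1) = R o R^n, following Definition 7.9 (n >= 1)."""
    result = set(R)
    for _ in range(n - 1):
        result = compose(R, result)
    return result

R = {(1, 2), (1, 3), (2, 4), (3, 2)}
for n in range(1, 5):
    print(n, sorted(power(R, n)))
# 1 [(1, 2), (1, 3), (2, 4), (3, 2)]
# 2 [(1, 2), (1, 4), (3, 4)]
# 3 [(1, 4)]
# 4 []
```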
As the set A and the relation ℛ on A grow larger, calculations such as those in Example
7.19 become tedious. To avoid this tedium, the tool we need is the computer, once a way
can be found to tell the machine about the set A and the relation ℛ on A.
Definition 7.10 An m × n zero-one matrix $E = (e_{ij})_{m \times n}$ is a rectangular array of numbers arranged in m
rows and n columns, where each $e_{ij}$, for 1 ≤ i ≤ m and 1 ≤ j ≤ n, denotes the entry in the
ith row and jth column of E, and each such entry is 0 or 1. [We can also write (0, 1)-matrix
for this type of matrix.]
EXAMPLE 7.20
The matrix
$$E = \begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{bmatrix}$$
is a 3 × 4 (0, 1)-matrix where, for example, $e_{11} = 1$, $e_{23} = 0$, and $e_{31} = 1$.
In working with these matrices, we use the standard operations of matrix addition and
multiplication with the stipulation that 1 + 1 = 1. (Hence the addition is called Boolean.)
EXAMPLE 7.21
Consider the sets A, B, and C and the relations ℛ₁, ℛ₂ of Example 7.17. With the orders
of the elements in A, B, and C fixed as in that example, we define the relation matrices for
ℛ₁, ℛ₂ as follows:
$$M(\mathcal{R}_1) = \begin{array}{c|cccc} & (w) & (x) & (y) & (z) \\ \hline (1) & 0 & 1 & 0 & 0 \\ (2) & 0 & 1 & 0 & 0 \\ (3) & 0 & 0 & 1 & 1 \\ (4) & 0 & 0 & 0 & 0 \end{array}, \qquad M(\mathcal{R}_2) = \begin{array}{c|ccc} & (5) & (6) & (7) \\ \hline (w) & 1 & 0 & 0 \\ (x) & 0 & 1 & 0 \\ (y) & 0 & 0 & 0 \\ (z) & 0 & 0 & 0 \end{array}$$
In constructing M(ℛ₁), we are dealing with a relation from A to B, so the elements of A
are used to mark the rows of M(ℛ₁) and the elements of B designate the columns. Then to
denote, for example, that (2, x) ∈ ℛ₁, we place a 1 in the row marked (2) and the column
marked (x). Each 0 in this matrix indicates an ordered pair in A × B that is missing from
ℛ₁. For example, since (3, w) ∉ ℛ₁, there is a 0 for the entry in row (3) and column (w)
of the matrix M(ℛ₁). The same process is used to obtain M(ℛ₂).
Multiplying these matrices,† we find that
$$M(\mathcal{R}_1) \cdot M(\mathcal{R}_2) = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} = \begin{array}{c|ccc} & (5) & (6) & (7) \\ \hline (1) & 0 & 1 & 0 \\ (2) & 0 & 1 & 0 \\ (3) & 0 & 0 & 0 \\ (4) & 0 & 0 & 0 \end{array} = M(\mathcal{R}_1 \circ \mathcal{R}_2),$$
where the rows of the 4 × 3 matrix M(ℛ₁ ∘ ℛ₂) are marked by the elements of A while its
columns are marked by the elements of C. In general we have: If ℛ₁ is a relation from A
to B and ℛ₂ is a relation from B to C, then M(ℛ₁) · M(ℛ₂) = M(ℛ₁ ∘ ℛ₂). That is, the
product of the relation matrices for ℛ₁, ℛ₂, in that order, equals the relation matrix of the
composite relation ℛ₁ ∘ ℛ₂. (This is why the composition of two relations was written in
the order specified in Definition 7.8.)
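The identity M(ℛ₁) · M(ℛ₂) = M(ℛ₁ ∘ ℛ₂) can be tried out on the data of this example. The Python sketch below is illustrative only; `relation_matrix` and `bool_product` are hypothetical helper names, and matrices are stored as lists of rows.

```python
def relation_matrix(R, rows, cols):
    """(0, 1)-matrix of R, with rows and columns indexed in the given orders."""
    return [[int((r, c) in R) for c in cols] for r in rows]

def bool_product(M, N):
    """Matrix product using Boolean addition, so that 1 + 1 = 1."""
    return [[int(any(M[i][k] and N[k][j] for k in range(len(N))))
             for j in range(len(N[0]))]
            for i in range(len(M))]

A, B, C = [1, 2, 3, 4], ["w", "x", "y", "z"], [5, 6, 7]
R1 = {(1, "x"), (2, "x"), (3, "y"), (3, "z")}
R2 = {("w", 5), ("x", 6)}

M1 = relation_matrix(R1, A, B)
M2 = relation_matrix(R2, B, C)

# Composite relation computed directly from Definition 7.8.
composite = {(a, c) for (a, b) in R1 for (b2, c) in R2 if b == b2}

print(bool_product(M1, M2))
# [[0, 1, 0], [0, 1, 0], [0, 0, 0], [0, 0, 0]]
print(bool_product(M1, M2) == relation_matrix(composite, A, C))  # True
```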
The reader will be asked to prove the general result of Example 7.21, along with some
results from our next example, in Exercises 11 and 12 at the end of this section.
Further properties of relation matrices are exhibited in the following example.
†The reader who is not familiar with matrix multiplication or simply wishes a brief review should consult
Appendix 2.
EXAMPLE 7.22
Let A = {1, 2, 3, 4} and ℛ = {(1, 2), (1, 3), (2, 4), (3, 2)}, as in Example 7.19. Keeping
the order of the elements in A fixed, we define the relation matrix for ℛ as follows: M(ℛ)
is the 4 × 4 (0, 1)-matrix whose entries $m_{ij}$, for 1 ≤ i, j ≤ 4, are given by
$$m_{ij} = \begin{cases} 1, & \text{if } (i, j) \in \mathcal{R}, \\ 0, & \text{otherwise.} \end{cases}$$
In this case we find that
$$M(\mathcal{R}) = \begin{bmatrix} 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
Now how can this be of any use? If we compute (M(ℛ))² using the convention that
1 + 1 = 1, then we find that
$$(M(\mathcal{R}))^2 = \begin{bmatrix} 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix},$$
which happens to be the relation matrix for ℛ ∘ ℛ = ℛ². (Check Example 7.19.) Furthermore,
$$(M(\mathcal{R}))^4 = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix},$$
which is also the relation matrix for the relation ℛ⁴; that is, (M(ℛ))⁴ = M(ℛ⁴). Also,
recall that ℛ⁴ = ∅, as we learned in Example 7.19.
What has happened here carries over to the general situation. We now state some results
about relation matrices and their use in studying relations.
Let A be a set with |A| = n and ℛ a relation on A. If M(ℛ) is the relation matrix for
ℛ, then:
a) M(ℛ) = 0 (the matrix of all 0's) if and only if ℛ = ∅
b) M(ℛ) = 1 (the matrix of all 1's) if and only if ℛ = A × A
c) $M(\mathcal{R}^{m}) = [M(\mathcal{R})]^{m}$, for m ∈ Z⁺
Using the (0, 1)-matrix for a relation, we now turn to the recognition of the reflex-
ive, symmetric, antisymmetric, and transitive properties. To accomplish this we need the
concepts introduced in the following three definitions.
Definition 7.11 Let $E = (e_{ij})_{m \times n}$, $F = (f_{ij})_{m \times n}$ be two m × n (0, 1)-matrices. We say that E precedes, or
is less than, F, and we write E ≤ F if $e_{ij} \le f_{ij}$, for all 1 ≤ i ≤ m, 1 ≤ j ≤ n.
EXAMPLE 7.23
With $E = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 0 & 1 \end{bmatrix}$ and $F = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix}$, we have E ≤ F. In fact, there are eight
(0, 1)-matrices G for which E ≤ G.
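Definition 7.12 For n ∈ Z⁺, $I_n = (\delta_{ij})_{n \times n}$ is the n × n (0, 1)-matrix where
$$\delta_{ij} = \begin{cases} 1, & \text{if } i = j, \\ 0, & \text{if } i \neq j. \end{cases}$$
Definition 7.13 Let $A = (a_{ij})_{m \times n}$ be a (0, 1)-matrix. The transpose of A, written $A^{tr}$, is the matrix $(a^{*}_{ji})_{n \times m}$,
where $a^{*}_{ji} = a_{ij}$, for all 1 ≤ j ≤ n, 1 ≤ i ≤ m.
EXAMPLE 7.24
For $A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \\ 1 & 1 \end{bmatrix}$ we find that $A^{tr} = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 1 \end{bmatrix}$.
As this example demonstrates, the ith row (column) of A equals the ith column (row)
of $A^{tr}$. This indicates a method we can use in order to obtain the matrix $A^{tr}$ from the
matrix A.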
THEOREM 7.2 Given a set A with |A| = n and a relation ℛ on A, let M denote the relation matrix for ℛ.
Then
a) ℛ is reflexive if and only if $I_n$ ≤ M.
b) ℛ is symmetric if and only if M = $M^{tr}$.
c) ℛ is transitive if and only if M · M = M² ≤ M.
d) ℛ is antisymmetric if and only if M ∩ $M^{tr}$ ≤ $I_n$. (The matrix M ∩ $M^{tr}$ is formed
by operating on corresponding entries in M and $M^{tr}$ according to the rules 0 ∩ 0 =
0 ∩ 1 = 1 ∩ 0 = 0 and 1 ∩ 1 = 1; that is, the usual multiplication for 0's and/or 1's.)
Proof: The results follow from the definitions of the relation properties and the (0, 1)-matrix.
We demonstrate this for part (c), using the elements of A to designate the rows and columns
in M, as in Examples 7.21 and 7.22.
Let M² ≤ M. If (x, y), (y, z) ∈ ℛ, then there are 1's in row (x), column (y) and in
row (y), column (z) of M. Consequently, in row (x), column (z) of M² there is a 1. This 1
must also occur in row (x), column (z) of M because M² ≤ M. Hence (x, z) ∈ ℛ and ℛ is
transitive.
Conversely, if ℛ is transitive and M is the relation matrix for ℛ, let $s_{xz}$ be the entry in
row (x) and column (z) of M², with $s_{xz} = 1$. For $s_{xz}$ to equal 1 in M², there must exist at
least one y ∈ A where $m_{xy} = m_{yz} = 1$ in M. This happens only if x ℛ y and y ℛ z. With
ℛ transitive, it then follows that x ℛ z. So $m_{xz} = 1$ and M² ≤ M.
The proofs of the remaining parts are left to the reader.
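The four tests of Theorem 7.2 are mechanical enough to code directly. What follows is a minimal Python sketch under the assumption that a relation matrix is stored as a list of rows of 0's and 1's; all function names here (`bool_product`, `transpose`, `precedes`, `identity`, `meet`, `properties`) are illustrative and not taken from the text.

```python
def bool_product(M, N):
    """Matrix product with Boolean addition (1 + 1 = 1)."""
    return [[int(any(M[i][k] and N[k][j] for k in range(len(N))))
             for j in range(len(N[0]))]
            for i in range(len(M))]

def transpose(M):
    return [list(col) for col in zip(*M)]

def precedes(E, F):
    """E <= F entry by entry (Definition 7.11)."""
    return all(e <= f for row_e, row_f in zip(E, F)
               for e, f in zip(row_e, row_f))

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def meet(E, F):
    """Entrywise product of two (0, 1)-matrices, as in Theorem 7.2(d)."""
    return [[e & f for e, f in zip(row_e, row_f)]
            for row_e, row_f in zip(E, F)]

def properties(M):
    n = len(M)
    return {
        "reflexive": precedes(identity(n), M),
        "symmetric": M == transpose(M),
        "transitive": precedes(bool_product(M, M), M),
        "antisymmetric": precedes(meet(M, transpose(M)), identity(n)),
    }

# Relation matrix of Example 7.22.
M = [[0, 1, 1, 0], [0, 0, 0, 1], [0, 1, 0, 0], [0, 0, 0, 0]]
print(properties(M))
# {'reflexive': False, 'symmetric': False, 'transitive': False, 'antisymmetric': True}
```

For the relation of Example 7.22 transitivity fails because M² has a 1 in row (1), column (4), while M does not.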
The relation matrix is a useful tool for the computer recognition of certain properties
of relations. Storing information as described here, this matrix is an example of a data
structure. Also of interest is how the relation matrix is used in the study of graph theory*
and how graph theory is used in the recognition of certain properties of relations.
At this point we shall introduce some fundamental concepts in graph theory. Often these
concepts will be given within examples and not in terms of formal definitions. In Chapter 11,
however, the presentation will not assume what is given here and will be more rigorous and
comprehensive.
Definition 7.14 Let V be a finite nonempty set. A directed graph (or digraph) G on V is made up of the
elements of V, called the vertices or nodes of G, and a subset E of V × V that contains
the (directed) edges, or arcs, of G. The set V is called the vertex set of G, and the set E is
called the edge set. We then write G = (V, E) to denote the graph.
If a, b ∈ V and (a, b) ∈ E,† then there is an edge from a to b. Vertex a is called the
origin or source of the edge, with b the terminus, or terminating vertex, and we say that b
is adjacent from a and that a is adjacent to b. In addition, if a ≠ b, then (a, b) ≠ (b, a). An
edge of the form (a, a) is called a loop (at a).
EXAMPLE 7.25
For V = {1, 2, 3, 4, 5}, the diagram in Fig. 7.1 is a directed graph G on V with edge set
{(1, 1), (1, 2), (1, 4), (3, 2)}. Vertex 5 is a part of this graph even though it is not the origin
or terminus of an edge. It is referred to as an isolated vertex. As we see here, edges need
not be straight line segments, and there is no concern about the length of an edge.
Figure 7.1
Figure 7.2, parts (a) and (b)
When we develop a flowchart to study a computer program or algorithm, we deal with
a special type of directed graph where the shapes of the vertices may be important in the
analysis of the algorithm. Road maps are directed graphs, where the cities and towns are
represented by vertices and the highways linking any two localities are given by edges. In
road maps, an edge is often directed in both directions. Consequently, if G is a directed
graph and a, b ∈ V, with a ≠ b, and both (a, b), (b, a) ∈ E, then the single undirected edge
{a, b} = {b, a} in Fig. 7.2(b) is used to represent the two directed edges shown in Fig. 7.2(a).
In this case, a and b are called adjacent vertices. (Directions may also be disregarded for
loops.)
* Since the terminology of graph theory is not standardized, the reader may find some differences between
definitions given here and in other texts.
†In this chapter we allow only one edge from a to b. Situations where multiple edges occur are called
multigraphs. These are discussed in Chapter 11.
Directed graphs play an important role in many situations in computer science. The
following example demonstrates one of these.
EXAMPLE 7.26
Computer programs can be processed more rapidly when certain statements in the program
are executed concurrently. But in order to accomplish this we must be aware of the
dependence of some statements on earlier statements in the program. For we cannot execute
a statement that needs results from other statements, namely statements that have not yet been
executed.
In Fig. 7.3(a) we have eight assignment statements that constitute the beginning of
a computer program. We represent these statements by the eight corresponding vertices
s₁, s₂, s₃, ..., s₈ in part (b) of the figure, where a directed edge such as (s₁, s₅) indicates
that statement s₅ cannot be executed until statement s₁ has been executed. The resulting
directed graph is called the precedence graph for the given lines of the computer program.
Note how this graph indicates, for example, that statement s₇ cannot be executed until after
each of the statements s₁, s₂, s₃, and s₄ has been executed. Also, we see how a statement such
as s₁ must be executed before it is possible to execute any of the statements s₂, s₄, s₅, s₇, or
s₈. In general, if a vertex (statement) s is adjacent from m other vertices (and no others), then
the corresponding statements for these vertices must be executed before statement s can
be executed. Similarly, should a vertex (statement) s be adjacent to n other vertices, then
each of the corresponding statements for these vertices requires the execution of statement
s before it can be executed. Finally, from the precedence graph we see that the statements
s₁, s₃, and s₆ can be processed concurrently. Following this, the statements s₂, s₄, and s₈
can be executed at the same time, and then the statements s₅ and s₇. (Or we could process
statements s₂ and s₄ concurrently, and then the statements s₅, s₇, and s₈.)
(s₁) b := 3
(s₂) c := b + 2
(s₃) a := 1
(s₄) d := a * b + 5
(s₅) e := d − 1
(s₆) f := 7
(s₇) e := c + d
(s₈) g := b * f
(a) (b)
Figure 7.3
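The grouping of statements described in Example 7.26 can be derived mechanically from the eight assignments. The Python sketch below is one possible way to do this and rests on a stated assumption: a statement depends only on the most recent earlier statement that assigns each variable it reads (the fact that s₅ and s₇ both assign e is ignored, as it is in the example). The encoding of the statements and the function names are hypothetical.

```python
# Each statement: (variable assigned, variables read), in program order.
statements = {
    "s1": ("b", []),
    "s2": ("c", ["b"]),
    "s3": ("a", []),
    "s4": ("d", ["a", "b"]),
    "s5": ("e", ["d"]),
    "s6": ("f", []),
    "s7": ("e", ["c", "d"]),
    "s8": ("g", ["b", "f"]),
}

def direct_dependencies(statements):
    """Map each statement to the statements whose results it reads directly."""
    last_writer, deps = {}, {}
    for name, (target, reads) in statements.items():
        deps[name] = {last_writer[v] for v in reads if v in last_writer}
        last_writer[target] = name
    return deps

def prerequisites(deps, name):
    """All statements that must run before `name` (direct or indirect)."""
    seen, stack = set(), list(deps[name])
    while stack:
        s = stack.pop()
        if s not in seen:
            seen.add(s)
            stack.extend(deps[s])
    return seen

def concurrency_levels(deps):
    """Group statements so each group needs only statements in earlier groups."""
    level = {}
    for name in deps:  # dictionary order is program order
        level[name] = 1 + max((level[d] for d in deps[name]), default=-1)
    groups = {}
    for name, lv in level.items():
        groups.setdefault(lv, []).append(name)
    return [groups[lv] for lv in sorted(groups)]

deps = direct_dependencies(statements)
print(sorted(prerequisites(deps, "s7")))  # ['s1', 's2', 's3', 's4']
print(concurrency_levels(deps))
# [['s1', 's3', 's6'], ['s2', 's4', 's8'], ['s5', 's7']]
```

The three groups match the first schedule mentioned in the example; the alternative schedule, which delays s₈ to the last round, is equally valid because s₈ has no dependents.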
Now we want to consider how relations and directed graphs are interrelated. For a start,
given a set A and a relation ℛ on A, we can construct a directed graph G with vertex set A
and edge set E ⊆ A × A, where (a, b) ∈ E if a, b ∈ A and a ℛ b. This is demonstrated in
the following example.
EXAMPLE 7.27
For A = {1, 2, 3, 4}, let ℛ = {(1, 1), (1, 2), (2, 3), (3, 2), (3, 3), (3, 4), (4, 2)} be a
relation on A. The directed graph associated with ℛ is shown in Fig. 7.4(a), where the undirected
edge {2, 3} (= {3, 2}) is used in place of the pair of distinct directed edges (2, 3) and (3, 2).
If the directions in Fig. 7.4(a) are ignored, we get the associated undirected graph shown in
part (b) of the figure. Here we see that the graph is connected in the sense that for any two
vertices x, y, with x ≠ y, there is a path starting at x and ending at y. Such a path consists
of a finite sequence of undirected edges, so the edges {1, 2}, {2, 4} provide a path from 1 to
4, and the edges {3, 4}, {4, 2}, and {2, 1} provide a path from 3 to 1. The sequence of edges
{3, 4}, {4, 2}, and {2, 3} provides a path from 3 to 3. Such a closed path is called a cycle.
This is an example of an undirected cycle of length 3, because it has three edges in it.
(a) (b) (c) (d)
Figure 7.4
When we are dealing with paths (in both directed and undirected graphs), no vertex
may be repeated. Therefore, the sequence of edges {a, b}, {b, e}, {e, f}, {f, b}, {b, d} in
Fig. 7.4(c) is not considered to be a path (from a to d) because we pass through the vertex b
more than once. In the case of cycles, the path starts and terminates at the same vertex and has
at least three edges. In Fig. 7.4(d) the sequence of edges (b, f), (f, e), (e, d), (d, c), (c, b)
provides a directed cycle of length 5. The six edges (b, f), (f, e), (e, b), (b, d), (d, c),
(c, b) do not yield a directed cycle in the figure because of the repetition of vertex b. If their
directions are ignored, the corresponding six edges, in part (c) of the figure, likewise pass
through vertex b more than once. Consequently, these edges are not considered to form a
cycle for the undirected graph in Fig. 7.4(c).
Now since we require a cycle to have length at least 3, we shall not consider loops to be
cycles. We also note that loops have no bearing on graph connectivity.
We choose to define the next idea formally because of its relevance to what we did earlier
in Section 6.3.
Definition 7.15 A directed graph G on V is called strongly connected if for all x, y ∈ V, where x ≠ y,
there is a path (in G) of directed edges from x to y; that is, either the directed edge (x, y)
is in G or, for some n ∈ Z⁺ and distinct vertices v₁, v₂, ..., vₙ ∈ V, the directed edges
(x, v₁), (v₁, v₂), ..., (vₙ, y) are in G.
It is in this sense that we talked about strongly connected machines in Chapter 6. The
graph in Fig. 7.4(a) is connected but not strongly connected. For example, there is no
directed path from 3 to 1. In Fig. 7.5 the directed graph on V = {1, 2, 3, 4} is strongly
connected and loop-free. This is also true of the directed graph in Fig. 7.4(d).
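Strong connectivity can be tested by checking, for each vertex, which vertices are reachable along directed edges. The sketch below is an illustration in Python with hypothetical function names. It uses the edge set of Example 7.27 (the graph of Fig. 7.4(a)) and, for contrast, a directed 4-cycle that is an assumed example rather than the graph of Fig. 7.5.

```python
def reachable(edges, start):
    """Vertices that can be reached from `start` by following directed edges."""
    seen, stack = set(), [start]
    while stack:
        v = stack.pop()
        for (a, b) in edges:
            if a == v and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen

def is_strongly_connected(vertices, edges):
    """True when every vertex reaches every other vertex (Definition 7.15)."""
    return all(y in reachable(edges, x)
               for x in vertices for y in vertices if x != y)

V = {1, 2, 3, 4}
E = {(1, 1), (1, 2), (2, 3), (3, 2), (3, 3), (3, 4), (4, 2)}  # Example 7.27
print(1 in reachable(E, 3))            # False: no directed path from 3 to 1
print(is_strongly_connected(V, E))     # False

cycle = {(1, 2), (2, 3), (3, 4), (4, 1)}   # assumed example: a directed 4-cycle
print(is_strongly_connected(V, cycle)) # True
```

Reachability by any sequence of directed edges is enough here, since such a sequence can always be shortened to a path with no repeated vertex.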
Figure 7.5
Figure 7.6, graphs of (ℛ₁) and (ℛ₂)
EXAMPLE 7.28
For A = {1, 2, 3, 4}, consider the relations ℛ₁ = {(1, 1), (1, 2), (2, 1), (2, 2), (3, 3),
(3, 4), (4, 3), (4, 4)} and ℛ₂ = {(2, 4), (2, 3), (3, 2), (3, 3), (3, 4)}. As Fig. 7.6 illustrates,
the graphs of these relations are disconnected. However, each graph is the union of two
connected pieces called the components of the graph. For ℛ₁, the graph is made up of two
strongly connected components. For ℛ₂, one component consists of an isolated vertex, and
the other component is connected but not strongly connected.
EXAMPLE 7.29
The graphs in Fig. 7.7 are examples of undirected graphs that are loop-free and have an
edge for every pair of distinct vertices. These graphs illustrate the complete graphs on n
vertices which are denoted by Kₙ. In Fig. 7.7 we have examples of the complete graphs on
three, four, and five vertices, respectively. The complete graph K₂ consists of two vertices
x, y and an edge connecting them, whereas the complete graph K₁ consists of one vertex
and no edges because loops are not allowed.
(K₃) (K₄) (K₅)
Figure 7.7
In this drawing of K₅ two edges cross, namely, {3, 5} and {1, 4}. However, there is
no point of intersection creating a new vertex. If we try to avoid the crossing of edges by
drawing the graph differently, we run into the same problem all over again. This difficulty
will be examined in Chapter 11 when we deal with the planarity of graphs.
A digraph G on a vertex set V gives rise to a relation ℛ on V where x ℛ y if (x, y) is an
edge in G. Consequently, there is a (0, 1)-matrix for G, and since this relation matrix comes
about from the adjacencies of pairs of vertices, it is referred to as the adjacency matrix for
G as well as the relation matrix for ℛ.
At this point we tie together the properties of relations and the structure of directed
graphs.
EXAMPLE 7.30
If A = {1, 2, 3} and ℛ = {(1, 1), (1, 2), (2, 2), (3, 3), (3, 1)}, then ℛ is a reflexive
antisymmetric relation on A, but it is neither symmetric nor transitive. The directed graph
associated with ℛ consists of five edges. Three of these edges are loops that result from the
reflexive property of ℛ. (See Fig. 7.8.) In general, if ℛ is a relation on a finite set A, then
ℛ is reflexive if and only if its directed graph contains a loop at each vertex (element of A).
EXAMPLE 7.31
The relation ℛ = {(1, 1), (1, 2), (2, 1), (2, 3), (3, 2)} is symmetric on A = {1, 2, 3}, but
it is not reflexive, antisymmetric, or transitive. The directed graph for ℛ is found in
Fig. 7.9. In general, a relation ℛ on a finite set A is symmetric if and only if its directed
graph may be drawn so that it contains only loops and undirected edges.
EXAMPLE 7.32
For A = {1, 2, 3}, consider ℛ = {(1, 1), (1, 2), (2, 3), (1, 3)}. The directed graph for ℛ is
shown in Fig. 7.10. Here ℛ is transitive and antisymmetric but not reflexive or symmetric.
The directed graph indicates that a relation on a set A is transitive if and only if it satisfies
the following: For all x, y ∈ A, if there is a (directed) path from x to y in the associated
graph, then there is an edge (x, y) also. [Here (1, 2), (2, 3) is a (directed) path from 1 to 3,
and we also have the edge (1, 3) for transitivity.] Notice that the directed graph in Fig. 7.3
of Example 7.26 also has this property.
The relation ℛ is antisymmetric because there are no ordered pairs in ℛ of the form
(x, y) and (y, x) with x ≠ y. To use the directed graph of Fig. 7.10 to characterize
antisymmetry, we observe that for any two vertices x, y, with x ≠ y, the graph contains at most
one of the edges (x, y) or (y, x). Hence there are no undirected edges aside from loops.
Figure 7.8 Figure 7.9 Figure 7.10
Our final example deals with equivalence relations.
EXAMPLE 7.33
For A = {1, 2, 3, 4, 5}, the following are equivalence relations on A:
ℛ₁ = {(1, 1), (1, 2), (2, 1), (2, 2), (3, 3), (3, 4), (4, 3), (4, 4), (5, 5)},
ℛ₂ = {(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (3, 3),
(4, 4), (4, 5), (5, 4), (5, 5)}.
Their associated graphs are shown in Fig. 7.11. If we ignore the loops in each graph, we
find the graph decomposed into components such as K₁, K₂, and K₃. In general, a relation
on a finite set A is an equivalence relation if and only if its associated graph is one complete
graph augmented by loops at every vertex or consists of the disjoint union of complete
graphs augmented by loops at every vertex.
(ℛ₁) (ℛ₂)
Figure 7.11
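The decomposition into complete graphs can be seen by collecting, for each element, the set of elements related to it. The following Python sketch (function names are illustrative assumptions) checks that the two relations of Example 7.33 are equivalence relations and lists the vertex sets of their components.

```python
def is_equivalence(R, A):
    reflexive = all((a, a) in R for a in A)
    symmetric = all((b, a) in R for (a, b) in R)
    transitive = all((a, d) in R for (a, b) in R for (c, d) in R if b == c)
    return reflexive and symmetric and transitive

def blocks(R, A):
    """For an equivalence relation R on A, the sets {b : a R b} for a in A."""
    return {frozenset(b for b in A if (a, b) in R) for a in A}

A = {1, 2, 3, 4, 5}
R1 = {(1, 1), (1, 2), (2, 1), (2, 2), (3, 3), (3, 4), (4, 3), (4, 4), (5, 5)}
R2 = {(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (3, 3),
      (4, 4), (4, 5), (5, 4), (5, 5)}

print(is_equivalence(R1, A), is_equivalence(R2, A))  # True True
print(sorted(sorted(b) for b in blocks(R1, A)))      # [[1, 2], [3, 4], [5]]
print(sorted(sorted(b) for b in blocks(R2, A)))      # [[1, 2, 3], [4, 5]]
```

A block with k elements corresponds to a copy of the complete graph on k vertices once the loops are ignored, so ℛ₁ yields two copies of K₂ and one K₁, while ℛ₂ yields K₃ and K₂.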
EXERCISES 7.2
1. For A = {1, 2, 3, 4}, let ℛ and 𝒮 be the relations on A defined by ℛ = {(1, 2), (1, 3),
(2, 4), (4, 4)} and 𝒮 = {(1, 1), (1, 2), (1, 3), (2, 3), (2, 4)}. Find ℛ ∘ 𝒮, 𝒮 ∘ ℛ, ℛ², ℛ³, 𝒮²,
and 𝒮³.
2. If ℛ is a reflexive relation on a set A, prove that ℛ² is also reflexive on A.
3. Provide a proof for the opposite inclusion in Theorem 7.1.
4. Let A = {1, 2, 3}, B = {w, x, y, z}, and C = {4, 5, 6}. Define the relations ℛ₁ ⊆ A × B,
ℛ₂ ⊆ B × C, and ℛ₃ ⊆ B × C, where ℛ₁ = {(1, w), (3, w), (2, x), (1, y)}, ℛ₂ =
{(w, 5), (x, 6), (y, 4), (y, 6)}, and ℛ₃ = {(w, 4), (w, 5), (y, 5)}. (a) Determine
ℛ₁ ∘ (ℛ₂ ∪ ℛ₃) and (ℛ₁ ∘ ℛ₂) ∪ (ℛ₁ ∘ ℛ₃). (b) Determine ℛ₁ ∘ (ℛ₂ ∩ ℛ₃) and
(ℛ₁ ∘ ℛ₂) ∩ (ℛ₁ ∘ ℛ₃).
5. Let A = {1, 2}, B = {m, n, p}, and C = {3, 4}. Define the relations ℛ₁ ⊆ A × B,
ℛ₂ ⊆ B × C, and ℛ₃ ⊆ B × C by ℛ₁ = {(1, m), (1, n), (1, p)}, ℛ₂ = {(m, 3), (m, 4), (p, 4)},
and ℛ₃ = {(m, 3), (m, 4), (p, 3)}. Determine ℛ₁ ∘ (ℛ₂ ∩ ℛ₃) and (ℛ₁ ∘ ℛ₂) ∩ (ℛ₁ ∘ ℛ₃).
6. For sets A, B, and C, consider relations ℛ₁ ⊆ A × B, ℛ₂ ⊆ B × C, and ℛ₃ ⊆ B × C.
Prove that (a) ℛ₁ ∘ (ℛ₂ ∪ ℛ₃) = (ℛ₁ ∘ ℛ₂) ∪ (ℛ₁ ∘ ℛ₃); and (b) ℛ₁ ∘ (ℛ₂ ∩ ℛ₃) ⊆
(ℛ₁ ∘ ℛ₂) ∩ (ℛ₁ ∘ ℛ₃).
7. For a relation ℛ on a set A, define ℛ⁰ = {(a, a) | a ∈ A}. If |A| = n, prove that there
exist s, t ∈ N with 0 ≤ s < t ≤ $2^{n^2}$ such that $\mathcal{R}^{s} = \mathcal{R}^{t}$.
8. With A = {1, 2, 3, 4}, let ℛ = {(1, 1), (1, 2), (2, 3), (3, 3), (3, 4), (4, 4)} be a relation
on A. Find two relations 𝒮, 𝒯 on A where 𝒮 ≠ 𝒯 but ℛ ∘ 𝒮 = ℛ ∘ 𝒯 = {(1, 1), (1, 2),
(1, 4)}.
9. How many 6 × 6 (0, 1)-matrices A are there with A = $A^{tr}$?
10. If $E = \begin{bmatrix} 1 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{bmatrix}$, how many (0, 1)-matrices F
satisfy E ≤ F? How many (0, 1)-matrices G satisfy G ≤ E?
11. Consider the sets A = {a₁, a₂, ..., aₘ}, B = {b₁, b₂, ..., bₙ}, and C = {c₁, c₂, ..., cₚ},
where the elements in each set remain fixed in the order given here. Let ℛ₁ be a relation from
A to B, and let ℛ₂ be a relation from B to C. The relation matrix for ℛᵢ is M(ℛᵢ), where
i = 1, 2. The rows and columns of these matrices are indexed by the elements from the
appropriate sets A, B, and C according to the orders already prescribed. The matrix for
ℛ₁ ∘ ℛ₂ is the m × p matrix M(ℛ₁ ∘ ℛ₂), where the elements of A (in the order given) index
the rows and the elements of C (also in the order given) index the columns.
Show that for all 1 ≤ i ≤ m and 1 ≤ j ≤ p, the entries in the ith row and jth column of
M(ℛ₁) · M(ℛ₂) and M(ℛ₁ ∘ ℛ₂) are equal. [Hence M(ℛ₁) · M(ℛ₂) = M(ℛ₁ ∘ ℛ₂).]
12. Let A be a set with |A| = n, and consider the order for the listing of its elements as fixed.
For ℛ ⊆ A × A, let M(ℛ) denote the corresponding relation matrix.
a) Prove that M(ℛ) = 0 (the n × n matrix of all 0's) if and only if ℛ = ∅.
b) Prove that M(ℛ) = 1 (the n × n matrix of all 1's) if and only if ℛ = A × A.
c) Use the result of Exercise 11, along with the Principle of Mathematical Induction, to
prove that $M(\mathcal{R}^{m}) = [M(\mathcal{R})]^{m}$, for all m ∈ Z⁺.
13. Provide the proofs for Theorem 7.2(a), (b), and (d).
14. Use Theorem 7.2 to write a computer program (or to develop an algorithm) for the
recognition of equivalence relations on a finite set.
15. a) Draw the digraph G₁ = (V₁, E₁) where V₁ = {a, b, c, d, e, f} and E₁ = {(a, b), (a, d),
(b, c), (b, e), (d, b), (d, e), (e, c), (e, f), (f, d)}.
b) Draw the undirected graph G2 = (V2, E2) where V2 = tion & C A X A in each case, as well as its associated relation
{s,¢t,4,v,w,x,y,z} and EF, = {{s, t}, {s, u}, {s, x}, matrix M(R).
(t, u}, {t, w}, (u, w}, (u, x}, {v, wh, {v, x}, {v, y}, (w, z}, 18. ForA = {v, w, x, y, z}, each of the following is the (0, 1)-
{x, y}}. matrix for a relation R on A. Here the rows (from top to bot-
16. For the directed graph G = (V, E) in Fig. 7.12, classify tom) and the columns (from left to right) are indexed in the
each of the following statements as true or false. order v, w, x, y, z. Determine the relation & C A X A in each
a) Vertex c is the origin of two edges in G. case, and draw the directed graph G associated with &.
b) Vertex g is adjacent to vertex h. 0s
ee ee
10141 1 4
c) There is a directed path in G from d to b.
a) M(A)=10 0 0 0 |
d) There are two directed cycles in G. 000 0 1
|0 0 0 0 0]
b
TO. 6u1dl6d1l lh OT
1 0 1 0 0
b) M(A=} 1 1 00 ~«21
10 0 0 1
0 0 1 1 «0
19, For A = {1, 2, 3, 4}, letR = (1, 1), C1, 2), (2, 3), GB, 3),
(3, 4)} be a relation on A. Draw the directed graph G on A that
is associated with R. Do likewise for R?, R>, and R*.
g
20. a) Let G =(V, E) be the directed graph where V =
Figure 7.12 {1, 2, 3, 4,5, 6, 7} and FE = {Gi, ll <i <j <7}.
i) How many edges are there for this graph?
17, For A = {a, b, c,d, e, f}, each graph, or digraph, in ii) Four of the directed paths in G from 1 to 7 may be
Fig. 7.13 represents a relation & on A. Determine the rela- given as:
1) (1, 7);
2) (1, 3), (3, 5), , 6), (6, 7);
3) C1, 2), (2, 3), G, 7); and
4) (1, 4), (4, 7).
How many directed paths (in total) exist in G from
1 to 7?
b) Now let n € Z* where n> 2, and consider the di-
rected graph G = (V, E) with V = {1, 2,3,..., } and
E={@, )ll<i<j <n}.
i) Determine |£]|.
ii) How many directed paths exist in G from 1 to n?
iii) If a,be Z* with 1 <a <b<n, how many di-
rected paths exist in G from a to b?
b b (The reader may wish to refer back to Exercise 20 in
Section 3.1.)
Cc Cc
a a 21. Let |A| = 5. (a) How many directed graphs can one con-
struct on A? (b) How many of the graphs in part (a) are actually
undirected?
d d 22. For |A| = 5, how many relations R# on A are there? How
many of these relations are symmetric?
e f e ef
23. a) Keeping the order of the elements fixed as 1, 2, 3, 4, 5,
determine the (0, !) relation matrix for each of the equiva-
lence relations in Example 7.33.
(inl) (iv)
b) Do the results of part (a) lead to any generalization?
Figure 7.13
24. How many (undirected) edges are there in the complete the smallest integer n > 1, such that 2” = R. What is the
graphs K,, K;, and K,, where n € Z*? smallest value of n > 1 for which the graph of &R” con-
25. Draw a precedence graph for the following segment found tains some loops? Does it ever happen that the graph of 2”
at the start of a computer program: consists of only loops?
b) Answer the same questions from part (a) for the rela-
s 1 :=1
tion R on A = {1, 2, 3,..., 9, 10}, if the directed graph
Ss 2 := 2
associated with & is as shown in Fig. 7.15.
Ss 3 :=at+3
S 4 := b
Ss 5 r= 2*a-l
Ss 6 a*c
Ss 7 := 7
Ss 8 :=C+2
26. a) Let R be the relation on A = {1, 2, 3, 4, 5, 6, 7}, where
the directed graph associated with & consists of the two Figure 7.15
components, each a directed cycle, shown in Fig. 7.14. Find
c) Do the results in parts (a) and (b) indicate anything in
general?
27. If the complete graph K,, has 703 edges, how many vertices
does it have?
4 3 7 6
Figure 7.14
7.3
Partial Orders: Hasse Diagrams
If you ask children to recite the numbers they know, you’ll hear a uniform response of
“1, 2,3,....” Without paying attention to it, they list these numbers in increasing order.
In this section we take a closer look at this idea of order, something we may have taken for
granted. We start with some observations about the sets N, Z, Q, R, and C.
The set N is closed under the binary operations of (ordinary) addition and multiplication,
but if we seek an answer to the equation x + 5 = 2, we find that no element of N provides
a solution. So we enlarge N to Z, where we can perform subtraction as well as addition and
multiplication. However, we soon run into trouble trying to solve the equation 2x + 3 = 4.
Enlarging to Q, we can perform nonzero division in addition to the other operations. Yet
this soon proves to be inadequate; the equation $x^2 - 2 = 0$ necessitates the introduction
of the real but irrational numbers $\pm\sqrt{2}$. Even after we expand from Q to R, more trouble
arises when we try to solve $x^2 + 1 = 0$. Finally we arrive at C, the complex numbers,
where any polynomial equation of the form $c_n x^n + c_{n-1} x^{n-1} + \cdots + c_2 x^2 + c_1 x + c_0 = 0$,
where $c_i \in \mathbf{C}$ for $0 \le i \le n$, $n > 0$, and $c_n \ne 0$, can be solved. (This result is known as the
Fundamental Theorem of Algebra. Its proof requires material on functions of a complex
variable, so no proof is given here.) As we kept building up from N to C, gaining more
ability to solve polynomial equations, something was lost when we went from R to C. In
R, given numbers $r_1, r_2$ with $r_1 \ne r_2$, we know that either $r_1 < r_2$ or $r_2 < r_1$. However,
in C we have $(2 + i) \ne (1 + 2i)$, but what meaning can we attach to a statement such
as "$(2 + i) < (1 + 2i)$"? We have lost the ability to "order" the elements in this number
system!
As we start to take a closer look at the notion of order, we proceed as in Section 7.1
and let $A$ be a set with $\mathcal{R}$ a relation on $A$. The pair $(A, \mathcal{R})$ is called a partially ordered
set, or poset, if relation $\mathcal{R}$ on $A$ is a partial order, or a partial ordering relation (as given in
Definition 7.6). If $A$ is called a poset, we understand that there is a partial order $\mathcal{R}$ on $A$
that makes $A$ into this poset. Examples 7.1(a), 7.2, 7.11, and 7.15 are posets.
EXAMPLE 7.34 Let $A$ be the set of courses offered at a college. Define the relation $\mathcal{R}$ on $A$ by $x \mathrel{\mathcal{R}} y$ if $x$, $y$
are the same course or if $x$ is a prerequisite for $y$. Then $\mathcal{R}$ makes $A$ into a poset.
EXAMPLE 7.35 Define $\mathcal{R}$ on $A = \{1, 2, 3, 4\}$ by $x \mathrel{\mathcal{R}} y$ if $x \mid y$, that is, $x$ (exactly) divides $y$. Then
$\mathcal{R} = \{(1, 1), (2, 2), (3, 3), (4, 4), (1, 2), (1, 3), (1, 4), (2, 4)\}$ is a partial order, and $(A, \mathcal{R})$ is
a poset. (This is similar to what we learned in Example 7.15.)
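The three defining properties can be checked mechanically on a small finite set. The following is a minimal sketch in Python, not part of the text's own development; the function name `is_partial_order` and the representation of a relation as a set of ordered pairs are assumptions made for illustration.

```python
from itertools import product


def is_partial_order(elements, relation):
    """Check that `relation` (a set of ordered pairs) is reflexive,
    antisymmetric, and transitive on `elements`."""
    reflexive = all((x, x) in relation for x in elements)
    antisymmetric = all(x == y for (x, y) in relation if (y, x) in relation)
    transitive = all(
        (x, z) in relation
        for (x, y) in relation
        for (w, z) in relation
        if y == w
    )
    return reflexive and antisymmetric and transitive


# Example 7.35: x R y if x divides y, on A = {1, 2, 3, 4}.
A = {1, 2, 3, 4}
divides = {(x, y) for x, y in product(A, repeat=2) if y % x == 0}

print(sorted(divides))
print(is_partial_order(A, divides))             # True
print(is_partial_order(A, divides | {(2, 1)}))  # False: (1, 2) and (2, 1) with 1 != 2
```

The brute-force checks are quadratic or worse in the size of the relation, which is acceptable for the small examples of this section.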
EXAMPLE 7.36 In the construction of a house certain jobs, such as digging the foundation, must be performed
before other phases of the construction can be undertaken. If $A$ is a set of tasks that must
be performed in building a house, we can define a relation $\mathcal{R}$ on $A$ by $x \mathrel{\mathcal{R}} y$ if $x$, $y$ denote
the same task or if task x must be performed before the start of task y. In this way we
place an order on the elements of A, making it into a poset that is sometimes referred to
as a PERT (Program Evaluation and Review Technique) network. (Such networks came
into play during the 1950s in order to handle the complexities that arose in organizing the
many individual activities required for the completion of projects on a very large scale. This
technique was actually developed and first used by the U.S. Navy in order to coordinate the
many projects that were necessary for the building of the Polaris submarine.)
Consider the diagrams given in Fig. 7.16. If part (a) were part of the directed graph
associated with a relation $\mathcal{R}$, then because $(1, 2), (2, 1) \in \mathcal{R}$ with $1 \ne 2$, $\mathcal{R}$ could not be
antisymmetric. For part (b), if the diagram were part of the graph of a transitive relation $\mathcal{R}$,
then $(1, 2), (2, 3) \in \mathcal{R} \Rightarrow (1, 3) \in \mathcal{R}$. Since $(3, 1) \in \mathcal{R}$ and $1 \ne 3$, $\mathcal{R}$ is not antisymmetric,
so it cannot be a partial order.

[Figure 7.16: parts (a) and (b)]
From these observations, if we are given a relation $\mathcal{R}$ on a set $A$, and we let $G$ be the
directed graph associated with $\mathcal{R}$, then we find that:
i) If $G$ contains a pair of edges of the form $(a, b)$, $(b, a)$, for $a, b \in A$ with $a \ne b$, or
ii) If $\mathcal{R}$ is transitive and $G$ contains a directed cycle (of length greater than or equal to
three),
then the relation $\mathcal{R}$ cannot be antisymmetric, so $(A, \mathcal{R})$ fails to be a partial order.
EXAMPLE 7.37 Consider the directed graph for the partial order in Example 7.35. Figure 7.17(a) is the
graphical representation of $\mathcal{R}$. In part (b) of the figure, we have a somewhat simpler diagram,
which is called the Hasse diagram for $\mathcal{R}$.

[Figure 7.17: (a) directed graph of $\mathcal{R}$; (b) Hasse diagram for $\mathcal{R}$]
When we know that a relation $\mathcal{R}$ is a partial order on a set $A$, we can eliminate the loops
at the vertices of its directed graph. Since $\mathcal{R}$ is also transitive, having the edges (1, 2) and
(2, 4) is enough to ensure the existence of edge (1, 4), so we need not include that edge. In
this way we obtain the diagram in Fig. 7.17(b), where we have not lost the directions on
the edges; the directions are assumed to go from the bottom to the top.
In general, if $\mathcal{R}$ is a partial order on a finite set $A$, we construct a Hasse diagram for
$\mathcal{R}$ on $A$ by drawing a line segment from $x$ up to $y$, if $x, y \in A$ with $x \mathrel{\mathcal{R}} y$ and, most
important, if there is no other element $z \in A$ such that $x \mathrel{\mathcal{R}} z$ and $z \mathrel{\mathcal{R}} y$. (So there is
nothing "in between" $x$ and $y$.) If we adopt the convention of reading the diagram from
bottom to top, then it is not necessary to direct any edges.
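The construction just described can be sketched in a few lines of Python. This is an illustrative sketch only; the helper name `hasse_edges` and the choice to pass the order as a two-argument predicate `leq` are assumptions, not notation from the text.

```python
def hasse_edges(elements, leq):
    """Return the pairs (x, y) joined by a line segment in the Hasse diagram:
    x != y, x R y, and no other element z lies between x and y."""
    edges = set()
    for x in elements:
        for y in elements:
            if x == y or not leq(x, y):
                continue
            between = any(
                z != x and z != y and leq(x, z) and leq(z, y) for z in elements
            )
            if not between:
                edges.add((x, y))
    return edges


# The "divides" order of Example 7.35 on A = {1, 2, 3, 4}.
A = [1, 2, 3, 4]
divides = lambda x, y: y % x == 0

print(sorted(hasse_edges(A, divides)))  # [(1, 2), (1, 3), (2, 4)]
```

The loops and the edge (1, 4) are absent from the output, in agreement with the passage from part (a) to part (b) of Fig. 7.17.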
EXAMPLE 7.38 In Fig. 7.18 we have the Hasse diagrams for the following four posets. (a) With $\mathcal{U} = \{1, 2, 3\}$
and $A = \mathcal{P}(\mathcal{U})$, $\mathcal{R}$ is the subset relation on $A$. (b) Here $\mathcal{R}$ is the "(exactly) divides" relation
applied to $A = \{1, 2, 4, 8\}$. (c) and (d) Here the same relation as in part (b) is applied to
$\{2, 3, 5, 7\}$ in part (c) and to $\{2, 3, 5, 6, 7, 11, 12, 35, 385\}$ in part (d). In part (c) we note
that a Hasse diagram can have all isolated vertices; it can also have two (or more) connected
pieces, as shown in part (d).

[Figure 7.18: Hasse diagrams for parts (a) through (d)]
EXAMPLE 7.39 Let $A = \{1, 2, 3, 4, 5\}$. The relation $\mathcal{R}$ on $A$, defined by $x \mathrel{\mathcal{R}} y$ if $x \le y$, is a partial order.
This makes $A$ into a poset that we can denote by $(A, \le)$. If $B = \{1, 2, 4\} \subset A$, then the set
$(B \times B) \cap \mathcal{R} = \{(1, 1), (2, 2), (4, 4), (1, 2), (1, 4), (2, 4)\}$ is a partial order on $B$.
In general if $\mathcal{R}$ is a partial order on $A$, then for each subset $B$ of $A$, $(B \times B) \cap \mathcal{R}$ makes
$B$ into a poset where the partial order on $B$ is induced from $\mathcal{R}$.
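As a quick check of the induced order in Example 7.39, the intersection $(B \times B) \cap \mathcal{R}$ can be computed directly. The sketch below is illustrative; the function name `restrict` is an assumption.

```python
# Example 7.39: the usual "less than or equal to" order on A = {1, 2, 3, 4, 5}.
A = {1, 2, 3, 4, 5}
R = {(x, y) for x in A for y in A if x <= y}


def restrict(relation, subset):
    """Return (B x B) ∩ R: the pairs of R whose components both lie in B."""
    return {(x, y) for (x, y) in relation if x in subset and y in subset}


B = {1, 2, 4}
induced = restrict(R, B)
print(sorted(induced))  # [(1, 1), (1, 2), (1, 4), (2, 2), (2, 4), (4, 4)]
```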
We turn now to a special type of partial order.
Definition 7.16 If $(A, \mathcal{R})$ is a poset, we say that $A$ is totally ordered (or, linearly ordered) if for all $x, y \in A$
either $x \mathrel{\mathcal{R}} y$ or $y \mathrel{\mathcal{R}} x$. In this case $\mathcal{R}$ is called a total order (or, a linear order).
EXAMPLE 7.40
a) On the set N, the relation $\mathcal{R}$ defined by $x \mathrel{\mathcal{R}} y$ if $x \le y$ is a total order.
b) The subset relation applied to $A = \mathcal{P}(\mathcal{U})$, where $\mathcal{U} = \{1, 2, 3\}$, is a partial, but not
total, order: $\{1, 2\}, \{1, 3\} \in A$ but we have neither $\{1, 2\} \subseteq \{1, 3\}$ nor $\{1, 3\} \subseteq \{1, 2\}$.
c) The Hasse diagram in part (b) of Fig. 7.18 shows a total order. In Fig. 7.19(a) we have
the directed graph for this total order, alongside its Hasse diagram in part (b).
Figure 7.19
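The comparability condition of Definition 7.16 is easy to test on finite posets. The following sketch assumes the relation passed in is already known to be a partial order and is given as a predicate `leq`; the name `is_total` is hypothetical.

```python
from itertools import combinations


def is_total(elements, leq):
    """Given a partial order `leq`, check that every two elements are comparable."""
    return all(leq(x, y) or leq(y, x) for x, y in combinations(elements, 2))


# Part (b) of Example 7.40: the subset relation on the power set of {1, 2, 3}.
U = (1, 2, 3)
power_set = [frozenset(c) for r in range(len(U) + 1) for c in combinations(U, r)]
subset = lambda s, t: s <= t

# Part (c) of Example 7.40: the "divides" relation on {1, 2, 4, 8}.
divides = lambda x, y: y % x == 0

print(is_total(power_set, subset))      # False: {1, 2} and {1, 3} are not comparable
print(is_total([1, 2, 4, 8], divides))  # True
```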
Could these notions of partial and total order ever arise in an industrial problem?
Say a toy manufacturer is about to market a new product and must include a set of
instructions for its assembly. In order to assemble the new toy, there are seven tasks, denoted
A, B, C, ..., G, that one must perform in the partial order given by the Hasse diagram of
Fig. 7.20. Here we see, for example, that all of the tasks B, A, and E must be completed
before we can work on task C. Since the set of instructions is to consist of a listing of these
tasks, numbered 1, 2, 3, ..., 7, how can the manufacturer write the listing and make sure
that the partial order of the Hasse diagram is maintained?

[Figure 7.20: Hasse diagram on the tasks A, B, C, D, E, F, G]

What we are really asking for here is whether we can take the partial order $\mathcal{R}$, given by
the Hasse diagram, and find a total order $\mathcal{T}$ on these tasks for which $\mathcal{R} \subseteq \mathcal{T}$. The answer is
yes, and the technique that we need is known as topological sorting.
Topological Sorting Algorithm
(for a partial order $\mathcal{R}$ on a set $A$ with $|A| = n$)
Step 1: Set $k = 1$. Let $H_1$ be the Hasse diagram of the partial order.
Step 2: Select a vertex $v_k$ in $H_k$ such that no (implicitly directed) edge in $H_k$ starts
at $v_k$.
Step 3: If $k = n$, the process is completed and we have a total order
$\mathcal{T}$: $v_n < v_{n-1} < \cdots < v_2 < v_1$
that contains $\mathcal{R}$.
If $k < n$, then remove from $H_k$ the vertex $v_k$ and all (implicitly directed) edges of $H_k$
that terminate at $v_k$. Call the result $H_{k+1}$. Increase $k$ by 1 and return to step (2).
Here we have presented our algorithm as a precise list of instructions, with no concern
about the particulars of the pseudocode used in earlier chapters and with no reference to its
implementation in a particular computer language.
Before we apply this algorithm to the problem at hand, we should observe the deliberate
use of "a" before the word "vertex" in step (2). This implies that the selection need not be
unique and that we can get several different total orders $\mathcal{T}$ containing $\mathcal{R}$. Also, in step (3), for
vertices $v_{i-1}$ where $2 \le i \le n$, the notation $v_i < v_{i-1}$ is used because it is more suggestive
of "$v_i$ before $v_{i-1}$" than is the notation $v_i \mathrel{\mathcal{T}} v_{i-1}$.
In Fig. 7.21, we show the Hasse diagrams that evolve as we apply the topological sorting
algorithm to the partial order in Fig. 7.20. Below each diagram, the total order is listed as
it evolves.
[Figure 7.21: the Hasse diagrams $H_1, H_2, \ldots, H_7$ obtained for $k = 1, 2, \ldots, 7$, with the evolving total order listed below each diagram]
k = 1: D
k = 2: F < D
k = 3: G < F < D
k = 4: C < G < F < D
k = 5: A < C < G < F < D
k = 6: B < A < C < G < F < D
k = 7: E < B < A < C < G < F < D
If the toy manufacturer writes the instructions in a list as 1-E, 2-B, 3-A, 4-C, 5-G, 6-F,
7-D, he or she will have a total order that preserves the partial order needed for correct
assembly. This total order is one of 12 possible answers.
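The steps of the algorithm can be sketched in Python as follows. Since Fig. 7.20 is not reproduced here, the edge list `covers` below is an assumption: it takes A to lie directly above B and E, C directly above A, and D, F, G directly above C, which is consistent with the evolving orders of Fig. 7.21 and with the count of 12 total orders. The function names and the alphabetical tie-break in step (2) are likewise illustrative choices.

```python
def topological_sort(vertices, cover_edges):
    """Apply steps (1)-(3): repeatedly select a vertex at which no remaining
    edge starts, then delete it together with the edges that terminate at it.
    Returns the total order listed from first task to last task."""
    remaining = set(vertices)
    edges = set(cover_edges)          # (x, y) means x lies directly below y
    picked = []                       # v_1, v_2, ..., v_n in order of selection
    while remaining:
        # Step 2: candidates are vertices with no outgoing (upward) edge.
        candidates = sorted(
            v for v in remaining if not any(a == v for a, _ in edges)
        )
        v = candidates[0]             # any candidate works; ties broken alphabetically
        picked.append(v)
        # Step 3: remove v and all edges that terminate at v.
        remaining.remove(v)
        edges = {(a, b) for a, b in edges if b != v}
    return picked[::-1]               # v_n < v_{n-1} < ... < v_1


def count_total_orders(vertices, cover_edges):
    """Count every total order the algorithm can produce, by trying each
    admissible choice in step (2)."""
    def count(remaining, edges):
        if not remaining:
            return 1
        total = 0
        for v in remaining:
            if not any(a == v for a, _ in edges):
                smaller = frozenset(e for e in edges if e[1] != v)
                total += count(remaining - {v}, smaller)
        return total
    return count(frozenset(vertices), frozenset(cover_edges))


# Assumed Hasse diagram for the toy-assembly tasks (see the note above).
tasks = "ABCDEFG"
covers = [("B", "A"), ("E", "A"), ("A", "C"), ("C", "D"), ("C", "F"), ("C", "G")]

print(topological_sort(tasks, covers))    # ['E', 'B', 'A', 'C', 'G', 'F', 'D']
print(count_total_orders(tasks, covers))  # 12
```

With the alphabetical tie-break the selections are D, F, G, C, A, B, E, which reproduces the listing 1-E, 2-B, ..., 7-D given above; a different tie-break would produce another of the 12 orders.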
Here we are only concerned with applying this algorithm. Hence we are assuming that it works and we shall
not present a proof of that fact. Furthermore, we may operate similarly with other algorithms we encounter.
As is typical in discrete and combinatorial mathematics, this algorithm provides a pro-
cedure that reduces the size of the problem with each successive application.
The next example provides a situation where the number of distinct total orders for a
particular partial order is determined.
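EXAMPLE 7.41 Let $p$, $q$ be distinct primes. In part (a) of Fig. 7.22 we have the Hasse diagram for the partial
order $\mathcal{R}$ of all positive-integer divisors of $p^2 q$. Applying the topological sorting algorithm
to this Hasse diagram, we find in Fig. 7.22(b) the five total orders $\mathcal{T}_i$, where $\mathcal{R} \subseteq \mathcal{T}_i$, for
$1 \le i \le 5$.

[Figure 7.22: (a) Hasse diagram for the divisors of $p^2 q$, with $p^2 q$, $pq$, $q$ marked (+) and $p^2$, $p$, $1$ marked (−); (b) the five total orders, each followed by its list of signs]
$\mathcal{T}_1$: $p^2 q > pq > q > p^2 > p > 1$      +, +, +, −, −, −
$\mathcal{T}_2$: $p^2 q > pq > p^2 > p > q > 1$      +, +, −, −, +, −
$\mathcal{T}_3$: $p^2 q > p^2 > pq > q > p > 1$      +, −, +, +, −, −
$\mathcal{T}_4$: $p^2 q > pq > p^2 > q > p > 1$      +, +, −, +, −, −
$\mathcal{T}_5$: $p^2 q > p^2 > pq > p > q > 1$      +, −, +, −, +, −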
Now look at Fig. 7.22 again. This time focus on the three plus signs and three minus
signs in part (a) of the figure and in the list below each total order in part (b). When we
apply the topological sorting algorithm to the given partial order $\mathcal{R}$, step (2) of the algorithm
implies that the first divisor selected is always $p^2 q$. This accounts for the first plus sign in
each $\mathcal{T}_i$, $1 \le i \le 5$. Continuing to apply the algorithm we get two more plus signs and the
three minus signs.
Could there ever be more minus signs than plus signs in our corresponding list, as a total
order is developed? For example, could we start with +, −, −? If so, we have failed to
correctly apply step (2) of the topological sorting algorithm; we should have recognized
$pq$ as the unique candidate to select after $p^2 q$ and $p^2$. In fact, for $0 \le k \le 2$, $p^k q$ must be
selected before $p^k$ can be. Consequently, for each list of three plus signs and three minus
signs, there are always at least as many plus signs as minus signs, as the list is read from
left to right. Comparing now with the result in part (a) of Example 1.43, we see that the
number of total orders for the given partial order is $5 = \frac{1}{3+1}\binom{2 \cdot 3}{3}$. Further, for $n \ge 1$, the
topological sorting algorithm can be applied to the partial order of all positive divisors of
$p^{n-1} q$ to yield $\frac{1}{n+1}\binom{2n}{n}$ total orders, another instance where the Catalan numbers arise.
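The count in Example 7.41 can be confirmed by brute force for small $n$. The sketch below encodes the divisor $p^i q^j$ of $p^{n-1} q$ as the pair `(i, j)`, so that divisibility becomes componentwise comparison; this encoding and the function names are assumptions made for illustration.

```python
from math import comb


def count_total_orders(n):
    """Count the total orders containing the divisibility order on the
    positive divisors of p**(n-1) * q, for distinct primes p and q."""
    # Divisor p**i * q**j is encoded as (i, j), 0 <= i <= n-1, 0 <= j <= 1.
    elements = frozenset((i, j) for i in range(n) for j in range(2))

    def divides(x, y):
        return x[0] <= y[0] and x[1] <= y[1]

    def count(remaining):
        if not remaining:
            return 1
        total = 0
        for v in remaining:
            # Step (2): v may be selected only if no other remaining divisor
            # is a multiple of v.
            if not any(w != v and divides(v, w) for w in remaining):
                total += count(remaining - {v})
        return total

    return count(elements)


def catalan(n):
    """The n-th Catalan number, (1/(n+1)) * C(2n, n)."""
    return comb(2 * n, n) // (n + 1)


for n in range(1, 6):
    print(n, count_total_orders(n), catalan(n))
```

For $n = 3$ both columns show 5, the number of total orders listed in Fig. 7.22(b).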
(Example 7.41 refers back to the optional material on Catalan numbers in Section 1.5. It may be skipped with
no loss of continuity.)

In the topological sorting algorithm, we saw how the Hasse diagram was used in determining
a total order containing a given poset $(A, \mathcal{R})$. This algorithm now prompts us to
examine further properties of a partial order. At the start, particular emphasis will be given
to a vertex like the vertex $v_k$ in step (2) of the algorithm. The special property exhibited by
such a vertex is now considered in the following.
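Definition 7.17 If $(A, \mathcal{R})$ is a poset, then an element $x \in A$ is called a maximal element of $A$ if for all
$a \in A$, $a \ne x \Rightarrow x \mathrel{\not\mathcal{R}} a$. An element $y \in A$ is called a minimal element of $A$ if whenever
$b \in A$ and $b \ne y$, then $b \mathrel{\not\mathcal{R}} y$.
If we use the contrapositive of the first statement in Definition 7.17, then we can state
that $x$ ($\in A$) is a maximal element if for each $a \in A$, $x \mathrel{\mathcal{R}} a \Rightarrow x = a$. In a similar manner,
$y \in A$ is a minimal element if for each $b \in A$, $b \mathrel{\mathcal{R}} y \Rightarrow b = y$.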
EXAMPLE 7.42 Let $\mathcal{U} = \{1, 2, 3\}$ and $A = \mathcal{P}(\mathcal{U})$.
a) Let $\mathcal{R}$ be the subset relation on $A$. Then $\mathcal{U}$ is maximal and $\emptyset$ is minimal for the poset
$(A, \subseteq)$.
b) For $B$, the collection of proper subsets of $\{1, 2, 3\}$, let $\mathcal{R}$ be the subset relation on $B$.
In the poset $(B, \subseteq)$, the sets $\{1, 2\}$, $\{1, 3\}$, and $\{2, 3\}$ are all maximal elements; $\emptyset$ is
still the only minimal element.
EXAMPLE 7.43 With $\mathcal{R}$ the "less than or equal to" relation on the set Z, we find that $(\mathbf{Z}, \le)$ is a poset with
neither a maximal nor a minimal element. The poset $(\mathbf{N}, \le)$, however, has minimal element
0 but no maximal element.
EXAMPLE 7.44 When we look back at the partial orders in parts (b), (c), and (d) of Example 7.38, the
following observations come to light.
1) The partial order in part (b) has the unique maximal element 8 and the unique minimal
element 1.
2) Each of the four elements — 2, 3,5, and 7 — is both a maximal element and a minimal
element for the poset in part (c) of Example 7.38.
3) In part (d) the elements 12 and 385 are both maximal. Each of the elements 2, 3, 5,
7, and 11 is a minimal element for this partial order.
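The observations of Example 7.44 can be reproduced with a direct translation of Definition 7.17. This is a minimal sketch; the function names and the predicate-based representation of the order are assumptions.

```python
def maximal_elements(elements, leq):
    """x is maximal if x R a holds for no a other than x itself."""
    return [x for x in elements if all(a == x or not leq(x, a) for a in elements)]


def minimal_elements(elements, leq):
    """y is minimal if b R y holds for no b other than y itself."""
    return [y for y in elements if all(b == y or not leq(b, y) for b in elements)]


divides = lambda x, y: y % x == 0

part_b = [1, 2, 4, 8]
part_c = [2, 3, 5, 7]
part_d = [2, 3, 5, 6, 7, 11, 12, 35, 385]

for poset in (part_b, part_c, part_d):
    print(maximal_elements(poset, divides), minimal_elements(poset, divides))
```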
Are there any conditions indicating when a poset must have a maximal or minimal
element?
THEOREM 7.3 If $(A, \mathcal{R})$ is a poset and $A$ is finite, then $A$ has both a maximal and a minimal element.
Proof: Let $a_1 \in A$. If there is no element $a \in A$ where $a \ne a_1$ and $a_1 \mathrel{\mathcal{R}} a$, then $a_1$ is maximal.
Otherwise there is an element $a_2 \in A$ with $a_2 \ne a_1$ and $a_1 \mathrel{\mathcal{R}} a_2$. If no element $a \in A$, $a \ne a_2$,
satisfies $a_2 \mathrel{\mathcal{R}} a$, then $a_2$ is maximal. Otherwise we can find $a_3 \in A$ so that $a_3 \ne a_2$, $a_3 \ne a_1$
(Why?) while $a_1 \mathrel{\mathcal{R}} a_2$ and $a_2 \mathrel{\mathcal{R}} a_3$. Continuing in this manner, since $A$ is finite, we get to
an element $a_n \in A$ with $a_n \mathrel{\not\mathcal{R}} a$ for all $a \in A$ where $a \ne a_n$, so $a_n$ is maximal.
The proof for a minimal element follows in a similar way.
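The argument in the proof is constructive, and the chain $a_1, a_2, \ldots, a_n$ can be traced on a concrete finite poset. The sketch below is only an illustration of that argument; the function name `climb_to_maximal` and the rule of always taking the first available element above the current one are assumptions.

```python
def climb_to_maximal(elements, leq, start):
    """Build the chain a_1 R a_2 R ... of the proof, starting at `start`,
    until no element other than the current one lies above it."""
    chain = [start]
    while True:
        current = chain[-1]
        above = [a for a in elements if a != current and leq(current, a)]
        if not above:
            return chain          # the last entry is a maximal element
        chain.append(above[0])    # any element above `current` would do


divides = lambda x, y: y % x == 0
A = [2, 3, 5, 6, 7, 11, 12, 35, 385]   # the poset of Example 7.38(d)

print(climb_to_maximal(A, divides, 2))   # [2, 6, 12]
print(climb_to_maximal(A, divides, 5))   # [5, 35, 385]
```

The loop terminates on a finite poset because antisymmetry and transitivity prevent any element from appearing twice in the chain.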
Returning now to the topological sorting algorithm, we see that in each iteration of
step (2) of the algorithm, we are selecting a maximal element from the original poset $(A, \mathcal{R})$,
or a poset of the form $(B, \mathcal{R}')$ where $\emptyset \ne B \subset A$ and $\mathcal{R}' = (B \times B) \cap \mathcal{R}$. At least one such
element exists (in each iteration) by virtue of Theorem 7.3. Then in the second part of
step (3), if $x$ is the maximal element selected [in step (2)], we remove from the present
poset all elements of the form $(a, x)$. This results in a smaller poset.
We turn now to the study of some additional concepts involving posets.
Definition 7.18 If $(A, \mathcal{R})$ is a poset, then an element $x \in A$ is called a least element if $x \mathrel{\mathcal{R}} a$ for all $a \in A$.
Element $y \in A$ is called a greatest element if $a \mathrel{\mathcal{R}} y$ for all $a \in A$.
EXAMPLE 7.45 Let $\mathcal{U} = \{1, 2, 3\}$, and let $\mathcal{R}$ be the subset relation.
a) With $A = \mathcal{P}(\mathcal{U})$, the poset $(A, \subseteq)$ has $\emptyset$ as a least element and $\mathcal{U}$ as a greatest element.
b) For $B$ = the collection of nonempty subsets of $\mathcal{U}$, the poset $(B, \subseteq)$ has $\mathcal{U}$ as a greatest
element. There is no least element here, but there are three minimal elements.
EXAMPLE 7.46 For the partial orders in Example 7.38, we find that
1) The partial order in part (b) has a greatest element 8 and a least element 1.
2) There is no greatest element or least element for the poset in part (c).
3) No greatest element or least element exists for the partial order in part (d).
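Definition 7.18 translates just as directly, and the three findings of Example 7.46 can be checked with it. In this sketch the functions return `None` when no such element exists; that convention and the function names are assumptions for illustration.

```python
def greatest_element(elements, leq):
    """Return y with a R y for all a, or None if there is no such element."""
    for y in elements:
        if all(leq(a, y) for a in elements):
            return y
    return None


def least_element(elements, leq):
    """Return x with x R a for all a, or None if there is no such element."""
    for x in elements:
        if all(leq(x, a) for a in elements):
            return x
    return None


divides = lambda x, y: y % x == 0

part_b = [1, 2, 4, 8]
part_c = [2, 3, 5, 7]
part_d = [2, 3, 5, 6, 7, 11, 12, 35, 385]

print(greatest_element(part_b, divides), least_element(part_b, divides))  # 8 1
print(greatest_element(part_c, divides), least_element(part_c, divides))  # None None
print(greatest_element(part_d, divides), least_element(part_d, divides))  # None None
```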
We have seen that it is possible for a poset to have several maximal and minimal elements.
What about least and greatest elements?
THEOREM 7.4 If the poset $(A, \mathcal{R})$ has a greatest (least) element, then that element is unique.
Proof: Suppose that $x, y \in A$ and that both are greatest elements. Since $x$ is a greatest
element, $y \mathrel{\mathcal{R}} x$. Likewise, $x \mathrel{\mathcal{R}} y$ because $y$ is a greatest element. As $\mathcal{R}$ is antisymmetric, it
follows that $x = y$.
The proof for the least element is similar.
Definition 7.19 Let $(A, \mathcal{R})$ be a poset with $B \subseteq A$. An element $x \in A$ is called a lower bound of $B$ if $x \mathrel{\mathcal{R}} b$
for all $b \in B$. Likewise, an element $y \in A$ is called an upper bound of $B$ if $b \mathrel{\mathcal{R}} y$ for all
$b \in B$.
An element $x' \in A$ is called a greatest lower bound (glb) of $B$ if it is a lower bound of $B$
and if for all other lower bounds $x''$ of $B$ we have $x'' \mathrel{\mathcal{R}} x'$. Similarly $y' \in A$ is a least upper
bound (lub) of $B$ if it is an upper bound of $B$ and if $y' \mathrel{\mathcal{R}} y''$ for all other upper bounds $y''$
of $B$.
EXAMPLE 7.47 Let $\mathcal{U} = \{1, 2, 3, 4\}$, with $A = \mathcal{P}(\mathcal{U})$, and let $\mathcal{R}$ be the subset relation on $A$. If
$B = \{\{1\}, \{2\}, \{1, 2\}\}$, then $\{1, 2\}$, $\{1, 2, 3\}$, $\{1, 2, 4\}$, and $\{1, 2, 3, 4\}$ are all upper bounds for
$B$ (in $(A, \mathcal{R})$), whereas $\{1, 2\}$ is a least upper bound (and is in $B$). Meanwhile, a greatest
lower bound for $B$ is $\emptyset$, which is not in $B$.
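The bounds in Example 7.47 can be enumerated directly from Definition 7.19. The following sketch works on the power set of $\{1, 2, 3, 4\}$ with sets represented as `frozenset` objects; the helper names (`power_set`, `upper_bounds`, `lower_bounds`, `lub`, `glb`) are assumptions for illustration.

```python
from itertools import combinations


def power_set(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def upper_bounds(A, B, leq):
    return [y for y in A if all(leq(b, y) for b in B)]


def lower_bounds(A, B, leq):
    return [x for x in A if all(leq(x, b) for b in B)]


def lub(A, B, leq):
    """Least upper bound of B in A, or None if it does not exist."""
    ubs = upper_bounds(A, B, leq)
    return next((y for y in ubs if all(leq(y, other) for other in ubs)), None)


def glb(A, B, leq):
    """Greatest lower bound of B in A, or None if it does not exist."""
    lbs = lower_bounds(A, B, leq)
    return next((x for x in lbs if all(leq(other, x) for other in lbs)), None)


subset = lambda s, t: s <= t
A = power_set({1, 2, 3, 4})
B = [frozenset({1}), frozenset({2}), frozenset({1, 2})]

print([sorted(s) for s in upper_bounds(A, B, subset)])
print(sorted(lub(A, B, subset)))   # [1, 2]
print(sorted(glb(A, B, subset)))   # []  (the empty set)
```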
EXAMPLE 7.48 Let $\mathcal{R}$ be the "less than or equal to" relation for the poset $(A, \mathcal{R})$.
a) If $A = \mathbf{R}$ and $B = [0, 1]$, then $B$ has glb 0 and lub 1. Note that $0, 1 \in B$. For $C =
(0, 1]$, $C$ has glb 0 and lub 1, and $1 \in C$ but $0 \notin C$.
b) Keeping $A = \mathbf{R}$, let $B = \{q \in \mathbf{Q} \mid q^2 < 2\}$. Then $B$ has $\sqrt{2}$ as a lub and $-\sqrt{2}$ as a glb,
and neither of these real numbers is in $B$.
c) Now let $A = \mathbf{Q}$, with $B$ as in part (b). Here $B$ has no lub or glb.
These examples lead us to the following result.
THEOREM 7.5 If $(A, \mathcal{R})$ is a poset and $B \subseteq A$, then $B$ has at most one lub (glb).
Proof: We leave the proof to the reader.
We close this section with one last ordered structure.
Definition 7.20 The poset $(A, \mathcal{R})$ is called a lattice if for all $x, y \in A$ the elements lub$\{x, y\}$ and glb$\{x, y\}$
both exist in $A$.
EXAMPLE 7.49 For $A = \mathbf{N}$ and $x, y \in \mathbf{N}$, define $x \mathrel{\mathcal{R}} y$ by $x \le y$. Then lub$\{x, y\}$ = max$\{x, y\}$, glb$\{x, y\}$ =
min$\{x, y\}$, and $(\mathbf{N}, \le)$ is a lattice.
EXAMPLE 7.50 For the poset in Example 7.45(a), if $S, T \subseteq \mathcal{U}$, we have lub$\{S, T\} = S \cup T$ and glb$\{S, T\} =
S \cap T$, so $(\mathcal{P}(\mathcal{U}), \subseteq)$ is a lattice.
EXAMPLE 7.51 Consider the poset in Example 7.38(d). Here we find, for example, that
lub{2, 3} = 6, lub{3, 6} = 6, lub{5, 7} = 35, lub{7, 11} = 385, lub{11, 35} = 385,
and
glb{3, 6} = 3, glb{2, 12} = 2, glb{35, 385} = 35.
However, even though lub{2, 3} exists, there is no glb for the elements 2 and 3. In addition,
we are also lacking (among other considerations) glb{5, 7}, glb{11, 35}, glb{3, 35},
and lub{3, 35}. Consequently, this partial order is not a lattice.
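The values quoted in Example 7.51, and the conclusion that the poset is not a lattice, can be verified with a short search. This sketch restates two-element versions of lub and glb so that it is self-contained; the function names and the use of `None` for "does not exist" are assumptions.

```python
from itertools import combinations


def lub(A, x, y, leq):
    """Least upper bound of {x, y} in A, or None if it does not exist."""
    ubs = [u for u in A if leq(x, u) and leq(y, u)]
    return next((u for u in ubs if all(leq(u, v) for v in ubs)), None)


def glb(A, x, y, leq):
    """Greatest lower bound of {x, y} in A, or None if it does not exist."""
    lbs = [l for l in A if leq(l, x) and leq(l, y)]
    return next((l for l in lbs if all(leq(m, l) for m in lbs)), None)


def is_lattice(A, leq):
    """Definition 7.20: every pair of elements has both a lub and a glb."""
    return all(
        lub(A, x, y, leq) is not None and glb(A, x, y, leq) is not None
        for x, y in combinations(A, 2)
    )


divides = lambda x, y: y % x == 0
A = [2, 3, 5, 6, 7, 11, 12, 35, 385]   # the poset of Example 7.38(d)

print(lub(A, 2, 3, divides), glb(A, 2, 3, divides))   # 6 None
print(lub(A, 3, 35, divides))                         # None
print(is_lattice(A, divides))                         # False
print(is_lattice([1, 2, 4, 8], divides))              # True
```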
EXERCISES 7.3

1. Draw the Hasse diagram for the poset $(\mathcal{P}(\mathcal{U}), \subseteq)$, where $\mathcal{U} = \{1, 2, 3, 4\}$.
2. Let $A = \{1, 2, 3, 6, 9, 18\}$, and define $\mathcal{R}$ on $A$ by $x \mathrel{\mathcal{R}} y$ if $x \mid y$. Draw the Hasse diagram for the poset $(A, \mathcal{R})$.
3. Let $(A, \mathcal{R}_1)$, $(B, \mathcal{R}_2)$ be two posets. On $A \times B$, define relation $\mathcal{R}$ by $(a, b) \mathrel{\mathcal{R}} (x, y)$ if $a \mathrel{\mathcal{R}_1} x$ and $b \mathrel{\mathcal{R}_2} y$. Prove that $\mathcal{R}$ is a partial order.
4. If $\mathcal{R}_1$, $\mathcal{R}_2$ in Exercise 3 are total orders, is $\mathcal{R}$ a total order?
5. Topologically sort the Hasse diagram in part (a) of Example 7.38.
6. For $A = \{a, b, c, d, e\}$, the Hasse diagram for the poset $(A, \mathcal{R})$ is shown in Fig. 7.23. (a) Determine the relation matrix for $\mathcal{R}$. (b) Construct the directed graph $G$ (on $A$) that is associated with $\mathcal{R}$. (c) Topologically sort the poset $(A, \mathcal{R})$.
7. The directed graph $G$ for a relation $\mathcal{R}$ on set $A = \{1, 2, 3, 4\}$ is shown in Fig. 7.24. (a) Verify that $(A, \mathcal{R})$ is a poset and find its Hasse diagram. (b) Topologically sort $(A, \mathcal{R})$. (c) How many more directed edges are needed in Fig. 7.24 to extend $(A, \mathcal{R})$ to a total order?

[Figure 7.23]   [Figure 7.24]

8. Prove that if a poset $(A, \mathcal{R})$ has a least element, it is unique.
9. Prove Theorem 7.5.
10. Give an example of a poset with four maximal elements but no greatest element.
11. If $(A, \mathcal{R})$ is a poset but not a total order, and $\emptyset \ne B \subset A$, does it follow that $(B \times B) \cap \mathcal{R}$ makes $B$ into a poset but not a total order?
12. If $\mathcal{R}$ is a relation on $A$, and $G$ is the associated directed graph, how can one recognize from $G$ that $(A, \mathcal{R})$ is a total order?
13. If $G$ is the directed graph for a relation $\mathcal{R}$ on $A$, with $|A| = n$, and $(A, \mathcal{R})$ is a total order, how many edges (including loops) are there in $G$?
14. Let $M(\mathcal{R})$ be the relation matrix for relation $\mathcal{R}$ on $A$, with $|A| = n$. If $(A, \mathcal{R})$ is a total order, how many 1's appear in $M(\mathcal{R})$?
15. a) Describe the structure of the Hasse diagram for a totally ordered poset $(A, \mathcal{R})$, where $|A| = n \ge 1$.
b) For a set $A$ where $|A| = n \ge 1$, how many relations on $A$ are total orders?
16. a) For $A = \{a_1, a_2, \ldots, a_n\}$, let $(A, \mathcal{R})$ be a poset. If $M(\mathcal{R})$ is the corresponding relation matrix, how can we recognize a maximal or minimal element of the poset from $M(\mathcal{R})$?
b) How can one recognize the existence of a greatest or least element in $(A, \mathcal{R})$ from the relation matrix $M(\mathcal{R})$?
17. Let $\mathcal{U} = \{1, 2, 3, 4\}$, with $A = \mathcal{P}(\mathcal{U})$, and let $\mathcal{R}$ be the subset relation on $A$. For each of the following subsets $B$ (of $A$), determine the lub and glb of $B$.
a) $B = \{\{1\}, \{2\}\}$
b) $B = \{\{1\}, \{2\}, \{3\}, \{1, 2\}\}$
c) $B = \{\emptyset, \{1\}, \{2\}, \{1, 2\}\}$
d) $B = \{\{1\}, \{1, 2\}, \{1, 3\}, \{1, 2, 3\}\}$
e) $B = \{\{1\}, \{2\}, \{3\}, \{1, 2\}, \{1, 3\}, \{2, 3\}\}$
18. Let $\mathcal{U} = \{1, 2, 3, 4, 5, 6, 7\}$, with $A = \mathcal{P}(\mathcal{U})$, and let $\mathcal{R}$ be the subset relation on $A$. For $B = \{\{1\}, \{2\}, \{2, 3\}\} \subseteq A$, determine each of the following.
a) The number of upper bounds of $B$ that contain (i) three elements of $\mathcal{U}$; (ii) four elements of $\mathcal{U}$; (iii) five elements of $\mathcal{U}$
b) The number of upper bounds that exist for $B$
c) The lub for $B$
d) The number of lower bounds that exist for $B$
e) The glb for $B$
19. Define the relation $\mathcal{R}$ on the set Z by $a \mathrel{\mathcal{R}} b$ if $a - b$ is a nonnegative even integer. Verify that $\mathcal{R}$ defines a partial order for Z. Is this partial order a total order?
20. For $X = \{0, 1\}$, let $A = X \times X$. Define the relation $\mathcal{R}$ on $A$ by $(a, b) \mathrel{\mathcal{R}} (c, d)$ if (i) $a < c$; or (ii) $a = c$ and $b \le d$. (a) Prove that $\mathcal{R}$ is a partial order for $A$. (b) Determine all minimal and maximal elements for this partial order. (c) Is there a least element? Is there a greatest element? (d) Is this partial order a total order?
21. Let $X = \{0, 1, 2\}$ and $A = X \times X$. Define the relation $\mathcal{R}$ on $A$ as in Exercise 20. Answer the same questions posed in Exercise 20 for this relation $\mathcal{R}$ and set $A$.
22. For $n \in \mathbf{Z}^+$, let $X = \{0, 1, 2, \ldots, n - 1, n\}$ and $A = X \times X$. Define the relation $\mathcal{R}$ on $A$ as in Exercise 20. Remember that each element in this total order $\mathcal{R}$ is an ordered pair whose components are themselves ordered pairs. How many such elements are there in $\mathcal{R}$?
23. Let $(A, \mathcal{R})$ be a poset. Prove or disprove each of the following statements.
a) If $(A, \mathcal{R})$ is a lattice, then it is a total order.
b) If $(A, \mathcal{R})$ is a total order, then it is a lattice.
24. If $(A, \mathcal{R})$ is a lattice, with $A$ finite, prove that $(A, \mathcal{R})$ has a greatest element and a least element.
25. For $A = \{a, b, c, d, e, v, w, x, y, z\}$, consider the poset $(A, \mathcal{R})$ whose Hasse diagram is shown in Fig. 7.25. Find
a) glb{b, c}   b) glb{b, w}
c) glb{e, x}   d) lub{c, b}
e) lub{d, x}   f) lub{c, e}
g) lub{a, v}
Is $(A, \mathcal{R})$ a lattice? Is there a maximal element? a minimal element? a greatest element? a least element?

[Figure 7.25]

26. Given partial orders $(A, \mathcal{R})$ and $(B, \mathcal{S})$, a function $f: A \to B$ is called order-preserving if for all $x, y \in A$, $x \mathrel{\mathcal{R}} y \Rightarrow f(x) \mathrel{\mathcal{S}} f(y)$. How many such order-preserving functions are there for each of the following, where $\mathcal{R}$, $\mathcal{S}$ both denote $\le$ (the usual "less than or equal to" relation)?
a) $A = \{1, 2, 3, 4\}$, $B = \{1, 2\}$;
b) $A = \{1, \ldots, n\}$, $n \ge 1$, $B = \{1, 2\}$;
c) $A = \{a_1, a_2, \ldots, a_n\} \subset \mathbf{Z}^+$, $n \ge 1$, $a_1 < a_2 < \cdots < a_n$, $B = \{1, 2\}$;
d) $A = \{1, 2\}$, $B = \{1, 2, 3, 4\}$;
e) $A = \{1, 2\}$, $B = \{1, \ldots, n\}$, $n \ge 1$; and
f) $A = \{1, 2\}$, $B = \{b_1, b_2, \ldots, b_n\} \subset \mathbf{Z}^+$, $n \ge 1$, $b_1 < b_2 < \cdots < b_n$.
27. Let $p$, $q$, $r$, $s$ be four distinct primes and $m, n, k, \ell \in \mathbf{Z}^+$. How many edges are there in the Hasse diagram of all positive divisors of (a) $p^3$; (b) $p^m$; (c) $p^3 q^2$; (d) $p^m q^n$; (e) $p^3 q^2 r^4$; (f) $p^m q^n r^k$; (g) $p^3 q^2 r^4 s^7$; and (h) $p^m q^n r^k s^\ell$?
28. Find the number of ways to totally order the partial order of all positive-integer divisors of (a) 24; (b) 75; and (c) 1701.
29. Let $p$, $q$ be distinct primes and $k \in \mathbf{Z}^+$. If there are 429 ways to totally order the partial order of positive-integer divisors of $p^k q$, how many positive-integer divisors are there for this partial order?
30. For $m, n \in \mathbf{Z}^+$, let $A$ be the set of all $m \times n$ (0, 1)-matrices. Prove that the "precedes" relation of Definition 7.11 makes $A$ into a poset.
7.4
Equivalence Relations and Partitions
As we noted earlier in Definition 7.7, a relation $\mathcal{R}$ on a set $A$ is an equivalence relation
if it is reflexive, symmetric, and transitive. For any set $A \ne \emptyset$, the relation of equality is
an equivalence relation on $A$, where two elements of $A$ are related if they are identical;
equality thus establishes the property of "sameness" among the elements of $A$.
If we consider the relation $\mathcal{R}$ on Z defined by $x \mathrel{\mathcal{R}} y$ if $x - y$ is a multiple of 2, then $\mathcal{R}$
is an equivalence relation on Z where all even integers are related, as are all odd integers.
Here, for example, we do not have $4 = 8$, but we do have $4 \mathrel{\mathcal{R}} 8$, for we no longer care
about the size of a number but are concerned with only two properties: "evenness" and
"oddness." This relation splits Z into two subsets consisting of the odd and even integers:
$\mathbf{Z} = \{\ldots, -3, -1, 1, 3, \ldots\} \cup \{\ldots, -4, -2, 0, 2, 4, \ldots\}$. This splitting up of Z is an
example of a partition, a concept closely related to the equivalence relation. In this section
we investigate this relationship and see how it helps us count the number of equivalence
relations on a finite set.
Definition 7.21 Given a set $A$ and index set $I$, let $\emptyset \ne A_i \subseteq A$ for each $i \in I$. Then $\{A_i\}_{i \in I}$ is a partition of
$A$ if
a) $A = \bigcup_{i \in I} A_i$ and b) $A_i \cap A_j = \emptyset$, for all $i, j \in I$ where $i \ne j$.
Each subset $A_i$ is called a cell or block of the partition.
EXAMPLE 7.52 If $A = \{1, 2, 3, \ldots, 10\}$, then each of the following determines a partition of $A$:
a) $A_1 = \{1, 2, 3, 4, 5\}$, $A_2 = \{6, 7, 8, 9, 10\}$
b) $A_1 = \{1, 2, 3\}$, $A_2 = \{4, 6, 7, 9\}$, $A_3 = \{5, 8, 10\}$
c) $A_i = \{i, i + 5\}$, $1 \le i \le 5$
In these three examples we note how each element of $A$ belongs to exactly one cell in each
partition.
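The two conditions of Definition 7.21, together with the requirement that each cell be nonempty, can be checked directly for the three collections in Example 7.52. This is a minimal sketch; the function name `is_partition` is an assumption.

```python
def is_partition(A, cells):
    """Check that the cells are nonempty, pairwise disjoint, and have union A."""
    cells = [set(c) for c in cells]
    union = set().union(*cells)
    nonempty = all(len(c) > 0 for c in cells)
    covers = union == set(A)
    # Finite sets are pairwise disjoint exactly when their sizes add up
    # to the size of their union.
    disjoint = sum(len(c) for c in cells) == len(union)
    return nonempty and covers and disjoint


A = range(1, 11)
part_a = [{1, 2, 3, 4, 5}, {6, 7, 8, 9, 10}]
part_b = [{1, 2, 3}, {4, 6, 7, 9}, {5, 8, 10}]
part_c = [{i, i + 5} for i in range(1, 6)]

print(is_partition(A, part_a), is_partition(A, part_b), is_partition(A, part_c))
print(is_partition({1, 2, 3}, [{1, 2}, {2, 3}]))   # False: the cells overlap
```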
EXAMPLE 7.53 Let $A = \mathbf{R}$ and, for each $i \in \mathbf{Z}$, let $A_i = [i, i + 1)$. Then $\{A_i\}_{i \in \mathbf{Z}}$ is a partition of R.
Now just how do partitions come into play with equivalence relations?
Definition 7.22 Let $\mathcal{R}$ be an equivalence relation on a set $A$. For each $x \in A$, the equivalence class of $x$,
denoted $[x]$, is defined by $[x] = \{y \in A \mid y \mathrel{\mathcal{R}} x\}$.
EXAMPLE 7.54 Define the relation ℛ on Z by x ℛ y if 4 | (x − y). Since ℛ is reflexive, symmetric, and
transitive, it is an equivalence relation and we find that
[0] = {..., −8, −4, 0, 4, 8, 12, ...} = {4k | k ∈ Z}
[1] = {..., −7, −3, 1, 5, 9, 13, ...} = {4k + 1 | k ∈ Z}
[2] = {..., −6, −2, 2, 6, 10, 14, ...} = {4k + 2 | k ∈ Z}
[3] = {..., −5, −1, 3, 7, 11, 15, ...} = {4k + 3 | k ∈ Z}.
But what about [n], where n is an integer other than 0, 1, 2, or 3? For example, what
is [6]? We claim that [6] = [2] and to prove this we use Definition 3.2 (for the equality of
sets) as follows. If x ∈ [6], then from Definition 7.22 we know that x ℛ 6. Here this means
that 4 divides (x − 6), so x − 6 = 4k for some k ∈ Z. But then x − 6 = 4k ⇒ x − 2 =
4(k + 1) ⇒ 4 divides (x − 2) ⇒ x ℛ 2 ⇒ x ∈ [2], so [6] ⊆ [2]. For the opposite inclusion
start with an element y in [2]. Then y ∈ [2] ⇒ y ℛ 2 ⇒ 4 divides (y − 2) ⇒ y − 2 = 4l for
some l ∈ Z ⇒ y − 6 = 4(l − 1), where l − 1 ∈ Z ⇒ 4 divides (y − 6) ⇒ y ℛ 6 ⇒ y ∈ [6],
so [2] ⊆ [6]. From the two inclusions it now follows that [6] = [2], as claimed.
Further, we also find, for example, that [2] = [—2] = [—6], [51] = [3], and [17] = [1].
Most important, {[0], [1], [2], [3]} provides a partition of Z.
[Note: Here the index set for the partition is implicit. If, for instance, we let A₀ = [0],
A₁ = [1], A₂ = [2], and A₃ = [3], then one possible index set I (as in Definition 7.21) is
{0, 1, 2, 3}. When a collection of sets is called a partition (of a given set) but no index set
is specified, the reader should realize that the situation is like the one given here, where
the index set is implicit.]
EXAMPLE 7.55 Define the relation ℛ on the set Z by a ℛ b if a² = b² (or, a = ±b). For all a ∈ Z, we have
a² = a², so a ℛ a and ℛ is reflexive. Should a, b ∈ Z with a ℛ b, then a² = b² and it
follows that b² = a², or b ℛ a. Consequently, relation ℛ is symmetric. Finally, suppose
that a, b, c ∈ Z with a ℛ b and b ℛ c. Then a² = b² and b² = c², so a² = c² and a ℛ c.
This makes the given relation transitive. Having established the three needed properties,
we now know that ℛ is an equivalence relation.
What can we say about the corresponding partition of Z?
Here one finds that [0] = {0}, [1] = [−1] = {−1, 1}, [2] = [−2] = {−2, 2}, and, in general,
for each n ∈ Z⁺, [n] = [−n] = {−n, n}. Furthermore, we have the partition
Z = ⋃_{n=0}^{∞} [n] = ⋃_{n∈N} [n] = {0} ∪ (⋃_{n=1}^{∞} {−n, n}) = {0} ∪ (⋃_{n∈Z⁺} {−n, n}).
These examples lead us to the following general situation.
THEOREM 7.6 If ℛ is an equivalence relation on a set A, and x, y ∈ A, then (a) x ∈ [x]; (b) x ℛ y if and
only if [x] = [y]; and (c) [x] = [y] or [x] ∩ [y] = ∅.
Proof:
a) This result follows from the reflexive property of ℛ.
b) The proof here is somewhat reminiscent of what was done in Example 7.54.
If x ℛ y, let w ∈ [x]. Then w ℛ x and because ℛ is transitive, w ℛ y. Hence w ∈ [y]
and [x] ⊆ [y]. With ℛ symmetric, x ℛ y ⇒ y ℛ x. So if t ∈ [y], then t ℛ y and by
the transitive property, t ℛ x. Hence t ∈ [x] and [y] ⊆ [x]. Consequently, [x] = [y].
Conversely, let [x] = [y]. Since x ∈ [x] by part (a), then x ∈ [y] or x ℛ y.
c) This property tells us that two equivalence classes can be related in only one of two
possible ways. Either they are identical or they are disjoint.
We assume that [x] ≠ [y] and show how it then follows that [x] ∩ [y] = ∅. If
[x] ∩ [y] ≠ ∅, then let v ∈ A with v ∈ [x] and v ∈ [y]. Then v ℛ x, v ℛ y, and, since
ℛ is symmetric, x ℛ v. Now (x ℛ v and v ℛ y) ⇒ x ℛ y, by the transitive property.
Also x ℛ y ⇒ [x] = [y] by part (b). This contradicts the assumption that [x] ≠ [y],
so we reject the supposition that [x] ∩ [y] ≠ ∅, and the result follows.
Note that if ℛ is an equivalence relation on A, then by parts (a) and (c) of Theorem 7.6
the distinct equivalence classes determined by ℛ provide us with a partition of A.
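For a finite set, Theorem 7.6 suggests a simple way to compute the partition: by part (b), an element belongs to a class exactly when it is related to any one representative of that class. The sketch below is one possible Python rendering under that assumption; the function name `equivalence_classes` and the finite windows of Z are illustrative, and the predicate passed in is assumed to be a genuine equivalence relation.

```python
def equivalence_classes(elements, related):
    """Group `elements` into classes of the equivalence relation `related`.

    Comparing against a single representative per class is enough,
    because x R y holds if and only if [x] = [y] (Theorem 7.6(b)).
    """
    classes = []
    for x in elements:
        for cls in classes:
            representative = next(iter(cls))
            if related(x, representative):
                cls.add(x)
                break
        else:
            classes.append({x})   # x starts a new class
    return classes


# Example 7.54 restricted to a finite window of Z: x R y if 4 divides x - y.
mod4 = equivalence_classes(range(-8, 16), lambda x, y: (x - y) % 4 == 0)

# Example 7.55 restricted to {-3, ..., 3}: a R b if a^2 = b^2.
squares = equivalence_classes(range(-3, 4), lambda a, b: a * a == b * b)

for cls in mod4:
    print(sorted(cls))
print(sorted(sorted(c) for c in squares))
```

On the window used here the first relation yields four classes, and the class containing 6 is the same set as the class containing 2, in line with the claim [6] = [2].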
EXAMPLE 7.56 a) If A = {1, 2, 3, 4, 5} and ℛ = {(1, 1), (2, 2), (2, 3), (3, 2), (3, 3), (4, 4), (4, 5),
(5, 4), (5, 5)}, then ℛ is an equivalence relation on A. Here [1] = {1}, [2] = {2, 3} =
[3], [4] = {4, 5} = [5], and A = [1] ∪ [2] ∪ [4] with [1] ∩ [2] = ∅, [1] ∩ [4] = ∅, and
[2] ∩ [4] = ∅. So {[1], [2], [4]} determines a partition of A.
b) Consider part (d) of Example 7.16 once again. We have A = {1, 2, 3, 4, 5, 6, 7}, B =
{x, y, z}, and f: A → B is the onto function
f = {(1, x), (2, z), (3, x), (4, y), (5, z), (6, y), (7, x)}.
The relation ℛ defined on A by a ℛ b if f(a) = f(b) was shown to be an equivalence
relation. Here
f⁻¹(x) = {1, 3, 7} = [1] (= [3] = [7]),
f⁻¹(y) = {4, 6} = [4] (= [6]), and
f⁻¹(z) = {2, 5} = [2] (= [5]).
With A = [1] ∪ [4] ∪ [2] = f⁻¹(x) ∪ f⁻¹(y) ∪ f⁻¹(z), we see that
{f⁻¹(x), f⁻¹(y), f⁻¹(z)} determines a partition of A.
In fact, for any nonempty sets A, B, if f: A → B is an onto function, then A =
⋃_{b∈B} f⁻¹(b) and {f⁻¹(b) | b ∈ B} provides us with a partition of A.
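The preimages in part (b) can be collected in a single pass over the function. This is a small sketch, assuming the function is stored as a Python dictionary; the helper name `fibers` is an illustrative choice.

```python
from collections import defaultdict


def fibers(f):
    """Map each image value b to its preimage f^{-1}(b), for f given as a dict."""
    preimages = defaultdict(set)
    for a, b in f.items():
        preimages[b].add(a)
    return dict(preimages)


# The onto function f: A -> B of Example 7.56(b).
f = {1: "x", 2: "z", 3: "x", 4: "y", 5: "z", 6: "y", 7: "x"}
cells = fibers(f)
print(cells)
```

Because every element of A has exactly one image, the preimages are automatically nonempty, pairwise disjoint, and cover A.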
EXAMPLE 7.57 In the programming language C++ a nonexecutable specification statement called the union
construct allows two or more variables in a given program to refer to the same memory
location.
For example, within a program the statements
union
{
int a;
int c;
int p;
};
union
{
int up;
int down;
};
inform the C++ compiler that the integer variables a, c, and p will share one memory
location while the integer variables up and down will share another. Here the set of all
program variables is partitioned by the equivalence relation ℛ, where v₁ ℛ v₂ if v₁ and v₂
are program variables that share the same memory location.
EXAMPLE 7.58 Having seen examples of how an equivalence relation induces a partition of a set, we now
go backward. If an equivalence relation ℛ on A = {1, 2, 3, 4, 5, 6, 7} induces the partition
A = {1, 2} ∪ {3} ∪ {4, 5, 7} ∪ {6}, what is ℛ?
Consider the cell {1, 2} of the partition. This subset implies that [1] = {1, 2} = [2], and so
(1, 1), (2, 2), (1, 2), (2, 1) ∈ ℛ. (The first two ordered pairs are necessary for the reflexive
property of ℛ; the others preserve symmetry.)
In like manner, the cell {4, 5, 7} implies that under ℛ, [4] = [5] = [7] = {4, 5, 7} and
that, as an equivalence relation, ℛ must contain {4, 5, 7} × {4, 5, 7}. In fact,
ℛ = ({1, 2} × {1, 2}) ∪ ({3} × {3}) ∪ ({4, 5, 7} × {4, 5, 7}) ∪ ({6} × {6}),
and
|ℛ| = 2² + 1² + 3² + 1² = 15.
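The construction in this example, taking the union of the products C × C over the cells C, is easy to reproduce. The following sketch uses an illustrative helper named `relation_from_partition` and then checks the three defining properties directly.

```python
from itertools import product


def relation_from_partition(cells):
    """Return the union of C x C over all cells C of the partition."""
    return {pair for cell in cells for pair in product(cell, repeat=2)}


cells = [{1, 2}, {3}, {4, 5, 7}, {6}]
A = set().union(*cells)
R = relation_from_partition(cells)

reflexive = all((a, a) in R for a in A)
symmetric = all((b, a) in R for (a, b) in R)
transitive = all((a, d) in R for (a, b) in R for (c, d) in R if b == c)

print(len(R), reflexive, symmetric, transitive)
```

The size of the resulting relation is the sum of the squares of the cell sizes, which gives 15 for the partition above.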
The results in Examples 7.54, 7.55, 7.56, and 7.58 lead us to the following.
THEOREM 7.7 If A is a set, then
a) any equivalence relation ℛ on A induces a partition of A, and
b) any partition of A gives rise to an equivalence relation ℛ on A.
Proof: Part (a) follows from parts (a) and (c) of Theorem 7.6. For part (b), given a partition
{Aᵢ}_{i∈I} of A, define relation ℛ on A by x ℛ y, if x and y are in the same cell of the partition.
We leave to the reader the details of verifying that ℛ is an equivalence relation.
On the basis of this theorem and the examples we have examined, we state the next
result. A proof for it is outlined in Exercise 16 at the end of the section.
THEOREM 7.8 For any set A, there is a one-to-one correspondence between the set of equivalence relations
on A and the set of partitions of A.
We are primarily concerned with using this result for finite sets.
EXAMPLE 7.59 a) If A = {1, 2, 3, 4, 5, 6}, how many relations on A are equivalence
relations?
We solve this problem by counting the partitions of A, realizing that a partition
of A is a distribution of the (distinct) elements of A into identical containers, with
no container left empty. From Section 5.3 we know, for example, that there are
S(6, 2) partitions of A into two identical nonempty containers. Using the Stirling
numbers of the second kind, as the number of containers varies from 1 to 6, we have
Σᵢ₌₁⁶ S(6, i) = 203 different partitions of A. Consequently, there are 203 equivalence
relations on A.
b) How many of the equivalence relations in part (a) satisfy 1, 2 ∈ [4]?
Identifying 1, 2, and 4 as the “same” element under these equivalence relations, we
count as in part (a) for the set B = {1, 3, 5, 6} and find that there are Σᵢ₌₁⁴ S(4, i) = 15
equivalence relations on A for which [1] = [2] = [4].
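The two counts in this example can be reproduced with the standard recurrence for the Stirling numbers of the second kind, S(m, n) = S(m − 1, n − 1) + n·S(m − 1, n). The sketch below is a hedged illustration; the function names `stirling2` and `count_equivalence_relations` are not from the text.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(m, n):
    """Number of ways to split m distinct objects into n nonempty identical containers."""
    if m == n:
        return 1
    if n == 0 or n > m:
        return 0
    # Either the last object sits alone in its own container, or it joins
    # one of the n containers already holding the other m - 1 objects.
    return stirling2(m - 1, n - 1) + n * stirling2(m - 1, n)


def count_equivalence_relations(m):
    """Number of partitions (hence equivalence relations) of an m-element set, m >= 1."""
    return sum(stirling2(m, i) for i in range(1, m + 1))


print(count_equivalence_relations(6))  # part (a)
print(count_equivalence_relations(4))  # part (b), after merging 1, 2, and 4
```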
We close by noting that if A is a finite set with |A| = n, then for all n ≤ r ≤ n², there is
an equivalence relation ℛ on A with |ℛ| = r if and only if there exist n₁, n₂, ..., nₖ ∈ Z⁺
with Σᵢ₌₁ᵏ nᵢ = n and Σᵢ₌₁ᵏ nᵢ² = r.
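This closing criterion can be explored by listing the ways of writing n as a sum of positive integers (the possible cell sizes) and recording the sum of squares for each. The following is a minimal sketch; the helper names `integer_partitions` and `achievable_sizes` are illustrative.

```python
def integer_partitions(n, largest=None):
    """Yield the ways to write n as a nonincreasing sum of positive integers."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in integer_partitions(n - first, first):
            yield (first,) + rest


def achievable_sizes(n):
    """Sorted list of all r for which some equivalence relation on n elements has |R| = r."""
    return sorted({sum(k * k for k in sizes) for sizes in integer_partitions(n)})


print(achievable_sizes(3))
print(15 in achievable_sizes(7))  # cell sizes 3, 2, 1, 1, as in Example 7.58
```

For n = 3 the cell sizes (1, 1, 1), (2, 1), and (3) give the relation sizes 3, 5, and 9, so values such as r = 4 are not attainable even though 3 ≤ 4 ≤ 9.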
EXERCISES 7.4
1. Determine whether each of the following collections of sets is a partition for the given set A. If the collection is not a partition, explain why it fails to be.
a) A = {1, 2, 3, 4, 5, 6, 7, 8}; A₁ = {4, 5, 6}, A₂ = {1, 8}, A₃ = {2, 3, 7}.
b) A = {a, b, c, d, e, f, g, h}; A₁ = {d, e}, A₂ = {a, c, d}, A₃ = {f, h}, A₄ = {b, g}.
2. Let A = {1, 2, 3, 4, 5, 6, 7, 8}. In how many ways can we partition A as A₁ ∪ A₂ ∪ A₃ with
a) 1, 2 ∈ A₁, 3, 4 ∈ A₂, and 5, 6, 7 ∈ A₃?
b) 1, 2 ∈ A₁, 3, 4 ∈ A₂, 5, 6 ∈ A₃, and |A₁| = 3?
c) 1, 2 ∈ A₁, 3, 4 ∈ A₂, and 5, 6 ∈ A₃?
3. If A = {1, 2, 3, 4, 5} and ℛ is the equivalence relation on A that induces the partition A = {1, 2} ∪ {3, 4} ∪ {5}, what is ℛ?
4. For A = {1, 2, 3, 4, 5, 6}, ℛ = {(1, 1), (1, 2), (2, 1), (2, 2), (3, 3), (4, 4), (4, 5), (5, 4), (5, 5), (6, 6)} is an equivalence relation on A. (a) What are [1], [2], and [3] under this equivalence relation? (b) What partition of A does ℛ induce?
5. If A = A₁ ∪ A₂ ∪ A₃, where A₁ = {1, 2}, A₂ = {2, 3, 4}, and A₃ = {5}, define relation ℛ on A by x ℛ y if x and y are in the same subset Aᵢ, for 1 ≤ i ≤ 3. Is ℛ an equivalence relation?
6. For A = R², define ℛ on A by (x₁, y₁) ℛ (x₂, y₂) if x₁ = x₂.
a) Verify that ℛ is an equivalence relation on A.
b) Describe geometrically the equivalence classes and partition of A induced by ℛ.
7. Let A = {1, 2, 3, 4, 5} × {1, 2, 3, 4, 5}, and define ℛ on A by (x₁, y₁) ℛ (x₂, y₂) if x₁ + y₁ = x₂ + y₂.
a) Verify that ℛ is an equivalence relation on A.
b) Determine the equivalence classes [(1, 3)], [(2, 4)], and [(1, 1)].
c) Determine the partition of A induced by ℛ.
8. If A = {1, 2, 3, 4, 5, 6, 7}, define ℛ on A by (x, y) ∈ ℛ if x − y is a multiple of 3.
a) Show that ℛ is an equivalence relation on A.
b) Determine the equivalence classes and partition of A induced by ℛ.
9. For A = {(−4, −20), (−3, −9), (−2, −4), (−1, 11), (−1, −3), (1, 2), (1, 5), (2, 10), (2, 14), (3, 6), (4, 8), (4, 12)} define the relation ℛ on A by (a, b) ℛ (c, d) if ad = bc.
a) Verify that ℛ is an equivalence relation on A.
b) Find the equivalence classes [(2, 14)], [(−3, −9)], and [(4, 8)].
c) How many cells are there in the partition of A induced by ℛ?
10. Let A be a nonempty set and fix the set B, where B ⊆ A. Define the relation ℛ on 𝒫(A) by X ℛ Y, for X, Y ⊆ A, if B ∩ X = B ∩ Y.
a) Verify that ℛ is an equivalence relation on 𝒫(A).
b) If A = {1, 2, 3} and B = {1, 2}, find the partition of 𝒫(A) induced by ℛ.
c) If A = {1, 2, 3, 4, 5} and B = {1, 2, 3}, find [X] if X = {1, 3, 5}.
d) For A = {1, 2, 3, 4, 5} and B = {1, 2, 3}, how many equivalence classes are in the partition induced by ℛ?
11. How many of the equivalence relations on A = {a, b, c, d, e, f} have (a) exactly two equivalence classes of size 3? (b) exactly one equivalence class of size 3? (c) one equivalence class of size 4? (d) at least one equivalence class with three or more elements?
12. Let A = {v, w, x, y, z}. Determine the number of relations on A that are (a) reflexive and symmetric; (b) equivalence relations; (c) reflexive and symmetric but not transitive; (d) equivalence relations that determine exactly two equivalence classes; (e) equivalence relations where w ∈ [x]; (f) equivalence relations where v, w ∈ [x]; (g) equivalence relations where w ∈ [x] and y ∈ [z]; and (h) equivalence relations where w ∈ [x], y ∈ [z], and [x] ≠ [z].
13. If |A| = 30 and the equivalence relation ℛ on A partitions A into (disjoint) equivalence classes A₁, A₂, and A₃, where |A₁| = |A₂| = |A₃|, what is |ℛ|?
14. Let A = {1, 2, 3, 4, 5, 6, 7}. For each of the following values of r, determine an equivalence relation ℛ on A with |ℛ| = r, or explain why no such relation exists. (a) r = 6; (b) r = 7; (c) r = 8; (d) r = 9; (e) r = 11; (f) r = 22; (g) r = 23; (h) r = 30; (i) r = 31.
15. Provide the details for the proof of part (b) of Theorem 7.7.
16. For any set A ≠ ∅, let P(A) denote the set of all partitions of A, and let E(A) denote the set of all equivalence relations on A. Define the function f: E(A) → P(A) as follows: If ℛ is an equivalence relation on A, then f(ℛ) is the partition of A induced by ℛ. Prove that f is one-to-one and onto, thus establishing Theorem 7.8.
17. Let f: A → B. If {B₁, B₂, B₃, ..., Bₙ} is a partition of B, prove that {f⁻¹(Bᵢ) | 1 ≤ i ≤ n, f⁻¹(Bᵢ) ≠ ∅} is a partition of A.
7.5
Finite State Machines:
The Minimization Process
In Section 6.3 we encountered two finite state machines that performed the same task but
had different numbers of internal states. (See Figs. 6.9 and 6.10.) The machine with the
larger number of internal states contains redundant states, that is, states that can be eliminated
because other states will perform their functions. Since minimization of the number of
states in a machine reduces its complexity and cost, we seek a process for transforming a
given machine into one that has no redundant internal states. This process is known as the
minimization process, and its development relies on the concepts of equivalence relation
and partition.
Starting with a given finite state machine M = (S, ℐ, 𝒪, ν, ω), we define the relation
E₁ on S by s₁ E₁ s₂ if ω(s₁, x) = ω(s₂, x), for all x ∈ ℐ. This relation E₁ is an equivalence
relation on S, and it partitions S into subsets such that two states are in the same subset if
they produce the same output for each x ∈ ℐ. Here the states s₁, s₂ are called 1-equivalent.
For each k ∈ Z⁺, we say that the states s₁, s₂ are k-equivalent if ω(s₁, x) = ω(s₂, x) for
all x ∈ ℐᵏ. Here ω is the extension of the given output function to S × ℐ*. The relation of
k-equivalence is also an equivalence relation on S; it partitions S into subsets of k-equivalent
states. We write s₁ Eₖ s₂ to denote that s₁ and s₂ are k-equivalent.
Finally, if s₁, s₂ ∈ S and s₁, s₂ are k-equivalent for all k ≥ 1, then we call s₁ and s₂
equivalent and write s₁ E s₂. When this happens, we find that if we keep s₁ in our machine,
then s₂ will be redundant and can be removed. Hence our objective is to determine the
partition of S induced by E and to select one state for each equivalence class. Then we shall
have a minimal realization of the given machine.
To accomplish this, let us start with the following observations.
a) If two states in a machine are not 2-equivalent, could they possibly be 3-equivalent?
(or k-equivalent, for k ≥ 4?)
The answer is no. If s₁, s₂ ∈ S and s₁ E̸₂ s₂ (that is, s₁ and s₂ are not 2-equivalent),
then there is at least one string xy ∈ ℐ² such that ω(s₁, xy) = v₁v₂ ≠ w₁w₂ =
ω(s₂, xy), where v₁, v₂, w₁, w₂ ∈ 𝒪. So with regard to E₃, we find that s₁ E̸₃ s₂ because
for any z ∈ ℐ, ω(s₁, xyz) = v₁v₂v₃ ≠ w₁w₂w₃ = ω(s₂, xyz).
In general, to find states that are (k + 1)-equivalent, we look at states that are
k-equivalent.
b) Now suppose that s₁, s₂ ∈ S and s₁ E₂ s₂. We wish to determine whether s₁ E₃ s₂.
That is, does ω(s₁, x₁x₂x₃) = ω(s₂, x₁x₂x₃) for all strings x₁x₂x₃ ∈ ℐ³? Consider
what happens. First we get ω(s₁, x₁) = ω(s₂, x₁), because s₁ E₂ s₂ ⇒ s₁ E₁ s₂.
Then there is a transition to the states ν(s₁, x₁) and ν(s₂, x₁). Consequently,
ω(s₁, x₁x₂x₃) = ω(s₂, x₁x₂x₃) if ω(ν(s₁, x₁), x₂x₃) = ω(ν(s₂, x₁), x₂x₃) [that is,
if ν(s₁, x₁) E₂ ν(s₂, x₁)].
In general, for s₁, s₂ ∈ S, where s₁ Eₖ s₂, we find that s₁ Eₖ₊₁ s₂ if (and only if)
ν(s₁, x) Eₖ ν(s₂, x) for all x ∈ ℐ.
With these observations to guide us, we now present an algorithm for the minimization of
a finite state machine M.
Step 1: Set k = 1. We determine the states that are 1-equivalent by examining the
rows in the state table for M. For s₁, s₂ ∈ S it follows that s₁ E₁ s₂ when s₁, s₂ have
the same output rows.
Let P₁ be the partition of S induced by E₁.
Step 2: Having determined Pₖ, we obtain Pₖ₊₁ by noting that if s₁ Eₖ s₂, then
s₁ Eₖ₊₁ s₂ when ν(s₁, x) Eₖ ν(s₂, x) for all x ∈ ℐ. We have s₁ Eₖ s₂ if s₁, s₂ are
in the same cell of the partition Pₖ. Likewise, ν(s₁, x) Eₖ ν(s₂, x) for each x ∈ ℐ,
if ν(s₁, x) and ν(s₂, x) are in the same cell of the partition Pₖ. In this way Pₖ₊₁ is
obtained from Pₖ.
Step 3: If Pₖ₊₁ = Pₖ, the process is complete. We select one state from each equivalence
class and these states yield a minimal realization of M.
If Pₖ₊₁ ≠ Pₖ, we increase k by 1 and return to step (2).
We illustrate the algorithm in the following example.
EXAMPLE 7.60 With ℐ = 𝒪 = {0, 1}, let M be given by the state table shown in Table 7.1. Looking at the
output rows, we see that s₃ and s₄ are 1-equivalent, as are s₂, s₅, and s₆. Here E₁ partitions
S as follows:
P₁: {s₁}, {s₂, s₅, s₆}, {s₃, s₄}.
For each s ∈ S and each k ∈ Z⁺, s Eₖ s, so as we continue this process to determine P₂, we
shall not concern ourselves with equivalence classes of only one state.
Since s₃ E₁ s₄, there is a chance that we could have s₃ E₂ s₄. Here ν(s₃, 0) = s₂,
ν(s₄, 0) = s₅ with s₂ E₁ s₅, and ν(s₃, 1) = s₄, ν(s₄, 1) = s₃ with s₄ E₁ s₃. Hence ν(s₃, x) E₁
ν(s₄, x), for all x ∈ ℐ, and s₃ E₂ s₄. Similarly, ν(s₂, 0) = s₅, ν(s₅, 0) = s₂ with s₅ E₁ s₂,
and ν(s₂, 1) = s₂, ν(s₅, 1) = s₅ with s₂ E₁ s₅. Thus s₂ E₂ s₅. Finally, ν(s₅, 0) = s₂ and
ν(s₆, 0) = s₁, but s₂ E̸₁ s₁, so s₅ E̸₂ s₆. (Why don’t we investigate the possibility of
s₂ E₂ s₆?) Equivalence relation E₂ partitions S as follows:
P₂: {s₁}, {s₂, s₅}, {s₃, s₄}, {s₆}.
Since P₂ ≠ P₁, we continue the process to get P₃. In determining whether s₂ E₃ s₅, we
see that ν(s₂, 0) = s₅, ν(s₅, 0) = s₂, and s₅ E₂ s₂. Also, ν(s₂, 1) = s₂, ν(s₅, 1) = s₅, and
s₂ E₂ s₅. With ν(s₂, x) E₂ ν(s₅, x) for all x ∈ ℐ, we have s₂ E₃ s₅. For s₃, s₄, (ν(s₃, 0) = s₂)
E₂ (s₅ = ν(s₄, 0)) and (ν(s₃, 1) = s₄) E₂ (s₃ = ν(s₄, 1)), so s₃ E₃ s₄ and E₃ induces the
partition P₃: {s₁}, {s₂, s₅}, {s₃, s₄}, {s₆}.
Table 7.1
          ν           ω
        0    1      0    1
  s₁    s₄   s₃     0    1
  s₂    s₅   s₂     1    0
  s₃    s₂   s₄     0    0
  s₄    s₅   s₃     0    0
  s₅    s₂   s₅     1    0
  s₆    s₁   s₆     1    0

Table 7.2
          ν           ω
        0    1      0    1
  s₁    s₃   s₃     0    1
  s₂    s₂   s₂     1    0
  s₃    s₂   s₃     0    0
  s₆    s₁   s₆     1    0
Now P₃ = P₂ so the process is completed, as indicated in step (3) of the algorithm. We
find that s₅ and s₄ may be regarded as redundant states. Removing them from the table, and
replacing all further occurrences of them by s₂ and s₃, respectively, we arrive at Table 7.2.
This is a minimal machine that performs the same tasks as the machine given in Table 7.1.
If we do not want states that skip a subscript, we can always relabel the states in this
minimal machine. Here we would have s₁, s₂, s₃, s₄ (= s₆), but this s₄ is not the same s₄
we started with in Table 7.1.
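The three steps of the algorithm can be sketched in a few lines of Python and run on the machine of Table 7.1. This is an illustrative sketch only: the dictionary encoding of the state table and the names `minimization_partitions`, `nu`, and `omega` are assumptions made here, not notation from the text.

```python
def minimization_partitions(states, inputs, nu, omega):
    """Return [P1, P2, ..., Pk], stopping at the first k with P(k+1) = Pk."""

    def group_by(key):
        groups = {}
        for s in states:
            groups.setdefault(key(s), []).append(s)
        return list(groups.values())

    # Step 1: states are 1-equivalent when their output rows agree.
    partition = group_by(lambda s: tuple(omega[s, x] for x in inputs))
    history = [partition]
    while True:
        cell_of = {s: i for i, cell in enumerate(partition) for s in cell}
        # Step 2: s and t stay together when they share a cell of Pk and
        # nu(s, x), nu(t, x) share a cell of Pk for every input x.
        refined = group_by(
            lambda s: (cell_of[s],) + tuple(cell_of[nu[s, x]] for x in inputs)
        )
        # Step 3: a refinement with the same number of cells is the same partition.
        if len(refined) == len(partition):
            return history
        partition = refined
        history.append(partition)


STATES = ["s1", "s2", "s3", "s4", "s5", "s6"]
INPUTS = ["0", "1"]
# Rows of Table 7.1: (nu on 0, nu on 1, omega on 0, omega on 1).
TABLE = {
    "s1": ("s4", "s3", "0", "1"),
    "s2": ("s5", "s2", "1", "0"),
    "s3": ("s2", "s4", "0", "0"),
    "s4": ("s5", "s3", "0", "0"),
    "s5": ("s2", "s5", "1", "0"),
    "s6": ("s1", "s6", "1", "0"),
}
nu = {(s, x): TABLE[s][i] for s in STATES for i, x in enumerate(INPUTS)}
omega = {(s, x): TABLE[s][2 + i] for s in STATES for i, x in enumerate(INPUTS)}

partitions = minimization_partitions(STATES, INPUTS, nu, omega)
for k, p in enumerate(partitions, start=1):
    print(f"P{k}:", p)
```

Keeping the first state listed in each cell of the final partition gives the states s1, s2, s3, s6 that appear in Table 7.2.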
You may be wondering how we knew that we could stop the process when P₃ = P₂. For
after all, couldn’t it happen that perhaps P₄ ≠ P₃, or that P₄ = P₃ but P₅ ≠ P₄? To prove
that this never occurs, we define the following idea.
Definition 7.23 If P₁, P₂ are partitions of a set A, then P₂ is called a refinement of P₁, and we write P₂ ≤ P₁,
if every cell of P₂ is contained in a cell of P₁. When P₂ ≤ P₁ and P₂ ≠ P₁ we write P₂ < P₁.
This occurs when at least one cell in P₂ is properly contained in a cell in P₁.
In the minimization process of Example 7.60, we had P₃ = P₂ < P₁. Whenever we
apply the algorithm, as we get Pₖ₊₁ from Pₖ, we always find that Pₖ₊₁ ≤ Pₖ, because
(k + 1)-equivalence implies k-equivalence. So each successive partition refines the
preceding partition.
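Definition 7.23 translates directly into a containment test. The sketch below checks it on the partitions P₁ and P₂ of Example 7.60; the function names `is_refinement` and `is_proper_refinement` are illustrative.

```python
def is_refinement(p2, p1):
    """True if every cell of p2 is contained in some cell of p1 (P2 <= P1)."""
    return all(any(set(c2) <= set(c1) for c1 in p1) for c2 in p2)


def is_proper_refinement(p2, p1):
    """True if P2 <= P1 and the two partitions differ (P2 < P1)."""
    same = {frozenset(c) for c in p2} == {frozenset(c) for c in p1}
    return is_refinement(p2, p1) and not same


P1 = [{"s1"}, {"s2", "s5", "s6"}, {"s3", "s4"}]
P2 = [{"s1"}, {"s2", "s5"}, {"s3", "s4"}, {"s6"}]
P3 = [{"s1"}, {"s2", "s5"}, {"s3", "s4"}, {"s6"}]

print(is_proper_refinement(P2, P1), is_refinement(P3, P2), is_refinement(P1, P2))
```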
THEOREM 7.9 In applying the minimization process, if k ≥ 1 and Pₖ and Pₖ₊₁ are partitions with Pₖ₊₁ =
Pₖ, then Pᵣ₊₁ = Pᵣ for all r ≥ k + 1.
Proof: If not, let r (≥ k + 1) be the smallest subscript such that Pᵣ₊₁ ≠ Pᵣ. Then Pᵣ₊₁ < Pᵣ,
so there exist s₁, s₂ ∈ S with s₁ Eᵣ s₂ but s₁ E̸ᵣ₊₁ s₂. But s₁ Eᵣ s₂ ⇒ ν(s₁, x) Eᵣ₋₁ ν(s₂, x),
for all x ∈ ℐ, and with Pᵣ = Pᵣ₋₁, we then find that ν(s₁, x) Eᵣ ν(s₂, x), for all x ∈ ℐ, so
s₁ Eᵣ₊₁ s₂. Consequently, Pᵣ₊₁ = Pᵣ.
We close this section with the following related idea. Let M be a finite state machine
with s₁, s₂ ∈ S, and s₁, s₂ not equivalent. If s₁ E̸₁ s₂, then these states produce different
output rows in the state table for M. In this case it is easy to find an x ∈ ℐ such that
ω(s₁, x) ≠ ω(s₂, x), and this distinguishes these nonequivalent states. Otherwise, s₁ and
s₂ produce the same output rows in the table but there is a smallest integer k ≥ 1 such that
s₁ Eₖ s₂ but s₁ E̸ₖ₊₁ s₂. Now if we are to distinguish these states, we need to find a string x =
x₁x₂···xₖxₖ₊₁ ∈ ℐᵏ⁺¹ such that ω(s₁, x) ≠ ω(s₂, x), even though ω(s₁, x₁x₂···xₖ) =
ω(s₂, x₁x₂···xₖ). Such a string x is called a distinguishing string for the states s₁ and s₂.
There may be more than one such string, but each has the same (minimal) length k + 1.
Before we try to find a distinguishing string for two nonequivalent states in a specific
finite state machine, let us examine the major idea at play here. So suppose that s₁, s₂ ∈ S
and that for some (fixed) k ∈ Z⁺ we have s₁ Eₖ s₂ but s₁ E̸ₖ₊₁ s₂. What can we conclude?
We find that
s₁ E̸ₖ₊₁ s₂ ⇒ ∃x₁ ∈ ℐ [ν(s₁, x₁) E̸ₖ ν(s₂, x₁)]
⇒ ∃x₁ ∈ ℐ ∃x₂ ∈ ℐ [ν(ν(s₁, x₁), x₂) E̸ₖ₋₁ ν(ν(s₂, x₁), x₂)],
or ⇒ ∃x₁ ∈ ℐ ∃x₂ ∈ ℐ [ν(s₁, x₁x₂) E̸ₖ₋₁ ν(s₂, x₁x₂)]
⇒ ∃x₁, x₂, x₃ ∈ ℐ [ν(s₁, x₁x₂x₃) E̸ₖ₋₂ ν(s₂, x₁x₂x₃)]
⇒ ...
⇒ ∃x₁, x₂, ..., xᵢ ∈ ℐ [ν(s₁, x₁x₂···xᵢ) E̸ₖ₊₁₋ᵢ ν(s₂, x₁x₂···xᵢ)]
⇒ ...
⇒ ∃x₁, x₂, ..., xₖ ∈ ℐ [ν(s₁, x₁x₂···xₖ) E̸₁ ν(s₂, x₁x₂···xₖ)].
This last statement about the states ν(s₁, x₁x₂···xₖ), ν(s₂, x₁x₂···xₖ) not being
1-equivalent implies that we can find xₖ₊₁ ∈ ℐ where
ω(ν(s₁, x₁x₂···xₖ), xₖ₊₁) ≠ ω(ν(s₂, x₁x₂···xₖ), xₖ₊₁). (1)
That is, these single output symbols from 𝒪 are different.
The result denoted by Eq. (1) also implies that
ω(s₁, x) = ω(s₁, x₁x₂···xₖxₖ₊₁) ≠ ω(s₂, x₁x₂···xₖxₖ₊₁) = ω(s₂, x).
In this case we have two output strings of length k + 1 that agree for the first k symbols
and differ in the (k + 1)st symbol.
We shall use the preceding observations, together with the partitions P₁, P₂, ..., Pₖ,
Pₖ₊₁ of the minimization process, in order to deal with the following example.
EXAMPLE 7.61 From Example 7.60 we have the partitions shown below. Here s₂ E₁ s₆, but s₂ E̸₂ s₆. So we
seek an input string x of length 2 such that ω(s₂, x) ≠ ω(s₆, x).
1) We start at P₂, where for s₂, s₆, we find that ν(s₂, 0) = s₅ and ν(s₆, 0) = s₁ are in
different cells of P₁; that is,
s₅ = ν(s₂, 0) E̸₁ ν(s₆, 0) = s₁.
[The input 0 and output 1 (for ω(s₂, 0) = 1 = ω(s₆, 0)) provide the labels for the
arrows going from the cells of P₂ to those of P₁.]
P₂: {s₁}, {s₂, s₅}, {s₃, s₄}, {s₆}
P₁: {s₁}, {s₂, s₅, s₆}, {s₃, s₄}
[Figure: arrows labeled 0, 1 lead from s₂ and s₆ in P₂ to s₅ and s₁ in P₁.]
2) Working with s₁ and s₅ in the partition P₁ we see that
ω(ν(s₂, 0), 0) = ω(s₅, 0) = 1 ≠ 0 = ω(s₁, 0) = ω(ν(s₆, 0), 0).
3) Hence x = 00 is a minimal distinguishing string for s₂ and s₆ because ω(s₂, 00) =
11 ≠ 10 = ω(s₆, 00).
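A shortest distinguishing string can also be found by a breadth-first search over pairs of states, which is a different route from the partition-based construction used in the example but arrives at a string of the same minimal length. The sketch below assumes the machine of Table 7.1 encoded as dictionaries and single-character input symbols; the name `distinguishing_string` is illustrative.

```python
from collections import deque


def distinguishing_string(s, t, inputs, nu, omega):
    """Return a shortest input string on which s and t give different outputs, or None."""
    seen = {(s, t)}
    queue = deque([(s, t, "")])
    while queue:
        p, q, prefix = queue.popleft()
        for x in inputs:
            if omega[p, x] != omega[q, x]:
                return prefix + x          # outputs agree on prefix, differ on x
            nxt = (nu[p, x], nu[q, x])
            if nxt not in seen:
                seen.add(nxt)
                queue.append((*nxt, prefix + x))
    return None                            # the two states are equivalent


STATES = ["s1", "s2", "s3", "s4", "s5", "s6"]
INPUTS = ["0", "1"]
# Rows of Table 7.1: (nu on 0, nu on 1, omega on 0, omega on 1).
TABLE = {
    "s1": ("s4", "s3", "0", "1"),
    "s2": ("s5", "s2", "1", "0"),
    "s3": ("s2", "s4", "0", "0"),
    "s4": ("s5", "s3", "0", "0"),
    "s5": ("s2", "s5", "1", "0"),
    "s6": ("s1", "s6", "1", "0"),
}
nu = {(s, x): TABLE[s][i] for s in STATES for i, x in enumerate(INPUTS)}
omega = {(s, x): TABLE[s][2 + i] for s in STATES for i, x in enumerate(INPUTS)}

print(distinguishing_string("s2", "s6", INPUTS, nu, omega))
print(distinguishing_string("s2", "s5", INPUTS, nu, omega))
```

For s2 and s6 the search returns the string 00 found in the example, and for the equivalent states s2 and s5 it finds no distinguishing string.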
EXAMPLE 7.62 Applying the minimization process to the machine given by the state table in part (a) of
Table 7.3, we obtain the partitions in part (b) of the table. (Here P₄ = P₃.) We find that the
states s₁ and s₄ are 2-equivalent but not 3-equivalent. To construct a minimal distinguishing
string for these two states, we proceed as follows:
1) Since s₁ E̸₃ s₄, we use partitions P₃ and P₂ to find x₁ ∈ ℐ (namely, x₁ = 1) so that
(ν(s₁, 1) = s₂) E̸₂ (s₅ = ν(s₄, 1)).
2) Then ν(s₁, 1) E̸₂ ν(s₄, 1) ⇒ ∃x₂ ∈ ℐ (here x₂ = 1) with ν(ν(s₁, 1), 1) E̸₁ ν(ν(s₄, 1), 1),
or ν(s₁, 11) E̸₁ ν(s₄, 11). We used the partitions P₂ and P₁ to obtain x₂ = 1.
3) Now we use the partition P₁ where we find that for x₃ = 1 ∈ ℐ,
ω(ν(s₁, 11), 1) = 0 ≠ 1 = ω(ν(s₄, 11), 1) or
ω(s₁, 111) = 100 ≠ 101 = ω(s₄, 111).
In part (b) of Table 7.3, we see how we arrived at the minimal distinguishing string
x = 111 for these states. (Also note how this part of the table indicates that 11 is a minimal
distinguishing string for the states s₂ and s₅, which are 1-equivalent but not 2-equivalent.)
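Table 7.3
(a)
          ν
        0    1
  s₁    s₄   s₂
  s₂    s₅   s₂
  s₃    s₄   s₂
  s₄    s₃   s₅
  s₅    s₂   s₃
[The ω columns of part (a) are illegible in the source.]
(b)
P₃: {s₁, s₃}, {s₂}, {s₄}, {s₅}
P₂: {s₁, s₃, s₄}, {s₂}, {s₅}
P₁: {s₁, s₃, s₄}, {s₂, s₅}
[Figure: arrows labeled with input, output pairs connect the cells of successive partitions.]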
A great deal more can be done with finite state machines. Among other omissions, we
have avoided offering any rigorous explanation or proof of why the minimization process
works. The interested reader should consult the chapter references for more on this topic.
EXERCISES 7.5
1. Apply the minimization process to each machine in Table 7.4.
2. For the machine in Table 7.4(c), find a (minimal) distinguishing string for each given pair of states: (a) s₁, s₅; (b) s₂, s₃; (c) s₅, s₇.
3. Let M be the finite state machine given in the state diagram shown in Fig. 7.26.
a) Minimize machine M.
b) Find a (minimal) distinguishing string for each given pair of states: (i) s₃, s₆; (ii) s₃, s₄; and (iii) s₁, s₅.
Table 7.4
ν ω
0 1 0 1
Sy S4 Ss] 0 1
s. | 53 83} 1 O
s3 |S; Sa} 1 O
54 S| SZ 0 1
S5 S3 S3 1 0
(a)
@
0 ] 0 1
s} | Se 83 | 0 O
$2 | S5 84 |Q 1
$3 | 8 So] 1]
S54 54 53 1 0
$5} 82 Sg |O 1
S56 S4 S56 0 0
Figure 7.26
(b)
@
0 1 QO 1
Ss, | S 83 | 0 O
S52 53 S| 0 O
$3 | 8S S,|O O
S4 S57 S4 0 0
Ss | S56 S7 | 0 O
S56 S55 S52 1 0
S7 S4 S| 0 0
(9)
7.6
Summary and Historical Review
Once again the relation concept surfaces. In Chapter 5 this idea was introduced as a gen-
eralization of the function. Here in Chapter 7 we concentrated on relations and the special
properties: reflexive, symmetric, antisymmetric, and transitive. As a result we focused on
two special kinds of relations: partial orders and equivalence relations.
A relation ℛ on a set A is a partial order, making A into a poset, if ℛ is reflexive,
antisymmetric, and transitive. Such a relation generalizes the familiar “less than or equal
to” relation on the real numbers. Try to imagine calculus, or even elementary algebra,
without it! Or take a simple computer program and see what happens if the program is
entered into the computer haphazardly, permuting the order of the statements. Order is
with us wherever we turn. We have grown so accustomed to it that we sometimes take it
for granted. The origins of the subject of partially ordered sets (and lattices) came about
during the nineteenth century in the work of George Boole (1815-1864), Richard Dedekind
(1831-1916), Charles Sanders Peirce (1839-1914), and Ernst Schröder (1841-1902). The
work of Garrett Birkhoff (1911-1996) in the 1930s, however, is where the initial work on
partially ordered sets and lattices was developed to the point where these areas emerged as
subjects in their own right.
For a finite poset, the Hasse diagram, a special type of directed graph, provides a pictorial
representation of the order defined by the poset; it also proves useful when a total order,
including the given partial order, is needed. These diagrams are named for the German
number theorist Helmut Hasse (1898-1979). He introduced them in his textbook Höhere
Algebra (published in 1926) as an aid in the study of the solutions of polynomial equations.
The method we employed to derive a total order from a partial order is called topological
sorting and it is used in the solution of PERT (Program Evaluation and Review Technique)
networks. As mentioned earlier, this method was developed and first used by the U.S. Navy.
Although the equivalence relation differs from the partial order in only one property,
it is quite different in structure and application. We make no attempt to trace the origin
of the equivalence relation, but the ideas behind the reflexive, symmetric, and transitive
properties can be found in I Principii di Geometria (1889), the work of the Italian mathematician
Giuseppe Peano (1858-1932). The work of Carl Friedrich Gauss (1777-1855) on
congruence, which he developed in the 1790s, also utilizes these ideas in spirit, if not in
name.
Giuseppe Peano (1858-1932) Carl Friedrich Gauss (1777-1855)
Basically, an equivalence relation ℛ on a set A generalizes equality; it induces a characteristic
of “sameness” among the elements of A. This “sameness” notion then causes the
set A to be partitioned into subsets called equivalence classes. Conversely, we find that a
partition of a set A induces an equivalence relation on A. The partition of a set arises in
many places in mathematics and computer science. In computer science many searching
algorithms rely on a technique that successively reduces the size of a given set A that is
being searched. By partitioning A into smaller and smaller subsets, we apply the searching
procedure in a more efficient manner. Each successive partition refines its predecessor, the
key needed, for example, in the minimization process for finite state machines.
Throughout the chapter we emphasized the interplay between relations, directed graphs,
and (0, 1)-matrices. These matrices provide a rectangular array of information about a
relation, or graph, and prove useful in certain calculations. Storing information like this, in
rectangular arrays and in consecutive memory locations, has been practiced in computer
science since the late 1940s and early 1950s. For more on the historical background of such
considerations, consult pages 456-462 of D. E. Knuth [3]. Another way to store information
about a graph is the adjacency list representation. (See Supplementary Exercise 11.) In the
study of data structures, linked lists and doubly linked lists are prominent in implementing
such a representation. For more on this, consult the text by A. V. Aho, J. E. Hopcroft, and
J.D. Ullman [1].
With regard to graph theory, we are in an area of mathematics that dates back to 1736
when the Swiss mathematician Leonhard Euler (1707-1783) solved the problem of the
seven bridges of Königsberg. Since then, much more has evolved in this area, especially in
conjunction with data structures in computer science.
For similar coverage of some of the topics in this chapter, see Chapter 3 of D. F. Stanat
and D. F. McAllister [6]. An interesting presentation of the “Equivalence Problem” can be
found on pages 353-355 of D. E. Knuth [3] for those wanting more information on the role
of the computer in conjunction with the concept of the equivalence relation.
The early work on the development of the minimization process can be found in the
paper by E. F. Moore [5], which builds upon prior ideas of D. A. Huffman [2]. Chapter 10
of Z. Kohavi [4] covers the minimization process for different types of finite state machines
and includes some hardware considerations in their design.
REFERENCES
1. Aho, Alfred V., Hopcroft, John E., and Ullman, Jeffrey D. Data Structures and Algorithms.
Reading, Mass.: Addison-Wesley, 1983.
2. Huffman, David A. “The Synthesis of Sequential Switching Circuits.” Journal of the Franklin
Institute 257, no. 3: pp. 161-190; no. 4: pp. 275-303, 1954.
3. Knuth, Donald E. The Art of Computer Programming, 2nd ed., Volume 1, Fundamental Algo-
rithms. Reading, Mass.: Addison-Wesley, 1973.
4. Kohavi, Zvi. Switching and Finite Automata Theory, 2nd ed. New York: McGraw-Hill, 1978.
5. Moore, E. F. “Gedanken-experiments on Sequential Machines.” Automata Studies, Annals of
Mathematics Studies, no. 34: pp. 129-153. Princeton, N.J.: Princeton University Press, 1956.
6. Stanat, Donald F., and McAllister, David F. Discrete Mathematics in Computer Science. Engle-
wood Cliffs, N.J.: Prentice-Hall, 1977.
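SUPPLEMENTARY EXERCISES

1. Let $A$ be a set and $I$ an index set where, for each $i \in I$, $\mathcal{R}_i$ is a relation on $A$. Prove or disprove each of the following.
a) $\bigcup_{i \in I} \mathcal{R}_i$ is reflexive on $A$ if and only if each $\mathcal{R}_i$ is reflexive on $A$.
b) $\bigcap_{i \in I} \mathcal{R}_i$ is reflexive on $A$ if and only if each $\mathcal{R}_i$ is reflexive on $A$.
2. Repeat Exercise 1 with "reflexive" replaced by (i) symmetric; (ii) antisymmetric; (iii) transitive.
3. For a set $A$, let $\mathcal{R}_1$ and $\mathcal{R}_2$ be symmetric relations on $A$. If $\mathcal{R}_1 \circ \mathcal{R}_2 \subseteq \mathcal{R}_2 \circ \mathcal{R}_1$, prove that $\mathcal{R}_1 \circ \mathcal{R}_2 = \mathcal{R}_2 \circ \mathcal{R}_1$.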
4. For each of the following relations on the set specified, determine whether the relation is reflexive, symmetric, antisymmetric, or transitive. Also determine whether it is a partial order or an equivalence relation, and, if the latter, describe the partition induced by the relation.
a) $\mathcal{R}$ is the relation on $\mathbf{Q}$ where $a \,\mathcal{R}\, b$ if $|a - b| < 1$.
b) Let $T$ be the set of all triangles in the plane. For $t_1, t_2 \in T$, define $t_1 \,\mathcal{R}\, t_2$ if $t_1$, $t_2$ have the same area.
c) For $T$ as in part (b), define $\mathcal{R}$ by $t_1 \,\mathcal{R}\, t_2$ if at least two sides of $t_1$ are contained within the perimeter of $t_2$.
d) Let $A = \{1, 2, 3, 4, 5, 6, 7\}$. Define $\mathcal{R}$ on $A$ by $x \,\mathcal{R}\, y$ if $xy > 10$.
5. For sets $A$, $B$, and $C$ with relations $\mathcal{R}_1 \subseteq A \times B$ and $\mathcal{R}_2 \subseteq B \times C$, prove or disprove that $(\mathcal{R}_1 \circ \mathcal{R}_2)^c = \mathcal{R}_2^c \circ \mathcal{R}_1^c$.
6. For a set $A$, let $C = \{P_i \mid P_i \text{ is a partition of } A\}$. Define relation $\mathcal{R}$ on $C$ by $P_i \,\mathcal{R}\, P_j$ if $P_i \le P_j$; that is, $P_i$ is a refinement of $P_j$.
a) Verify that $\mathcal{R}$ is a partial order on $C$.
b) For $A = \{1, 2, 3, 4, 5\}$, let $P_i$, $1 \le i \le 4$, be the following partitions: $P_1$: $\{1, 2\}$, $\{3, 4, 5\}$; $P_2$: $\{1, 2\}$, $\{3, 4\}$, $\{5\}$; $P_3$: $\{1\}$, $\{2\}$, $\{3, 4, 5\}$; $P_4$: $\{1, 2\}$, $\{3\}$, $\{4\}$, $\{5\}$. Draw the Hasse diagram for $C = \{P_i \mid 1 \le i \le 4\}$, where $C$ is partially ordered by refinement.
7. Give an example of a poset with 5 minimal (maximal) elements but no least (greatest) element.
8. Let $A = \{1, 2, 3, 4, 5, 6\} \times \{1, 2, 3, 4, 5, 6\}$. Define $\mathcal{R}$ on $A$ by $(x_1, y_1) \,\mathcal{R}\, (x_2, y_2)$, if $x_1 y_1 = x_2 y_2$.
a) Verify that $\mathcal{R}$ is an equivalence relation on $A$.
b) Determine the equivalence classes $[(1, 1)]$, $[(2, 2)]$, $[(3, 2)]$, and $[(4, 3)]$.
9. If the complete graph $K_n$ has 45 edges, what is $n$?
10. Let $\mathcal{F} = \{f: \mathbf{Z}^+ \to \mathbf{R}\}$; that is, $\mathcal{F}$ is the set of all functions with domain $\mathbf{Z}^+$ and codomain $\mathbf{R}$.
a) Define the relation $\mathcal{R}$ on $\mathcal{F}$ by $g \,\mathcal{R}\, h$, for $g, h \in \mathcal{F}$, if $g$ is dominated by $h$ and $h$ is dominated by $g$; that is, $g \in \Theta(h)$. (See Exercises 14, 15 for Section 5.7.) Prove that $\mathcal{R}$ is an equivalence relation on $\mathcal{F}$.
b) For $f \in \mathcal{F}$, let $[f]$ denote the equivalence class of $f$ for the relation $\mathcal{R}$ of part (a). Let $\mathcal{F}'$ be the set of equivalence classes induced by $\mathcal{R}$. Define the relation $\mathcal{S}$ on $\mathcal{F}'$ by $[g] \,\mathcal{S}\, [h]$, for $[g], [h] \in \mathcal{F}'$, if $g$ is dominated by $h$. Verify that $\mathcal{S}$ is a partial order.
c) For $\mathcal{R}$ in part (a), let $f, f_1, f_2 \in \mathcal{F}$ with $f_1, f_2 \in [f]$. If $f_1 + f_2: \mathbf{Z}^+ \to \mathbf{R}$ is defined by $(f_1 + f_2)(n) = f_1(n) + f_2(n)$, for $n \in \mathbf{Z}^+$, prove or disprove that $f_1 + f_2 \in [f]$.
11. We have seen that the adjacency matrix can be used to represent a graph. However, this method proves to be rather inefficient when there are many 0's (that is, few edges) present. A better method uses the adjacency list representation, which is made up of an adjacency list for each vertex $v$ and an index list. For the graph shown in Fig. 7.27, the representation is given by the two lists in Table 7.5.

Figure 7.27 [directed graph; image not reproduced]

Table 7.5

Adjacency List      Index List
 1    1              1    1
 2    2              2    4
 3    3              3    5
 4    6              4    7
 5    1              5    9
 6    6              6    9
 7    3              7   11
 8    5              8   11
 9    2
10    7

For each vertex $v$ in the graph, we list, preferably in numerical order, each vertex $w$ that is adjacent from $v$. Hence for 1, we list 1, 2, 3 as the first three adjacencies in our adjacency list. Next to 2 in the index list we place a 4, which tells us where to start looking in the adjacency list for the adjacencies from 2. Since there is a 5 to the right of 3 in the index list, we know that the only adjacency from 2 is 6. Likewise, the 7 to the right of 4 in the index list directs us to the seventh entry in the adjacency list (namely, 3), and we find that vertex 4 is adjacent to vertices 3 (the seventh vertex in the adjacency list) and 5 (the eighth vertex in the adjacency list). We stop at vertex 5 because of the 9 to the right of vertex 5 in the index list. The 9's in the index list next to 5 and 6 indicate that no vertex is adjacent from vertex 5. In a similar way, the 11's next to 7 and 8 in the index list tell us that vertex 7 is not adjacent to any vertex in the given directed graph.
In general, this method provides an easy way to determine the vertices adjacent from a vertex $v$. They are listed in the positions index($v$), index($v$) + 1, ..., index($v$ + 1) − 1 of the adjacency list.
Finally, the last pair of entries in the index list (namely, 8 and 11) is a "phantom" that indicates where the adjacency list would pick up from if there were an eighth vertex in the graph.
Represent each of the graphs in Fig. 7.28 in this manner.
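The adjacency list and index list described in Exercise 11 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `build_lists` and `adjacent_from` are hypothetical, the edge set is read off Table 7.5 rather than from the figure, and position 0 of each Python list is left unused so that positions agree with the 1-based numbering of the table.

```python
def build_lists(num_vertices, edges):
    """Return (adjacency, index) for a directed graph on vertices 1..num_vertices.

    Position 0 of both lists is unused so that positions match the 1-based
    tables in the text. index carries one extra "phantom" entry for the
    nonexistent vertex num_vertices + 1.
    """
    neighbors = {v: [] for v in range(1, num_vertices + 1)}
    for v, w in edges:
        neighbors[v].append(w)

    adjacency = [None]  # position 0 unused
    index = [None]      # position 0 unused
    for v in range(1, num_vertices + 1):
        index.append(len(adjacency))            # where v's adjacencies start
        adjacency.extend(sorted(neighbors[v]))  # listed in numerical order
    index.append(len(adjacency))                # phantom entry
    return adjacency, index


def adjacent_from(v, adjacency, index):
    """Vertices adjacent from v: positions index(v), ..., index(v + 1) - 1."""
    return adjacency[index[v]:index[v + 1]]


# Directed edges implied by Table 7.5 (an assumption; the figure is not shown).
edges = [(1, 1), (1, 2), (1, 3), (2, 6), (3, 1), (3, 6),
         (4, 3), (4, 5), (6, 2), (6, 7)]
adjacency, index = build_lists(7, edges)

print(adjacency[1:])                        # the Adjacency List column
print(index[1:])                            # the Index List column
print(adjacent_from(4, adjacency, index))   # adjacencies from vertex 4
```

Storing only the start position for each vertex, plus one phantom entry, is what lets a single slice recover the adjacencies of any vertex, including a vertex with none.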
380 Chapter 7 Relations: The Second Time Around
Figure 7.28 [graphs not reproduced]

12. The adjacency list representation of a directed graph $G$ is given by the lists in Table 7.6. Construct $G$ from this representation.

Table 7.6

Adjacency List      Index List
 1    2              1    1
 2    3              2    4
 3    6              3    5
 4    3              4    5
 5    3              5    8
 6    4              6   10
 7    5              7   10
 8    3              8   10
 9    6

13. Let $G$ be an undirected graph with vertex set $V$. Define the relation $\mathcal{R}$ on $V$ by $v \,\mathcal{R}\, w$ if $v = w$ or if there is a path from $v$ to $w$ (or from $w$ to $v$ since $G$ is undirected). (a) Prove that $\mathcal{R}$ is an equivalence relation on $V$. (b) What can we say about the associated partition?
14. a) For the finite state machine given in Table 7.7, determine a minimal machine that is equivalent to it.
b) Find a minimal string that distinguishes states $s_4$ and $s_6$.

Table 7.7

            ν            ω
         0     1      0    1
 s1     s7    s6      1    0
 s2     s7    s7      0    0
 s3     s7    s2      1    0
 s4     s2    s3      0    0
 s5     s3    s7      0    0
 s6     s4    s1      0    0
 s7     s3    s5      1    0
 s8     s7    s3      0    0

15. At the computer center Maria is faced with running 10 computer programs which, because of priorities, are restricted by the following conditions: (a) 10 > 8, 3; (b) 8 > 7; (c) 7 > 5; (d) 3 > 9, 6; (e) 6 > 4, 1; (f) 9 > 4, 5; (g) 4, 5, 1 > 2; where, for example, 10 > 8, 3 means that program number 10 must be run before programs 8 and 3. Determine an order for running these programs so that the priorities are satisfied.
16. a) Draw the Hasse diagram for the set of positive integer divisors of (i) 2; (ii) 4; (iii) 6; (iv) 8; (v) 12; (vi) 16; (vii) 24; (viii) 30; (ix) 32.
b) For all $2 \le n \le 35$, show that the Hasse diagram for the set of positive-integer divisors of $n$ looks like one of the nine diagrams in part (a). (Ignore the numbers at the vertices and concentrate on the structure given by the vertices and edges.) What happens for $n = 36$?
c) For $n \in \mathbf{Z}^+$, $\tau(n)$ = the number of positive-integer divisors of $n$. (See Supplementary Exercise 32 in Chapter 5.) Let $m, n \in \mathbf{Z}^+$ and $S$, $T$ be the sets of all positive-integer divisors of $m$, $n$, respectively. The results of parts (a) and (b) imply that if the Hasse diagrams of $S$, $T$ are structurally the same, then $\tau(m) = \tau(n)$. But is the converse true?
d) Show that each Hasse diagram in part (a) is a lattice if we define glb$\{x, y\}$ = gcd$(x, y)$ and lub$\{x, y\}$ = lcm$(x, y)$.
17. Let $U$ denote the set of all points in and on the unit square shown in Fig. 7.29. That is, $U = \{(x, y) \mid 0 \le x \le 1, 0 \le y \le 1\}$. Define the relation $\mathcal{R}$ on $U$ by $(a, b) \,\mathcal{R}\, (c, d)$ if (1) $(a, b) = (c, d)$, or (2) $b = d$ and $a = 0$ and $c = 1$, or (3) $b = d$ and $a = 1$ and $c = 0$.
a) Verify that $\mathcal{R}$ is an equivalence relation on $U$.
Figure 7.29 [unit square with corners (0, 0), (1, 0), (1, 1), (0, 1)]

b) List the ordered pairs in the equivalence classes $[(0.3, 0.7)]$, $[(0.5, 0)]$, $[(0.4, 1)]$, $[(0, 0.6)]$, $[(1, 0.2)]$. For $0 < a < 1$, $0 < b < 1$, how many ordered pairs are in $[(a, b)]$?
c) If we "glue together" the ordered pairs in each equivalence class, what type of surface comes about?
18. a) For $U = \{1, 2, 3\}$, let $A = \mathcal{P}(U)$. Define the relation $\mathcal{R}$ on $A$ by $B \,\mathcal{R}\, C$ if $B \subseteq C$. How many ordered pairs are there in the relation $\mathcal{R}$?
b) Answer part (a) for $U = \{1, 2, 3, 4\}$.
c) Generalize the results of parts (a) and (b).
19. For $n \in \mathbf{Z}^+$, let $U = \{1, 2, 3, \ldots, n\}$. Define the relation $\mathcal{R}$ on $\mathcal{P}(U)$ by $A \,\mathcal{R}\, B$ if $A \not\subseteq B$ and $B \not\subseteq A$. How many ordered pairs are there in this relation?
20. Let $A$ be a finite nonempty set with $B \subseteq A$ ($B$ fixed), and $|A| = n$, $|B| = m$. Define the relation $\mathcal{R}$ on $\mathcal{P}(A)$ by $X \,\mathcal{R}\, Y$, for $X, Y \subseteq A$, if $X \cap B = Y \cap B$. Then $\mathcal{R}$ is an equivalence relation, as verified in Exercise 10 of Section 7.4. (a) How many equivalence classes are in the partition of $\mathcal{P}(A)$ induced by $\mathcal{R}$? (b) How many subsets of $A$ are in each equivalence class of the partition induced by $\mathcal{R}$?
21. For $A \ne \emptyset$, let $(A, \mathcal{R})$ be a poset, and let $\emptyset \ne B \subseteq A$ such that $\mathcal{R}' = (B \times B) \cap \mathcal{R}$. If $(B, \mathcal{R}')$ is totally ordered, we call $(B, \mathcal{R}')$ a chain in $(A, \mathcal{R})$. In the case where $B$ is finite, we may order the elements of $B$ by $b_1 \,\mathcal{R}'\, b_2 \,\mathcal{R}'\, b_3 \,\mathcal{R}'\, \cdots \,\mathcal{R}'\, b_{n-1} \,\mathcal{R}'\, b_n$ and say that the chain has length $n$. A chain (of length $n$) is called maximal if there is no element $a \in A$ where $a \notin \{b_1, b_2, b_3, \ldots, b_n\}$ and $a \,\mathcal{R}\, b_1$, $b_n \,\mathcal{R}\, a$, or $b_i \,\mathcal{R}\, a \,\mathcal{R}\, b_{i+1}$, for some $1 \le i \le n - 1$.
a) Find two chains of length 3 for the poset given by the Hasse diagram in Fig. 7.20. Find a maximal chain for this poset. How many such maximal chains does it have?
b) For the poset given by the Hasse diagram in Fig. 7.18(d), find two maximal chains of different lengths. What is the length of a longest (maximal) chain for this poset?
c) Let $U = \{1, 2, 3, 4\}$ and $A = \mathcal{P}(U)$. For the poset $(A, \subseteq)$, find two maximal chains. How many such maximal chains are there for this poset?
d) If $U = \{1, 2, 3, \ldots, n\}$, how many maximal chains are there in the poset $(\mathcal{P}(U), \subseteq)$?
22. For $\emptyset \ne C \subseteq A$, let $(C, \mathcal{R}')$ be a maximal chain in the poset $(A, \mathcal{R})$, where $\mathcal{R}' = (C \times C) \cap \mathcal{R}$. If the elements of $C$ are ordered as $c_1 \,\mathcal{R}'\, c_2 \,\mathcal{R}'\, \cdots \,\mathcal{R}'\, c_n$, prove that $c_1$ is a minimal element in $(A, \mathcal{R})$ and that $c_n$ is maximal in $(A, \mathcal{R})$.
23. Let $(A, \mathcal{R})$ be a poset in which the length of a longest (maximal) chain is $n > 2$. Let $M$ be the set of all maximal elements in $(A, \mathcal{R})$, and let $B = A - M$. If $\mathcal{R}' = (B \times B) \cap \mathcal{R}$, prove that the length of a longest chain in $(B, \mathcal{R}')$ is $n - 1$.
24. Let $(A, \mathcal{R})$ be a poset, and let $\emptyset \ne C \subseteq A$. If $(C \times C) \cap \mathcal{R} = \{(c, c) \mid c \in C\}$, then for all distinct $x, y \in C$ we have $x \not\mathcal{R}\, y$ and $y \not\mathcal{R}\, x$. The elements of $C$ are said to form an antichain in the poset $(A, \mathcal{R})$.
a) Find an antichain with three elements for the poset given in the Hasse diagram of Fig. 7.18(d). Determine a largest antichain containing the element 6. Determine a largest antichain for this poset.
b) If $U = \{1, 2, 3, 4\}$, let $A = \mathcal{P}(U)$. Find two different antichains for the poset $(A, \subseteq)$. How many elements occur in a largest antichain for this poset?
c) Prove that in any poset $(A, \mathcal{R})$, the set of all maximal elements and the set of all minimal elements are antichains.
25. Let $(A, \mathcal{R})$ be a poset in which the length of a longest chain is $n$. Use mathematical induction to prove that the elements of $A$ can be partitioned into $n$ antichains $C_1, C_2, \ldots, C_n$ (where $C_i \cap C_j = \emptyset$, for $1 \le i < j \le n$).
26. a) In how many ways can one totally order the partial order of positive-integer divisors of 96?
b) How many of the total orders in part (a) start with 96 > 32?
c) How many of the total orders in part (a) end with 3 > 1?
d) How many of the total orders in part (a) start with 96 > 32 and end with 3 > 1?
e) How many of the total orders in part (a) start with 96 > 48 > 32 > 16?
27. Let $n$ be a fixed positive integer and let $A_n = \{0, 1, \ldots, n\} \subseteq \mathbf{N}$. (a) How many edges are there in the Hasse diagram for the total order $(A_n, \le)$, where "$\le$" is the ordinary "less than or equal to" relation? (b) In how many ways can the edges in the Hasse diagram of part (a) be partitioned so that the edges in each cell (of the partition) provide a path (of one or more edges)? (c) In how many ways can the edges in the Hasse diagram for (Az, ≤) be partitioned so that the edges in each cell (of the partition) provide a path (of one or more edges) and one of the cells is $\{(3, 4), (4, 5), (5, 6), (6, 7)\}$?
PART 2
FURTHER TOPICS IN ENUMERATION

8
The Principle of Inclusion and Exclusion
We now return to the topic of enumeration as we investigate the Principle of Inclusion
and Exclusion. Extending the ideas in the counting problems on Venn diagrams in
Chapter 3, this principle will assist us in establishing the formula we conjectured in Section
5.3 for the number of onto functions $f: A \to B$, where $A$, $B$ are finite (nonempty) sets.
Other applications of this principle will demonstrate its versatile nature in combinatorial
mathematics.
8.1
The Principle of Inclusion and Exclusion
In this section we develop some notation for stating this new counting principle. Then
we establish the principle by a combinatorial argument. Following this, a wide range of
examples demonstrate how this principle may be applied.
We shall motivate the Principle of Inclusion and Exclusion with a series of three exam-
ples, the first two of which will be reminiscent of the work we did with counting and Venn
diagrams in Section 3.3.
EXAMPLE 8.1
Let $S$ represent the set of 100 students enrolled in the freshman engineering program at Central College. Then $|S| = 100$. Now let $c_1$, $c_2$ denote the following conditions (or properties) satisfied by some of the elements of $S$:
$c_1$: A student at Central College is among the 100 students in the freshman engineering program and is enrolled in Freshman Composition.
$c_2$: A student at Central College is among the 100 students in the freshman engineering program and is enrolled in Introduction to Economics.
Suppose that 35 of these 100 students are enrolled in Freshman Composition and that 30 of them are enrolled in Introduction to Economics. We shall denote this by
$$N(c_1) = 35 \quad \text{and} \quad N(c_2) = 30.$$
If nine of these 100 students are enrolled in both Freshman Composition and Introduction to Economics, then we write $N(c_1c_2) = 9$.
Further, of these 100 students, there are $100 - 35 = 65$ who are not taking Freshman Composition. Denoting $|S|$ by $N$, we can designate this by writing $N(\overline{c}_1) = N - N(c_1)$. In a similar way we designate that there are $N(\overline{c}_2) = N - N(c_2) = 100 - 30 = 70$ of these students who are not taking Introduction to Economics. The number who are taking Freshman Composition and who are not taking Introduction to Economics is $N(c_1\overline{c}_2) = N(c_1) - N(c_1c_2) = 35 - 9 = 26$. Likewise, of these 100 students, there are $N(\overline{c}_1c_2) = N(c_2) - N(c_1c_2) = 30 - 9 = 21$ who are enrolled in Introduction to Economics but not in Freshman Composition. Of particular interest are those students (from among these 100 freshmen) who are taking neither Freshman Composition nor Introduction to Economics; that is, they are not taking Freshman Composition and they are also not taking Introduction to Economics. Their number is $N(\overline{c}_1\overline{c}_2)$. And since $N(\overline{c}_1) = N(\overline{c}_1c_2) + N(\overline{c}_1\overline{c}_2)$, we learn that $N(\overline{c}_1\overline{c}_2) = N(\overline{c}_1) - N(\overline{c}_1c_2) = 65 - 21 = 44$.
The preceding observations also demonstrate that
$$\begin{aligned} N(\overline{c}_1\overline{c}_2) &= N(\overline{c}_1) - N(\overline{c}_1c_2) = [N - N(c_1)] - [N(c_2) - N(c_1c_2)] \\ &= N - N(c_1) - N(c_2) + N(c_1c_2) = N - [N(c_1) + N(c_2)] + N(c_1c_2) \\ &= 100 - [35 + 30] + 9 = 44, \end{aligned}$$
as we saw above.
From the Venn diagram in Fig. 8.1, we see that if $N(c_1)$ denotes the number of elements of $S$ in the left-hand circle and $N(c_2)$ denotes the number in the right-hand circle, then $N(c_1c_2)$ is the number of these elements from $S$ in the overlap, while $N(\overline{c}_1\overline{c}_2)$ counts those elements of $S$ that are outside the union of these two circles. Consequently, we see once again, this time from the figure, that
$$N(\overline{c}_1\overline{c}_2) = N - [N(c_1) + N(c_2)] + N(c_1c_2),$$
where the last term is added on because it was eliminated twice in the term $[N(c_1) + N(c_2)]$. (Also, at this point, the reader may wish to look back at the second formula following Example 3.25 to find the same result presented with a different notation.)
Figure 8.1 [Venn diagram: circles for $c_1$ and $c_2$ inside $S$]
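[Before we advance to our next example where we will introduce a third condition, let us note that $N(\overline{c}_1\overline{c}_2)$ is not the same as $N(\overline{c_1c_2})$. For $N(\overline{c_1c_2}) = N - N(c_1c_2) = 100 - 9 = 91$, in this example, while $N(\overline{c}_1\overline{c}_2) = 44$, as we learned earlier. However, $N(\overline{c}_1 \text{ or } \overline{c}_2) = N(\overline{c_1c_2}) = 91 = 65 + 70 - 44 = N(\overline{c}_1) + N(\overline{c}_2) - N(\overline{c}_1\overline{c}_2)$.]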
EXAMPLE 8.2
We start with the same 100 students as in Example 8.1 and the same conditions $c_1$, $c_2$, but now we consider a third condition, given as follows:
$c_3$: A student at Central College is among the 100 students in the freshman engineering program and is enrolled in Fundamentals of Computer Programming.
It is still the case that $N(c_1) = 35$, $N(c_2) = 30$, and $N(c_1c_2) = 9$, but now we are also given that $N(c_3) = 30$, $N(c_1c_3) = 11$, $N(c_2c_3) = 10$, and $N(c_1c_2c_3) = 5$ (that is, there are five of these 100 freshmen who are taking Freshman Composition, Introduction to Economics, and Fundamentals of Computer Programming). Looking to Fig. 8.2, we learn that
$$N(\overline{c}_1\overline{c}_2\overline{c}_3) = N - [N(c_1) + N(c_2) + N(c_3)] + [N(c_1c_2) + N(c_1c_3) + N(c_2c_3)] - N(c_1c_2c_3).$$
So here we have $N(\overline{c}_1\overline{c}_2\overline{c}_3) = 100 - [35 + 30 + 30] + [9 + 11 + 10] - 5 = 30$. That is, out of these 100 students there are 30 who are not enrolled in any of the courses: (i) Freshman Composition; (ii) Introduction to Economics; or (iii) Fundamentals of Computer Programming.
[We also learn here that $N(\overline{c}_3) = 70 = 100 - 30 = N - N(c_3)$, $N(\overline{c}_1\overline{c}_3) = 46 = 100 - [35 + 30] + 11 = N - [N(c_1) + N(c_3)] + N(c_1c_3)$, and $N(\overline{c}_2\overline{c}_3) = 50 = 100 - [30 + 30] + 10 = N - [N(c_2) + N(c_3)] + N(c_2c_3)$. Furthermore, we note the similarity here with the result for $|\overline{A} \cap \overline{B} \cap \overline{C}|$ given in the second formula following Example 3.26.]
Figure 8.2 [Venn diagram: circles for $c_1$, $c_2$, and $c_3$ inside $S$]
EXAMPLE 8.3
Based on the results in the previous two examples we may now feel that for a given finite set $S$ (with $|S| = N$) and four conditions $c_1, c_2, c_3, c_4$ we should have
$$\begin{aligned} N(\overline{c}_1\overline{c}_2\overline{c}_3\overline{c}_4) = {} & N - [N(c_1) + N(c_2) + N(c_3) + N(c_4)] \\ & + [N(c_1c_2) + N(c_1c_3) + N(c_1c_4) + N(c_2c_3) + N(c_2c_4) + N(c_3c_4)] \\ & - [N(c_1c_2c_3) + N(c_1c_2c_4) + N(c_1c_3c_4) + N(c_2c_3c_4)] \\ & + N(c_1c_2c_3c_4). \end{aligned} \tag{*}$$
To show that this is the case we consider an arbitrary element $x$ from $S$ and show that it is counted the same number of times on both sides of the above equation.
0) If $x$ satisfies none of the four conditions, then it is counted once on the left side of Eq. (*) [in $N(\overline{c}_1\overline{c}_2\overline{c}_3\overline{c}_4)$], and once on the right side of Eq. (*) [in $N$].
1) If $x$ satisfies only one of the conditions, say $c_1$, then it is not counted at all on the left side of Eq. (*). But on the right side of Eq. (*), $x$ is counted once in $N$ and once in $N(c_1)$, for a total of $1 - 1 = 0$ times.
2) Now suppose that $x$ satisfies conditions $c_2$, $c_4$ but does not satisfy conditions $c_1$, $c_3$. Once again $x$ is not counted on the left side of Eq. (*). For the right side of Eq. (*), $x$ is counted once in $N$, once in each of $N(c_2)$ and $N(c_4)$, and then once in $N(c_2c_4)$, totaling $1 - [1 + 1] + 1 = 1 - \binom{2}{1} + \binom{2}{2} = 0$ times.
3) Continuing with the case for three conditions, we'll suppose here that $x$ satisfies conditions $c_1$, $c_2$, and $c_4$, but not $c_3$. As in the previous two cases, $x$ is not counted on the left side of Eq. (*). On the right side of Eq. (*), $x$ is counted once in $N$, once in each of $N(c_1)$, $N(c_2)$, and $N(c_4)$, once in each of $N(c_1c_2)$, $N(c_1c_4)$, and $N(c_2c_4)$, and, finally, once in $N(c_1c_2c_4)$. So on the right side of Eq. (*), $x$ is counted $1 - [1 + 1 + 1] + [1 + 1 + 1] - 1 = 1 - \binom{3}{1} + \binom{3}{2} - \binom{3}{3} = 0$ times, in total.
4) Finally, if $x$ satisfies all four of the conditions $c_1$, $c_2$, $c_3$, $c_4$, then once again it is not counted on the left side of Eq. (*). On the right side of Eq. (*), $x$ is counted once for each of the 16 terms on the right side of this equation, for a total of $1 - [1 + 1 + 1 + 1] + [1 + 1 + 1 + 1 + 1 + 1] - [1 + 1 + 1 + 1] + 1 = 1 - \binom{4}{1} + \binom{4}{2} - \binom{4}{3} + \binom{4}{4} = 0$ times.
Consequently, from these preceding five cases we have shown that the two sides of Eq. (*) count the same elements from $S$, and this provides a combinatorial proof for the formula for $N(\overline{c}_1\overline{c}_2\overline{c}_3\overline{c}_4)$.
So now we shall reconsider the situation in Example 8.2 and introduce a fourth condition as follows:
$c_4$: A student at Central College is among the 100 students in the freshman engineering program and is enrolled in Introduction to Design.
We already know that $N(c_1) = 35$, $N(c_2) = 30$, $N(c_3) = 30$, $N(c_1c_2) = 9$, $N(c_1c_3) = 11$, $N(c_2c_3) = 10$, and $N(c_1c_2c_3) = 5$. If $N(c_4) = 41$, $N(c_1c_4) = 13$, $N(c_2c_4) = 14$, $N(c_3c_4) = 10$, $N(c_1c_2c_4) = 6$, $N(c_1c_3c_4) = 6$, $N(c_2c_3c_4) = 6$, and $N(c_1c_2c_3c_4) = 4$, then, using the equation we derived above, it follows that $N(\overline{c}_1\overline{c}_2\overline{c}_3\overline{c}_4) = 100 - [35 + 30 + 30 + 41] + [9 + 11 + 13 + 10 + 14 + 10] - [5 + 6 + 6 + 6] + 4 = 100 - 136 + 67 - 23 + 4 = 12$. Thus, of the 100 students in the freshman engineering program at Central College, there are 12 who are not taking any of the four courses: Freshman Composition, Introduction to Economics, Fundamentals of Computer Programming, or Introduction to Design.
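If we are interested in the number (from these 100 students) who are taking Freshman Composition, but none of the other three courses, then we should want to compute $N(c_1\overline{c}_2\overline{c}_3\overline{c}_4)$. To do so we start by observing that
$$N(\overline{c}_2\overline{c}_3\overline{c}_4) = N(c_1\overline{c}_2\overline{c}_3\overline{c}_4) + N(\overline{c}_1\overline{c}_2\overline{c}_3\overline{c}_4),$$
which can be established by an argument similar to the one above for $N(\overline{c}_1\overline{c}_2\overline{c}_3\overline{c}_4)$. This then leads us to
$$N(c_1\overline{c}_2\overline{c}_3\overline{c}_4) = N(\overline{c}_2\overline{c}_3\overline{c}_4) - N(\overline{c}_1\overline{c}_2\overline{c}_3\overline{c}_4).$$
Using the result in Example 8.2 we find that
$$\begin{aligned} N(\overline{c}_2\overline{c}_3\overline{c}_4) &= N - [N(c_2) + N(c_3) + N(c_4)] + [N(c_2c_3) + N(c_2c_4) + N(c_3c_4)] - N(c_2c_3c_4) \\ &= 100 - [30 + 30 + 41] + [10 + 14 + 10] - 6 = 27, \end{aligned}$$
and
$$N(c_1\overline{c}_2\overline{c}_3\overline{c}_4) = N(\overline{c}_2\overline{c}_3\overline{c}_4) - N(\overline{c}_1\overline{c}_2\overline{c}_3\overline{c}_4) = 27 - 12 = 15.$$
So there are 15 students in this set of 100 who are taking Freshman Composition, but none of the other courses: Introduction to Economics, Fundamentals of Computer Programming, or Introduction to Design.
Further, we also observe that
$$\begin{aligned} N(c_1\overline{c}_2\overline{c}_3\overline{c}_4) &= N(\overline{c}_2\overline{c}_3\overline{c}_4) - N(\overline{c}_1\overline{c}_2\overline{c}_3\overline{c}_4) \\ &= \{N - [N(c_2) + N(c_3) + N(c_4)] + [N(c_2c_3) + N(c_2c_4) + N(c_3c_4)] - N(c_2c_3c_4)\} \\ &\quad - \{N - [N(c_1) + N(c_2) + N(c_3) + N(c_4)] \\ &\quad + [N(c_1c_2) + N(c_1c_3) + N(c_1c_4) + N(c_2c_3) + N(c_2c_4) + N(c_3c_4)] \\ &\quad - [N(c_1c_2c_3) + N(c_1c_2c_4) + N(c_1c_3c_4) + N(c_2c_3c_4)] + N(c_1c_2c_3c_4)\}, \end{aligned}$$
or
$$N(c_1\overline{c}_2\overline{c}_3\overline{c}_4) = N(c_1) - [N(c_1c_2) + N(c_1c_3) + N(c_1c_4)] + [N(c_1c_2c_3) + N(c_1c_2c_4) + N(c_1c_3c_4)] - N(c_1c_2c_3c_4).$$
So here $N(c_1\overline{c}_2\overline{c}_3\overline{c}_4) = 35 - [9 + 11 + 13] + [5 + 6 + 6] - 4 = 35 - 33 + 17 - 4 = 15$, as we found above.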
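The arithmetic in Examples 8.2 and 8.3 can be replayed mechanically. The following is a minimal sketch, assuming the enrollment counts are stored in a dictionary keyed by sets of condition indices; the function name `none_of` and the dictionary layout are illustrative choices, not notation from the text.

```python
from itertools import combinations


def none_of(total, counts, conditions):
    """Number of elements satisfying none of `conditions`, by inclusion-exclusion.

    counts maps a frozenset of condition indices to the number of elements
    satisfying all of those conditions (and perhaps others).
    """
    result = total
    for k in range(1, len(conditions) + 1):
        s_k = sum(counts[frozenset(sel)] for sel in combinations(conditions, k))
        result += (-1) ** k * s_k
    return result


N = 100
counts = {frozenset(key): value for key, value in {
    (1,): 35, (2,): 30, (3,): 30, (4,): 41,
    (1, 2): 9, (1, 3): 11, (1, 4): 13, (2, 3): 10, (2, 4): 14, (3, 4): 10,
    (1, 2, 3): 5, (1, 2, 4): 6, (1, 3, 4): 6, (2, 3, 4): 6,
    (1, 2, 3, 4): 4,
}.items()}

none_of_three = none_of(N, counts, [1, 2, 3])      # Example 8.2
none_of_four = none_of(N, counts, [1, 2, 3, 4])    # Example 8.3
# Composition only: none of c2, c3, c4, minus those who also fail c1.
only_c1 = none_of(N, counts, [2, 3, 4]) - none_of_four

print(none_of_three, none_of_four, only_c1)
```

The last line mirrors the subtraction $N(\overline{c}_2\overline{c}_3\overline{c}_4) - N(\overline{c}_1\overline{c}_2\overline{c}_3\overline{c}_4)$ used in the text.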
Having seen the results in Examples 8.1, 8.2, and 8.3, now it is time for us to generalize these results and establish the Principle of Inclusion and Exclusion. To do so we once again let $S$ be a set with $|S| = N$, and we let $c_1, c_2, \ldots, c_t$ be a collection of $t$ conditions or properties, each of which may be satisfied by some of the elements of $S$. Some elements of $S$ may satisfy more than one of the conditions, whereas others may not satisfy any of them. For all $1 \le i \le t$, $N(c_i)$ will denote the number of elements in $S$ that satisfy condition $c_i$. (Elements of $S$ are counted here when they satisfy only condition $c_i$, as well as when they satisfy $c_i$ and other conditions $c_j$, for $j \ne i$.) For all $i, j \in \{1, 2, 3, \ldots, t\}$ where $i \ne j$, $N(c_ic_j)$ will denote the number of elements in $S$ that satisfy both of the conditions $c_i$, $c_j$, and perhaps some others. [$N(c_ic_j)$ is not restricted to the elements of $S$ that satisfy only $c_i$, $c_j$.] Continuing, if $1 \le i, j, k \le t$ are three distinct integers, then $N(c_ic_jc_k)$ denotes the number of elements in $S$ satisfying, perhaps among others, each of the conditions $c_i$, $c_j$, and $c_k$.
For each $1 \le i \le t$, $N(\overline{c}_i) = N - N(c_i)$ denotes the number of elements in $S$ that do not satisfy condition $c_i$. If $1 \le i, j \le t$ with $i \ne j$, $N(\overline{c}_i\overline{c}_j)$ = the number of elements in $S$ that do not satisfy either of the conditions $c_i$ or $c_j$. [This is not the same as $N(\overline{c_ic_j})$, as we observed at the end of Example 8.1.]
With the necessary preliminaries now in hand we state the following theorem.
THEOREM 8.1 The Principle of Inclusion and Exclusion. Consider a set $S$, with $|S| = N$, and conditions $c_i$, $1 \le i \le t$, each of which may be satisfied by some of the elements of $S$. The number of elements of $S$ that satisfy none of the conditions $c_i$, $1 \le i \le t$, is denoted by $\overline{N} = N(\overline{c}_1\overline{c}_2\overline{c}_3 \cdots \overline{c}_t)$ where
$$\begin{aligned} \overline{N} = {} & N - [N(c_1) + N(c_2) + N(c_3) + \cdots + N(c_t)] \\ & + [N(c_1c_2) + N(c_1c_3) + \cdots + N(c_1c_t) + N(c_2c_3) + \cdots + N(c_{t-1}c_t)] \\ & - [N(c_1c_2c_3) + N(c_1c_2c_4) + \cdots + N(c_1c_2c_t) + N(c_1c_3c_4) + \cdots + N(c_1c_3c_t) + \cdots + N(c_{t-2}c_{t-1}c_t)] \\ & + \cdots + (-1)^t N(c_1c_2c_3 \cdots c_t), \end{aligned} \tag{1}$$
or
$$\overline{N} = N - \sum_{1 \le i \le t} N(c_i) + \sum_{1 \le i < j \le t} N(c_ic_j) - \sum_{1 \le i < j < k \le t} N(c_ic_jc_k) + \cdots + (-1)^t N(c_1c_2c_3 \cdots c_t). \tag{2}$$
Proof: Although this result can be established by applying the Principle of Mathematical Induction to the number $t$ of conditions, we shall give a combinatorial proof. The argument will be reminiscent of the ideas we saw in Example 8.3 in establishing the formula for $N(\overline{c}_1\overline{c}_2\overline{c}_3\overline{c}_4)$.
For each $x \in S$ we show that $x$ contributes the same count, either 0 or 1, to each side of Eq. (2).
If $x$ satisfies none of the conditions, then $x$ is counted once in $\overline{N}$ and once in $N$, but not in any of the other terms in Eq. (2). Consequently, $x$ contributes a count of 1 to each side of the equation.
The other possibility is that $x$ satisfies exactly $r$ of the conditions where $1 \le r \le t$. In this case $x$ contributes nothing to $\overline{N}$. But on the right-hand side of Eq. (2), $x$ is counted
(1) One time in $N$.
(2) $r$ times in $\sum_{1 \le i \le t} N(c_i)$. (Once for each of the $r$ conditions.)
(3) $\binom{r}{2}$ times in $\sum_{1 \le i < j \le t} N(c_ic_j)$. (Once for each pair of conditions selected from the $r$ conditions it satisfies.)
(4) $\binom{r}{3}$ times in $\sum_{1 \le i < j < k \le t} N(c_ic_jc_k)$. (Why?)
...
($r$ + 1) $\binom{r}{r} = 1$ time in $\sum N(c_{i_1}c_{i_2} \cdots c_{i_r})$, where the summation is taken over all selections of size $r$ from the $t$ conditions.
Consequently, on the right-hand side of Eq. (2), $x$ is counted
$$1 - r + \binom{r}{2} - \binom{r}{3} + \cdots + (-1)^r \binom{r}{r} = [1 + (-1)]^r = 0^r = 0 \text{ times},$$
by the binomial theorem. Therefore, the two sides of Eq. (2) count the same elements from $S$, and the equality is verified.
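The proof hinges on the alternating binomial sum vanishing whenever $r \ge 1$. As a quick numerical check (a sketch only; the function name `net_count` is our own), we can tabulate the net number of times the right-hand side of Eq. (2) counts an element that satisfies exactly $r$ conditions.

```python
from math import comb


def net_count(r):
    """Net count, on the right-hand side of Eq. (2), of an element that
    satisfies exactly r conditions: sum over k of (-1)^k * C(r, k)."""
    return sum((-1) ** k * comb(r, k) for k in range(r + 1))


net_counts = [net_count(r) for r in range(11)]
print(net_counts)  # r = 0 gives 1; every r >= 1 gives 0
```

The single nonzero entry, at $r = 0$, corresponds to the elements that satisfy none of the conditions, which are exactly the ones counted by $\overline{N}$.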
An immediate corollary of this principle is given as follows:
COROLLARY 8.1 Under the hypotheses of Theorem 8.1, the number of elements in $S$ that satisfy at least one of the conditions $c_i$, where $1 \le i \le t$, is given by $N(c_1 \text{ or } c_2 \text{ or } \ldots \text{ or } c_t) = N - \overline{N}$.
Before solving some examples, we examine some further notation for simplifying the
statement of Theorem 8.1.
We write
$$\begin{aligned} S_0 &= N, \\ S_1 &= [N(c_1) + N(c_2) + \cdots + N(c_t)], \\ S_2 &= [N(c_1c_2) + N(c_1c_3) + \cdots + N(c_1c_t) + N(c_2c_3) + \cdots + N(c_{t-1}c_t)], \end{aligned}$$
and, in general,
$$S_k = \sum N(c_{i_1}c_{i_2} \cdots c_{i_k}), \quad 1 \le k \le t,$$
where the summation is taken over all selections of size $k$ from the collection of $t$ conditions. Hence $S_k$ has $\binom{t}{k}$ summands in it.
Using this notation we can rewrite the result in Eq. (2) as
$$\overline{N} = S_0 - S_1 + S_2 - S_3 + \cdots + (-1)^t S_t.$$
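The sums $S_k$ translate directly into code when the conditions are given as predicates on the elements of $S$. The sketch below is an illustration under our own assumptions: the names `s_k` and `n_bar`, the universe $\{1, \ldots, 60\}$, and the three sample conditions are all hypothetical and chosen only to exercise the formula against a direct count.

```python
from itertools import combinations


def s_k(universe, conditions, k):
    """S_k: over every selection of k conditions, add the number of elements
    of `universe` that satisfy all conditions in the selection.
    For k = 0 the only selection is empty, so S_0 = len(universe)."""
    return sum(
        sum(1 for x in universe if all(cond(x) for cond in chosen))
        for chosen in combinations(conditions, k)
    )


def n_bar(universe, conditions):
    """Elements satisfying none of the conditions: S_0 - S_1 + S_2 - ..."""
    t = len(conditions)
    return sum((-1) ** k * s_k(universe, conditions, k) for k in range(t + 1))


# Hypothetical example: S = {1, ..., 60} with three unrelated conditions.
universe = range(1, 61)
conditions = [
    lambda x: x % 2 == 0,                  # x is even
    lambda x: int(x ** 0.5) ** 2 == x,     # x is a perfect square
    lambda x: x > 50,                      # x exceeds 50
]

by_formula = n_bar(universe, conditions)
direct = sum(1 for x in universe if not any(cond(x) for cond in conditions))
print(by_formula, direct)
```

Evaluating every $S_k$ this way takes on the order of $2^t \cdot |S|$ predicate tests, so it is useful for checking small cases rather than as a practical counting method.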
Now let us look at how this principle is used to solve certain enumeration problems.
EXAMPLE 8.4
Determine the number of positive integers $n$ where $1 \le n \le 100$ and $n$ is not divisible by 2, 3, or 5.
Here $S = \{1, 2, 3, \ldots, 100\}$ and $N = 100$. For $n \in S$, $n$ satisfies
a) condition $c_1$ if $n$ is divisible by 2,
b) condition $c_2$ if $n$ is divisible by 3, and
c) condition $c_3$ if $n$ is divisible by 5.
Then the answer to this problem is $N(\overline{c}_1\overline{c}_2\overline{c}_3)$.
As in Section 5.2 we use the notation $\lfloor r \rfloor$ to denote the greatest integer less than or equal to $r$, for any real number $r$. This function proves to be helpful in this problem as we find that
$N(c_1) = \lfloor 100/2 \rfloor = 50$ [since the 50 ($= \lfloor 100/2 \rfloor$) positive integers 2, 4, 6, 8, ..., 96, 98 ($= 2 \cdot 49$), 100 ($= 2 \cdot 50$) are divisible by 2];
$N(c_2) = \lfloor 100/3 \rfloor = \lfloor 33\tfrac{1}{3} \rfloor = 33$ [since the 33 ($= \lfloor 100/3 \rfloor$) positive integers 3, 6, 9, 12, ..., 96 ($= 3 \cdot 32$), 99 ($= 3 \cdot 33$) are divisible by 3];
$N(c_3) = \lfloor 100/5 \rfloor = 20$;
$N(c_1c_2) = \lfloor 100/6 \rfloor = 16$ [since there are 16 ($= \lfloor 100/6 \rfloor$) elements in $S$ that are divisible by both 2 and 3, hence divisible by lcm$(2, 3) = 2 \cdot 3 = 6$];
$N(c_1c_3) = \lfloor 100/10 \rfloor = 10$;
$N(c_2c_3) = \lfloor 100/15 \rfloor = 6$; and
$N(c_1c_2c_3) = \lfloor 100/30 \rfloor = 3$.
Applying the Principle of Inclusion and Exclusion, we find that
$$\begin{aligned} N(\overline{c}_1\overline{c}_2\overline{c}_3) &= S_0 - S_1 + S_2 - S_3 = N - [N(c_1) + N(c_2) + N(c_3)] + [N(c_1c_2) + N(c_1c_3) + N(c_2c_3)] - N(c_1c_2c_3) \\ &= 100 - [50 + 33 + 20] + [16 + 10 + 6] - 3 = 26. \end{aligned}$$
(These 26 numbers are 1, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 49, 53, 59, 61, 67, 71, 73, 77, 79, 83, 89, 91, and 97.)
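The floor-function counts of Example 8.4 can be generated for any list of divisors. Here is a minimal sketch; the helper names `lcm_all` and `count_not_divisible` are hypothetical, and a brute-force count is included only as a cross-check.

```python
from itertools import combinations
from math import gcd


def lcm_all(numbers):
    """Least common multiple of a (possibly empty) collection; empty gives 1."""
    result = 1
    for d in numbers:
        result = result * d // gcd(result, d)
    return result


def count_not_divisible(limit, divisors):
    """Integers in 1..limit divisible by none of `divisors`, using
    N(selection) = floor(limit / lcm(selection)) in S_0 - S_1 + S_2 - ..."""
    total = 0
    for k in range(len(divisors) + 1):
        for chosen in combinations(divisors, k):
            total += (-1) ** k * (limit // lcm_all(chosen))
    return total


by_formula = count_not_divisible(100, [2, 3, 5])
survivors = [n for n in range(1, 101) if all(n % d != 0 for d in (2, 3, 5))]
print(by_formula, len(survivors))
```

Using the least common multiple, rather than the plain product, keeps the count correct even when the divisors are not pairwise relatively prime.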
EXAMPLE 8.5
In Chapter 1 we found the number of nonnegative integer solutions to the equation $x_1 + x_2 + x_3 + x_4 = 18$. We now answer the same question with the extra restriction that $x_i \le 7$, for all $1 \le i \le 4$.
Here $S$ is the set of solutions of $x_1 + x_2 + x_3 + x_4 = 18$, with $0 \le x_i$ for all $1 \le i \le 4$. So $|S| = N = S_0 = \binom{4 + 18 - 1}{18} = \binom{21}{18}$.
We say that a solution $x_1, x_2, x_3, x_4$ satisfies condition $c_i$, where $1 \le i \le 4$, if $x_i > 7$ (or $x_i \ge 8$). The answer to the problem is then $N(\overline{c}_1\overline{c}_2\overline{c}_3\overline{c}_4)$.
Here by symmetry $N(c_1) = N(c_2) = N(c_3) = N(c_4)$. To compute $N(c_1)$, we consider the integer solutions for $x_1 + x_2 + x_3 + x_4 = 10$, with each $x_i \ge 0$ for all $1 \le i \le 4$. Then we add 8 to the value of $x_1$ and get the solutions of $x_1 + x_2 + x_3 + x_4 = 18$ that satisfy condition $c_1$. Hence $N(c_i) = \binom{4 + 10 - 1}{10} = \binom{13}{10}$, for each $1 \le i \le 4$, and $S_1 = \binom{4}{1}\binom{13}{10}$.
Likewise, $N(c_1c_2)$ is the number of integer solutions of $x_1 + x_2 + x_3 + x_4 = 2$, where $x_i \ge 0$ for all $1 \le i \le 4$. So $N(c_1c_2) = \binom{4 + 2 - 1}{2} = \binom{5}{2}$, and $S_2 = \binom{4}{2}\binom{5}{2}$.
Since $N(c_ic_jc_k) = 0$ for every selection of three conditions, and $N(c_1c_2c_3c_4) = 0$, we have
$$N(\overline{c}_1\overline{c}_2\overline{c}_3\overline{c}_4) = S_0 - S_1 + S_2 - S_3 + S_4 = \binom{21}{18} - \binom{4}{1}\binom{13}{10} + \binom{4}{2}\binom{5}{2} - 0 + 0 = 246.$$
So of the 1330 nonnegative integer solutions of $x_1 + x_2 + x_3 + x_4 = 18$, only 246 of them satisfy $x_i \le 7$ for each $1 \le i \le 4$.
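The same computation works for any total, any number of variables, and any common upper bound. The sketch below is an illustration, not part of the text: the function name `bounded_solutions` is our own, and the exhaustive enumeration is there only to confirm the count for this small case.

```python
from itertools import product
from math import comb


def bounded_solutions(total, num_vars, upper):
    """Solutions of x_1 + ... + x_num_vars = total with 0 <= x_i <= upper.

    Forcing k chosen variables to be at least upper + 1 leaves
    total - k * (upper + 1) to distribute freely among all the variables.
    """
    count = 0
    for k in range(num_vars + 1):
        remaining = total - k * (upper + 1)
        if remaining < 0:
            break  # S_k = 0 from here on
        count += ((-1) ** k * comb(num_vars, k)
                  * comb(num_vars + remaining - 1, remaining))
    return count


by_formula = bounded_solutions(18, 4, 7)
brute = sum(1 for xs in product(range(8), repeat=4) if sum(xs) == 18)
print(by_formula, brute)
```

The early exit corresponds to the observation in the example that $S_3$ and $S_4$ are both 0.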
Our next example establishes the formula conjectured in Section 5.3 for counting onto
functions.
EXAMPLE 8.6
For finite sets $A$, $B$, where $|A| = m \ge n = |B|$, let $A = \{a_1, a_2, \ldots, a_m\}$, $B = \{b_1, b_2, \ldots, b_n\}$, and $S$ = the set of all functions $f: A \to B$. Then $N = S_0 = |S| = n^m$.
For all $1 \le i \le n$, let $c_i$ denote the condition on $S$ where a function $f: A \to B$ satisfies $c_i$ if $b_i$ is not in the range of $f$. (Note the difference between $c_i$ here and $c_i$ in Examples 8.4 and 8.5.) Then $N(\overline{c}_i)$ is the number of functions in $S$ that have $b_i$ in their range, and $N(\overline{c}_1\overline{c}_2 \cdots \overline{c}_n)$ counts the number of onto functions $f: A \to B$.
For all $1 \le i \le n$, $N(c_i) = (n - 1)^m$, because each element of $B$, except $b_i$, can be used as the second component of an ordered pair for a function $f: A \to B$, whose range does not include $b_i$. Likewise, for all $1 \le i < j \le n$, there are $(n - 2)^m$ functions $f: A \to B$ whose range contains neither $b_i$ nor $b_j$. From these observations we have $S_1 = [N(c_1) + N(c_2) + \cdots + N(c_n)] = n(n - 1)^m = \binom{n}{1}(n - 1)^m$, and $S_2 = [N(c_1c_2) + N(c_1c_3) + \cdots + N(c_1c_n) + N(c_2c_3) + \cdots + N(c_2c_n) + \cdots + N(c_{n-1}c_n)] = \binom{n}{2}(n - 2)^m$. In general, for each $1 \le k \le n$,
$$S_k = \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} N(c_{i_1}c_{i_2} \cdots c_{i_k}) = \binom{n}{k}(n - k)^m.$$
It then follows by the Principle of Inclusion and Exclusion that the number of onto functions from $A$ to $B$ is
$$\begin{aligned} N(\overline{c}_1\overline{c}_2\overline{c}_3 \cdots \overline{c}_n) &= S_0 - S_1 + S_2 - S_3 + \cdots + (-1)^n S_n \\ &= n^m - \binom{n}{1}(n - 1)^m + \binom{n}{2}(n - 2)^m - \binom{n}{3}(n - 3)^m + \cdots + (-1)^n (n - n)^m \\ &= \sum_{i=0}^{n} (-1)^i \binom{n}{i}(n - i)^m = \sum_{i=0}^{n} (-1)^i \binom{n}{n - i}(n - i)^m. \end{aligned}$$
Before we finish discussing this example, let us note that
$$\sum_{i=0}^{n} (-1)^i \binom{n}{n - i}(n - i)^m$$
can also be evaluated even if $m < n$. Furthermore, for $m < n$, the expression $N(\overline{c}_1\overline{c}_2\overline{c}_3 \cdots \overline{c}_n)$ still counts the number of functions $f: A \to B$, where $|A| = m$, $|B| = n$, and each element of $B$ is in the range of $f$. But now this number is 0.
For example, suppose that $m = 3 < 7 = n$. Then $N(\overline{c}_1\overline{c}_2\overline{c}_3 \cdots \overline{c}_7)$ counts the number of onto functions $f: A \to B$ for $|A| = 3$ and $|B| = 7$. We know this number is 0, and we also find that
$$\begin{aligned} \sum_{i=0}^{7} (-1)^i \binom{7}{7 - i}(7 - i)^3 &= \binom{7}{7}7^3 - \binom{7}{6}6^3 + \binom{7}{5}5^3 - \binom{7}{4}4^3 + \binom{7}{3}3^3 - \binom{7}{2}2^3 + \binom{7}{1}1^3 - \binom{7}{0}0^3 \\ &= 343 - 1512 + 2625 - 2240 + 945 - 168 + 7 - 0 = 0. \end{aligned}$$
Hence, for all $m, n \in \mathbf{Z}^+$, if $m < n$, then
$$\sum_{i=0}^{n} (-1)^i \binom{n}{n - i}(n - i)^m = 0.$$
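The onto-function formula is easy to test against exhaustive enumeration for small $m$ and $n$. This is a sketch under our own naming (`onto_count`, `onto_brute`); a function $f: A \to B$ is modeled as a tuple of $m$ values drawn from `range(n)`.

```python
from itertools import product
from math import comb


def onto_count(m, n):
    """Number of onto functions from an m-set to an n-set (m, n >= 1):
    sum over i of (-1)^i * C(n, i) * (n - i)^m."""
    return sum((-1) ** i * comb(n, i) * (n - i) ** m for i in range(n + 1))


def onto_brute(m, n):
    """Enumerate all n^m functions and keep those whose range is all of B."""
    return sum(1 for f in product(range(n), repeat=m) if len(set(f)) == n)


print(onto_count(4, 3), onto_brute(4, 3))   # a case with m >= n
print(onto_count(3, 7))                     # the m = 3 < 7 = n case
```

For $m < n$ the enumeration finds no function whose range is all of $B$, in agreement with the vanishing sum noted at the end of the example.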
We now solve a problem similar to those in Chapter 3 that dealt with Venn diagrams.
EXAMPLE 8.7
In how many ways can the 26 letters of the alphabet be permuted so that none of the patterns car, dog, pun, or byte occurs?
Let $S$ denote the set of all permutations of the 26 letters. Then $|S| = 26!$ For each $1 \le i \le 4$, a permutation in $S$ is said to satisfy condition $c_i$ if the permutation contains the pattern car, dog, pun, or byte, respectively.
In order to compute $N(c_1)$, for example, we count the number of ways the 24 symbols car, b, d, e, f, ..., p, q, s, t, ..., x, y, z can be permuted. So $N(c_1) = 24!$, and in a similar way we obtain
$$N(c_2) = N(c_3) = 24!, \quad \text{while} \quad N(c_4) = 23!$$
For $N(c_1c_2)$ we deal with the 22 symbols car, dog, b, e, f, h, i, ..., m, n, p, q, s, t, ..., x, y, z, which can be permuted in 22! ways. Hence $N(c_1c_2) = 22!$, and comparable calculations give
$$N(c_1c_3) = N(c_2c_3) = 22!, \quad N(c_ic_4) = 21!, \quad i \ne 4.$$
Furthermore,
$$N(c_1c_2c_3) = 20!, \quad N(c_ic_jc_4) = 19!, \quad 1 \le i < j \le 3,$$
$$N(c_1c_2c_3c_4) = 17!$$
So the number of permutations in $S$ that contain none of the given patterns is
$$N(\overline{c}_1\overline{c}_2\overline{c}_3\overline{c}_4) = 26! - [3(24!) + 23!] + [3(22!) + 3(21!)] - [20! + 3(19!)] + 17!$$
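The counting rule used here is that each chosen pattern collapses into a single symbol. The sketch below encodes that rule; the function name `avoid_patterns` is hypothetical, and the code assumes, as in this example, that no letter appears in more than one pattern. A six-letter alphabet with the made-up patterns "ab" and "cde" serves as a brute-force check of the method.

```python
from itertools import combinations, permutations
from math import factorial


def avoid_patterns(alphabet_size, patterns):
    """Permutations of the alphabet that contain none of the patterns.

    Assumes the patterns share no letters, so any selection of k patterns
    can occur together and leaves alphabet_size - (letters used) + k symbols.
    """
    total = 0
    for k in range(len(patterns) + 1):
        for chosen in combinations(patterns, k):
            symbols = alphabet_size - sum(len(p) for p in chosen) + k
            total += (-1) ** k * factorial(symbols)
    return total


answer = avoid_patterns(26, ["car", "dog", "pun", "byte"])

# Small hypothetical check: letters a-f, forbidden patterns "ab" and "cde".
small_formula = avoid_patterns(6, ["ab", "cde"])
small_brute = sum(
    1 for p in permutations("abcdef")
    if "ab" not in "".join(p) and "cde" not in "".join(p)
)
print(answer)
print(small_formula, small_brute)
```

If two patterns shared a letter, some selections could not occur together (or could overlap), and the symbol count in the inner loop would no longer be valid.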
Our next example deals with a number theory problem.
EXAMPLE 8.8
For $n \in \mathbf{Z}^+$, $n \ge 2$, let $\phi(n)$ be the number of positive integers $m$, where $1 \le m < n$ and $\gcd(m, n) = 1$, that is, $m$, $n$ are relatively prime. This function is known as Euler's phi function, and it arises in several situations in abstract algebra involving enumeration. We find that $\phi(2) = 1$, $\phi(3) = 2$, $\phi(4) = 2$, $\phi(5) = 4$, and $\phi(6) = 2$. For each prime $p$, $\phi(p) = p - 1$. We would like to derive a formula for $\phi(n)$ that is related to $n$ so that we need not make a case-by-case comparison for each $m$, $1 \le m < n$, against the integer $n$.
The derivation of our formula will use the Principle of Inclusion and Exclusion as in Example 8.4. We proceed as follows: For $n \ge 2$, use the Fundamental Theorem of Arithmetic to write $n = p_1^{e_1} p_2^{e_2} \cdots p_t^{e_t}$, where $p_1, p_2, \ldots, p_t$ are distinct primes and $e_i \ge 1$, for all $1 \le i \le t$. We consider the case where $t = 4$. This will be enough to demonstrate the general idea.
With $S = \{1, 2, 3, \ldots, n\}$, we have $N = S_0 = |S| = n$, and for each $1 \le i \le 4$ we say that $k \in S$ satisfies condition $c_i$ if $k$ is divisible by $p_i$. For $1 \le k < n$, $\gcd(k, n) = 1$ if $k$ is not divisible by any of the primes $p_i$, where $1 \le i \le 4$. Hence $\phi(n) = N(\bar{c}_1\bar{c}_2\bar{c}_3\bar{c}_4)$.
For each $1 \le i \le 4$, we have $N(c_i) = n/p_i$; $N(c_ic_j) = n/(p_ip_j)$, for all $1 \le i < j \le 4$. Also, $N(c_ic_jc_\ell) = n/(p_ip_jp_\ell)$, for all $1 \le i < j < \ell \le 4$, and $N(c_1c_2c_3c_4) = n/(p_1p_2p_3p_4)$. So
\[
\begin{aligned}
\phi(n) &= S_0 - S_1 + S_2 - S_3 + S_4 \\
&= n - \left[\frac{n}{p_1} + \cdots + \frac{n}{p_4}\right] + \left[\frac{n}{p_1p_2} + \frac{n}{p_1p_3} + \cdots + \frac{n}{p_3p_4}\right] - \left[\frac{n}{p_1p_2p_3} + \cdots + \frac{n}{p_2p_3p_4}\right] + \frac{n}{p_1p_2p_3p_4} \\
&= n\left[1 - \left(\frac{1}{p_1} + \cdots + \frac{1}{p_4}\right) + \left(\frac{1}{p_1p_2} + \frac{1}{p_1p_3} + \cdots + \frac{1}{p_3p_4}\right) - \left(\frac{1}{p_1p_2p_3} + \cdots + \frac{1}{p_2p_3p_4}\right) + \frac{1}{p_1p_2p_3p_4}\right] \\
&= \frac{n}{p_1p_2p_3p_4}\bigl[p_1p_2p_3p_4 - (p_2p_3p_4 + p_1p_3p_4 + p_1p_2p_4 + p_1p_2p_3) \\
&\qquad\qquad + (p_3p_4 + p_2p_4 + p_2p_3 + p_1p_4 + p_1p_3 + p_1p_2) - (p_4 + p_3 + p_2 + p_1) + 1\bigr] \\
&= \frac{n}{p_1p_2p_3p_4}\bigl[(p_1 - 1)(p_2 - 1)(p_3 - 1)(p_4 - 1)\bigr] \\
&= n\left[\frac{p_1 - 1}{p_1}\cdot\frac{p_2 - 1}{p_2}\cdot\frac{p_3 - 1}{p_3}\cdot\frac{p_4 - 1}{p_4}\right] = n\prod_{i=1}^{4}\left(1 - \frac{1}{p_i}\right).
\end{aligned}
\]
8.1 The Principle of Inclusion and Exclusion 395
In general, $\phi(n) = n\prod_{p \mid n}(1 - (1/p))$, where the product is taken over all primes $p$ dividing $n$. When $n = p$, a prime, $\phi(n) = \phi(p) = p[1 - (1/p)] = p - 1$, as we observed earlier. If $n = 23{,}100$, for example, we find that
\[
\begin{aligned}
\phi(23{,}100) &= \phi(2^2 \cdot 3 \cdot 5^2 \cdot 7 \cdot 11) \\
&= (23{,}100)(1 - (1/2))(1 - (1/3))(1 - (1/5))(1 - (1/7))(1 - (1/11)) \\
&= 4800.
\end{aligned}
\]
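The product formula translates directly into a short routine. This is a minimal Python sketch (trial division is our choice of factoring method, and the function names are ours); it also counts by the definition so the two can be compared.

```python
from math import gcd


def phi(n: int) -> int:
    """Euler's phi via n * prod(1 - 1/p) over the primes p dividing n."""
    result, remaining, p = n, n, 2
    while p * p <= remaining:
        if remaining % p == 0:
            result -= result // p          # multiply result by (1 - 1/p)
            while remaining % p == 0:
                remaining //= p
        p += 1
    if remaining > 1:                      # one prime factor above the square root is left
        result -= result // remaining
    return result


def phi_by_definition(n: int) -> int:
    """Count 1 <= m < n with gcd(m, n) = 1 (the definition used for n >= 2)."""
    return sum(1 for m in range(1, n) if gcd(m, n) == 1)


if __name__ == "__main__":
    print([phi(n) for n in range(2, 7)])
    print(phi(23100))
```

Working with integer subtraction (`result -= result // p`) avoids floating-point error that a literal product of fractions would introduce.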
The Euler phi function has many interesting properties. We shall investigate some of
them in the exercises for this section and in the Supplementary Exercises.
The next example provides another encounter with the circular arrangements introduced
in Chapter 1.
EXAMPLE 8.9
Six married couples are to be seated at a circular table. In how many ways can they arrange themselves so that no wife sits next to her husband? (Here, as in Example 1.16, two seating arrangements are considered the same if one is a rotation of the other.)
For $1 \le i \le 6$, we let $c_i$ denote the condition where a seating arrangement has couple $i$ seated next to each other.
To determine $N(c_1)$, for instance, we consider arranging 11 distinct objects, namely, couple 1 (considered as one object) and the other 10 people. Eleven distinct objects can be arranged around a circular table in $(11 - 1)! = 10!$ ways. However, here $N(c_1) = 2(10!)$, where the 2 takes into account whether the wife in couple 1 is seated to the left or right of her husband. Similarly, $N(c_i) = 2(10!)$, for $2 \le i \le 6$, and $S_1 = \binom{6}{1}2(10!)$.
Continuing, let us now compute $N(c_ic_j)$, for $1 \le i < j \le 6$. Here we are arranging 10 distinct objects: couple $i$ (considered as one object), couple $j$ (likewise considered as one object), and the other eight people. Ten distinct objects can be arranged around a circular table in $(10 - 1)! = 9!$ ways. So here $N(c_ic_j) = 2^2(9!)$ because there are two ways for the wife in couple $i$ to be seated next to her husband, and two ways for the wife in couple $j$ to be seated next to her husband. Consequently, $S_2 = \binom{6}{2}2^2(9!)$.
Similar reasoning shows us that
\[
\begin{aligned}
N(c_1c_2c_3) &= 2^3(8!), & S_3 &= \tbinom{6}{3}2^3(8!), & N(c_1c_2c_3c_4) &= 2^4(7!), & S_4 &= \tbinom{6}{4}2^4(7!), \\
N(c_1c_2c_3c_4c_5) &= 2^5(6!), & S_5 &= \tbinom{6}{5}2^5(6!), & N(c_1c_2c_3c_4c_5c_6) &= 2^6(5!), & S_6 &= \tbinom{6}{6}2^6(5!).
\end{aligned}
\]
With $S_0$ (the total number of arrangements of the 12 people) $= (12 - 1)! = 11!$, we find that the number of arrangements where no couple is seated side by side is
\[
\begin{aligned}
N(\bar{c}_1\bar{c}_2 \cdots \bar{c}_6) &= \sum_{i=0}^{6} (-1)^i S_i = \sum_{i=0}^{6} (-1)^i \binom{6}{i} 2^i (11 - i)! \\
&= 39{,}916{,}800 - 43{,}545{,}600 + 21{,}772{,}800 - 6{,}451{,}200 + 1{,}209{,}600 - 138{,}240 + 7680 \\
&= 12{,}771{,}840.
\end{aligned}
\]
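The sum is easy to evaluate mechanically. The sketch below is a hedged Python illustration: the generalization from 6 couples to $n \ge 2$ couples, and the labeling of persons $2k$ and $2k+1$ as couple $k$, are our assumptions for the purpose of a brute-force comparison on small tables.

```python
from itertools import permutations
from math import comb, factorial


def no_adjacent_couples(couples: int) -> int:
    """sum_i (-1)^i C(n, i) 2^i (2n - 1 - i)! for n >= 2 couples at a round table."""
    n = couples
    return sum(
        (-1) ** i * comb(n, i) * 2 ** i * factorial(2 * n - 1 - i)
        for i in range(n + 1)
    )


def brute_force(couples: int) -> int:
    """Fix person 0 in one seat to factor out rotations, then enumerate the rest.

    Persons 2k and 2k + 1 form couple k (a labeling chosen for this sketch).
    """
    people = 2 * couples
    count = 0
    for rest in permutations(range(1, people)):
        seating = (0,) + rest
        if all(
            seating[s] // 2 != seating[(s + 1) % people] // 2
            for s in range(people)
        ):
            count += 1
    return count


if __name__ == "__main__":
    print(no_adjacent_couples(6))
```

The enumeration grows as $(2n - 1)!$, so it is only a cross-check for two to four couples; the formula handles six couples instantly.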
Our final example recalls some of the graph theory we studied in Chapter 7.
EXAMPLE 8.10
In a certain area of the countryside are five villages. An engineer is to devise a system of two-way roads so that after the system is completed, no village will be isolated. In how many ways can he do this?
Calling the villages a, b, c, d, and e, we seek the number of loop-free undirected graphs
on these vertices, where no vertex is isolated. Consequently, we want to count situations
such as those illustrated in parts (a) and (b) of Fig. 8.3, but not situations such as those
shown in parts (c) and (d).
Figure 8.3
Let $S$ be the set of loop-free undirected graphs $G$ on $V = \{a, b, c, d, e\}$. Then $N = S_0 = |S| = 2^{10}$ because there are $\binom{5}{2} = 10$ possible two-way roads for these five villages, and each road can be either included or excluded.
For each $1 \le i \le 5$, let $c_i$ be the condition that a system of these roads isolates village $a$, $b$, $c$, $d$, and $e$, respectively. The answer to the problem is then $N(\bar{c}_1\bar{c}_2\bar{c}_3\bar{c}_4\bar{c}_5)$.
For condition $c_1$ village $a$ is isolated, so we consider the six edges (roads) $\{b, c\}$, $\{b, d\}$, $\{b, e\}$, $\{c, d\}$, $\{c, e\}$, $\{d, e\}$. With two choices for each edge, namely, put the edge in the graph or leave the edge out, we find that $N(c_1) = 2^6$. Then by symmetry $N(c_i) = 2^6$ for all $2 \le i \le 5$, so $S_1 = \binom{5}{1}2^6$.
When villages $a$ and $b$ are to be isolated, each of the edges $\{c, d\}$, $\{d, e\}$, $\{c, e\}$ may be put in or left out of our graph. This results in $2^3$ possibilities, so $N(c_1c_2) = 2^3$, and $S_2 = \binom{5}{2}2^3$.
Similar arguments tell us that $N(c_1c_2c_3) = 2^1$ and $S_3 = \binom{5}{3}2^1$; $N(c_1c_2c_3c_4) = 2^0$ and $S_4 = \binom{5}{4}2^0$; and $N(c_1c_2c_3c_4c_5) = 2^0$ and $S_5 = \binom{5}{5}2^0$.
Consequently,
\[
N(\bar{c}_1\bar{c}_2\bar{c}_3\bar{c}_4\bar{c}_5) = 2^{10} - \binom{5}{1}2^6 + \binom{5}{2}2^3 - \binom{5}{3}2^1 + \binom{5}{4}2^0 - \binom{5}{5}2^0 = 768.
\]
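The count of 768 can be confirmed by enumerating all $2^{10}$ road systems. The following Python sketch is illustrative only; writing the general term as $\binom{v}{i}2^{\binom{v-i}{2}}$ for $v$ villages is our restatement of the pattern in the example, and the function names are ours.

```python
from itertools import combinations
from math import comb


def no_isolated_formula(v: int) -> int:
    """sum_i (-1)^i C(v, i) 2^C(v - i, 2): isolate i chosen villages, free choice elsewhere."""
    return sum((-1) ** i * comb(v, i) * 2 ** comb(v - i, 2) for i in range(v + 1))


def no_isolated_brute_force(v: int) -> int:
    """Enumerate every subset of the C(v, 2) possible roads and test each village."""
    edges = list(combinations(range(v), 2))
    count = 0
    for mask in range(1 << len(edges)):
        touched = set()
        for bit, (a, b) in enumerate(edges):
            if mask >> bit & 1:            # this road is included in the system
                touched.update((a, b))
        if len(touched) == v:              # every village lies on some road
            count += 1
    return count


if __name__ == "__main__":
    print(no_isolated_formula(5), no_isolated_brute_force(5))
```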
EXERCISES 8.1
1. Let $S$ be a finite set with $|S| = N$ and let $c_1, c_2, c_3, c_4$ be four conditions, each of which may be satisfied by one or more of the elements of $S$. Prove that $N(\bar{c}_2\bar{c}_3\bar{c}_4) = N(c_1\bar{c}_2\bar{c}_3\bar{c}_4) + N(\bar{c}_1\bar{c}_2\bar{c}_3\bar{c}_4)$.
2. Establish the Principle of Inclusion and Exclusion by applying the Principle of Mathematical Induction to the number $t$ of conditions.
3. Of the 100 students in Example 8.3, how many are taking (a) Fundamentals of Computer Programming but none of the other three courses; (b) Fundamentals of Computer Programming and Introduction to Economics but neither of the other two courses?
4. Annually, the 65 members of the maintenance staff sponsor a "Christmas in July" picnic for the 400 summer employees at their company. For these 65 people, 21 bring hot dogs, 35 bring fried chicken, 28 bring salads, 32 bring desserts, 13 bring hot dogs and fried chicken, 10 bring hot dogs and salads, 9 bring hot dogs and desserts, 12 bring fried chicken and salads, 17 bring fried chicken and desserts, 14 bring salads and desserts, 4 bring hot dogs, fried chicken, and salads, 6 bring hot dogs, fried chicken, and desserts, 5 bring hot dogs, salads, and desserts, 7 bring fried chicken, salads, and desserts, and 2 bring all four food items. Those (of the 65) who do not bring any of these four food items are responsible for setting up and cleaning up for the picnic. How many of the 65 maintenance staff will (a) help to set up and clean up for the picnic? (b) bring only hot dogs? (c) bring exactly one food item?
5. Determine the number of positive integers $n$, $1 \le n \le 2000$, that are
a) not divisible by 2, 3, or 5
b) not divisible by 2, 3, 5, or 7
c) not divisible by 2, 3, or 5, but are divisible by 7
6. Determine how many integer solutions there are to $x_1 + x_2 + x_3 + x_4 = 19$, if
a) $0 \le x_i$ for all $1 \le i \le 4$
b) $0 \le x_i < 8$ for all $1 \le i \le 4$
c) $0 \le x_1 \le 5$, $0 \le x_2 \le 6$, $3 \le x_3 \le 7$, $3 \le x_4 \le 8$
7. In how many ways can one arrange all of the letters in the word INFORMATION so that no pair of consecutive letters occurs more than once? [Here we want to count arrangements such as IINNOOFRMTA and FORTMAIINON but not INFORINMOTA (where "IN" occurs twice) or NORTFNOIAMI (where "NO" occurs twice).]
8. Determine the number of integer solutions to $x_1 + x_2 + x_3 + x_4 = 19$ where $-5 \le x_i \le 10$ for all $1 \le i \le 4$.
9. Determine the number of positive integers $x$ where $x \le 9{,}999{,}999$ and the sum of the digits in $x$ equals 31.
10. Professor Bailey has just completed writing the final examination for his course in advanced engineering mathematics. This examination has 12 questions, whose total value is to be 200 points. In how many ways can Professor Bailey assign the 200 points if each question must count for at least 10, but not more than 25, points and the point value for each question is to be a multiple of 5?
11. At Flo's Flower Shop, Flo wants to arrange 15 different plants on five shelves for a window display. In how many ways can she arrange them so that each shelf has at least one, but no more than four, plants?
12. In how many ways can Troy select nine marbles from a bag of twelve (identical except for color), where three are red, three blue, three white, and three green?
13. Find the number of permutations of a, b, c, ..., x, y, z, in which none of the patterns spin, game, path, or net occurs.
14. Answer the question in Example 8.10 for the case of six villages.
15. If eight distinct dice are rolled, what is the probability that all six numbers appear?
16. How many social security numbers (nine-digit sequences) have each of the digits 1, 3, and 7 appearing at least once?
17. In how many ways can three x's, three y's, and three z's be arranged so that no consecutive triple of the same letter appears?
18. Frostburg township sponsors four Boy Scout troops, each with 20 boys. If the head scoutmaster selects 50 of these boys to represent this township at the state jamboree, what is the probability that his selection will include at least one boy from each of the four troops?
19. If Zachary rolls a fair die five times, what is the probability that the sum of his five rolls is 20?
20. At a 12-week conference in mathematics, Sharon met seven of her friends from college. During the conference she met each friend at lunch 35 times, every pair of them 16 times, every trio eight times, every foursome four times, each set of five twice, and each set of six once, but never all seven at once. If she had lunch every day during the 84 days of the conference, did she ever have lunch alone?
21. Compute $\phi(n)$ for $n$ equal to (a) 51; (b) 420; (c) 12300.
22. Compute $\phi(n)$ for $n$ equal to (a) 5186; (b) 5187; (c) 5188.
23. Let $n \in \mathbf{Z}^+$. (a) Determine $\phi(2^n)$. (b) Determine $\phi(2^n p)$, where $p$ is an odd prime.
24. For which $n \in \mathbf{Z}^+$ is $\phi(n)$ odd?
25. How many positive integers $n$ less than 6000 (a) satisfy $\gcd(n, 6000) = 1$? (b) share a common prime divisor with 6000?
26. If $m, n \in \mathbf{Z}^+$, prove that $\phi(n^m) = n^{m-1}\phi(n)$.
27. Find three values for $n \in \mathbf{Z}^+$ where $\phi(n) = 16$.
28. For which positive integers $n$ is $\phi(n)$ a power of 2?
29. For which positive integers $n$ does 4 divide $\phi(n)$?
30. At an upcoming family reunion, five families, each consisting of a husband, wife, and one child, are to be seated around a circular table. In how many ways can these 15 people be arranged around the table so that no family is seated all together? (Here, as in Example 8.9, two seating arrangements are considered the same if one is a rotation of the other.)
8.2
Generalizations of the Principle
Consider a set $S$ with $|S| = N$, and conditions $c_1, c_2, \ldots, c_t$ satisfied by some of the elements of $S$. In Section 8.1 we saw how the Principle of Inclusion and Exclusion provides a way to determine $N(\bar{c}_1\bar{c}_2 \cdots \bar{c}_t)$, the number of elements in $S$ that satisfy none of the $t$ conditions. If $m \in \mathbf{Z}^+$ and $1 \le m \le t$, we now want to determine $E_m$, which denotes the number of elements in $S$ that satisfy exactly $m$ of the $t$ conditions. (At present we can obtain $E_0$.)
We can write formulas such as
\[
E_1 = N(c_1\bar{c}_2\bar{c}_3 \cdots \bar{c}_t) + N(\bar{c}_1c_2\bar{c}_3 \cdots \bar{c}_t) + \cdots + N(\bar{c}_1\bar{c}_2\bar{c}_3 \cdots \bar{c}_{t-1}c_t),
\]
and
\[
E_2 = N(c_1c_2\bar{c}_3 \cdots \bar{c}_t) + N(c_1\bar{c}_2c_3 \cdots \bar{c}_t) + \cdots + N(\bar{c}_1\bar{c}_2\bar{c}_3 \cdots \bar{c}_{t-2}c_{t-1}c_t),
\]
and although these results do not assist us as much as we should like, they will be a useful starting place as we examine the Venn diagrams for the cases where $t = 3$ and 4.
For Fig. 8.4, where t = 3, we place a numbered condition beside the circle representing
those elements of S satisfying that particular condition and we also number each of the
individual regions shown. Then $E_1$ equals the number of elements in regions 2, 3, and 4. But we can also write
\[
E_1 = N(c_1) + N(c_2) + N(c_3) - 2[N(c_1c_2) + N(c_1c_3) + N(c_2c_3)] + 3N(c_1c_2c_3).
\]
In $N(c_1) + N(c_2) + N(c_3)$ we count the elements in regions 5, 6, and 7 twice and those in region 8 three times. In the next term, the elements in regions 5, 6, and 7 are deleted twice. We remove the elements in region 8 six times in $2[N(c_1c_2) + N(c_1c_3) + N(c_2c_3)]$, so we then add on the term $3N(c_1c_2c_3)$ and end up not counting the elements in region 8 at all. Hence we have $E_1 = S_1 - 2S_2 + 3S_3 = S_1 - \binom{2}{1}S_2 + \binom{3}{2}S_3$.
Figure 8.4
When we turn to $E_2$, our earlier formula indicates that we want to count the elements of $S$ in regions 5, 6, and 7. From the Venn diagram,
\[
E_2 = N(c_1c_2) + N(c_1c_3) + N(c_2c_3) - 3N(c_1c_2c_3) = S_2 - 3S_3 = S_2 - \binom{3}{1}S_3,
\]
and
\[
E_3 = N(c_1c_2c_3) = S_3.
\]
In Fig. 8.5, the conditions $c_1$, $c_2$, $c_3$ are associated with circular subsets of $S$, whereas $c_4$ is paired with the rather irregularly shaped area made up of regions 4, 8, 9, 11, 12, 13, 14, and 16. For each $1 \le i \le 4$, $E_i$ is determined as follows:
$E_1$ [regions 2, 3, 4, 5]:
\[
\begin{aligned}
E_1 &= [N(c_1) + N(c_2) + N(c_3) + N(c_4)] \\
&\quad - 2[N(c_1c_2) + N(c_1c_3) + N(c_1c_4) + N(c_2c_3) + N(c_2c_4) + N(c_3c_4)] \\
&\quad + 3[N(c_1c_2c_3) + N(c_1c_2c_4) + N(c_1c_3c_4) + N(c_2c_3c_4)] \\
&\quad - 4N(c_1c_2c_3c_4) \\
&= S_1 - 2S_2 + 3S_3 - 4S_4 = S_1 - \binom{2}{1}S_2 + \binom{3}{2}S_3 - \binom{4}{3}S_4.
\end{aligned}
\]
Note: Taking an element in region 3, we find that it is counted once in $E_1$ and once in $S_1$ [in $N(c_3)$]. Taking an element in region 6, we find that it is not counted in $E_1$; it is counted twice in $S_1$ [in both $N(c_2)$ and $N(c_3)$] but removed twice in $2S_2$ [for it is counted once in $S_2$ in $N(c_2c_3)$], so overall it is not counted. The reader should now consider an element from region 12 and one from region 16 and show that each contributes a count of 0 to both sides of the formula for $E_1$.
Figure 8.5 ($t = 4$)
$E_2$ [regions 6-11]:
From Fig. 8.5, $E_2 = S_2 - 3S_3 + 6S_4 = S_2 - \binom{3}{1}S_3 + \binom{4}{2}S_4$. For details on this formula we examine the results in Table 8.1, where next to each summand of $S_2$, $S_3$, and $S_4$ we list the regions whose elements are counted in determining that particular summand. In calculating $S_2 - 3S_3 + 6S_4$ we find the elements in regions 6-11, which are precisely those that are to be counted for $E_2$.
Table 8.1
S_2                            S_3                          S_4
N(c_1c_2): 7, 13, 15, 16       N(c_1c_2c_3): 15, 16         N(c_1c_2c_3c_4): 16
N(c_1c_3): 10, 14, 15, 16      N(c_1c_2c_4): 13, 16
N(c_1c_4): 11, 13, 14, 16      N(c_1c_3c_4): 14, 16
N(c_2c_3): 6, 12, 15, 16       N(c_2c_3c_4): 12, 16
N(c_2c_4): 8, 12, 13, 16
N(c_3c_4): 9, 12, 14, 16
Finally, the elements for $E_3$ are found in regions 12-15, and $E_3 = S_3 - 4S_4 = S_3 - \binom{4}{1}S_4$; the elements for $E_4$ are those in region 16, and $E_4 = S_4$.
These results suggest the following theorem.
THEOREM 8.2 Under the hypotheses of Theorem 8.1, for each $1 \le m \le t$, the number of elements in $S$ that satisfy exactly $m$ of the conditions $c_1, c_2, \ldots, c_t$ is given by
\[
E_m = S_m - \binom{m+1}{1}S_{m+1} + \binom{m+2}{2}S_{m+2} - \cdots + (-1)^{t-m}\binom{t}{t-m}S_t. \tag{1}
\]
(If $m = 0$, we obtain Theorem 8.1.)
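Proof: Arguing as in Theorem 8.1, let $x \in S$ and consider the following three cases.
a) When $x$ satisfies fewer than $m$ conditions, it contributes a count of 0 to each of the terms $E_m, S_m, S_{m+1}, \ldots, S_t$, so it is not counted on either side of the equation.
b) When $x$ satisfies exactly $m$ of the conditions, it is counted once in $E_m$ and once in $S_m$, but not in $S_{m+1}, \ldots, S_t$. Consequently, it is included once in the count for either side of the equation.
c) Suppose $x$ satisfies $r$ of the conditions, where $m < r \le t$. Then $x$ contributes nothing to $E_m$. Yet it is counted $\binom{r}{m}$ times in $S_m$, $\binom{r}{m+1}$ times in $S_{m+1}$, ..., and $\binom{r}{r}$ times in $S_r$, but 0 times for any term beyond $S_r$. So on the right-hand side of the equation, $x$ is counted
\[
\binom{r}{m} - \binom{m+1}{1}\binom{r}{m+1} + \binom{m+2}{2}\binom{r}{m+2} - \cdots + (-1)^{r-m}\binom{r}{r-m}\binom{r}{r}
\]
times.
For $0 \le k \le r - m$,
\[
\binom{m+k}{k}\binom{r}{m+k} = \frac{(m+k)!}{k!\,m!}\cdot\frac{r!}{(m+k)!\,(r-m-k)!} = \frac{r!}{m!\,k!\,(r-m-k)!} = \frac{r!}{m!\,(r-m)!}\cdot\frac{(r-m)!}{k!\,(r-m-k)!} = \binom{r}{m}\binom{r-m}{k}.
\]
Consequently, on the right-hand side of Eq. (1), $x$ is counted
\[
\begin{aligned}
&\binom{r}{m}\binom{r-m}{0} - \binom{r}{m}\binom{r-m}{1} + \binom{r}{m}\binom{r-m}{2} - \cdots + (-1)^{r-m}\binom{r}{m}\binom{r-m}{r-m} \\
&\quad = \binom{r}{m}\left[\binom{r-m}{0} - \binom{r-m}{1} + \binom{r-m}{2} - \cdots + (-1)^{r-m}\binom{r-m}{r-m}\right] \\
&\quad = \binom{r}{m}[1 - 1]^{r-m} = \binom{r}{m}\cdot 0 = 0 \text{ times},
\end{aligned}
\]
and the formula is verified.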
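Theorem 8.2 can also be tested empirically on a concrete set. The Python sketch below uses illustrative data that is not from the text (divisibility by 2, 3, 5, 7 on the integers 1 through 210); it computes each $S_j$ by enumeration, applies Eq. (1), and compares the result with a direct count. All names are ours.

```python
from itertools import combinations
from math import comb


def exactly_m_by_formula(universe, conditions, m):
    """E_m = sum_{k=0}^{t-m} (-1)^k C(m+k, k) S_{m+k}, with each S_j found by enumeration."""
    t = len(conditions)

    def s(j):
        # S_j: sum of N(c_{i1} ... c_{ij}) over all j-subsets of the conditions.
        return sum(
            sum(1 for x in universe if all(c(x) for c in chosen))
            for chosen in combinations(conditions, j)
        )

    return sum((-1) ** k * comb(m + k, k) * s(m + k) for k in range(t - m + 1))


def exactly_m_directly(universe, conditions, m):
    """Count the elements that satisfy exactly m conditions, one element at a time."""
    return sum(1 for x in universe if sum(c(x) for c in conditions) == m)


# Illustrative data (not from the text): divisibility by 2, 3, 5, 7 on 1..210.
UNIVERSE = range(1, 211)
CONDITIONS = [lambda x, p=p: x % p == 0 for p in (2, 3, 5, 7)]

if __name__ == "__main__":
    for m in range(5):
        print(m, exactly_m_by_formula(UNIVERSE, CONDITIONS, m),
              exactly_m_directly(UNIVERSE, CONDITIONS, m))
```

Agreement on one data set is of course no proof, but a mismatch would flag a transcription error in the coefficients $\binom{m+k}{k}$.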
Based on this result, if $L_m$ denotes the number of elements of $S$ (under the hypotheses of Theorem 8.1) that satisfy at least $m$ of the $t$ conditions, then we have the following formula.
COROLLARY 8.2
\[
L_m = S_m - \binom{m}{m-1}S_{m+1} + \binom{m+1}{m-1}S_{m+2} - \cdots + (-1)^{t-m}\binom{t-1}{m-1}S_t.
\]
Proof: A proof is outlined in the exercises at the end of this section.
When $m = 1$, the result in Corollary 8.2 becomes
\[
\begin{aligned}
L_1 &= S_1 - \binom{1}{0}S_2 + \binom{2}{0}S_3 - \cdots + (-1)^{t-1}\binom{t-1}{0}S_t \\
&= S_1 - S_2 + S_3 - \cdots + (-1)^{t-1}S_t.
\end{aligned}
\]
Comparing this with the result in Theorem 8.1, we find that
\[
L_1 = N - \bar{N} = |S| - \bar{N}.
\]
This result is not much of a surprise, because an element $x$ of $S$ is counted in $L_1$ if it satisfies at least one of the conditions $c_1, c_2, c_3, \ldots, c_t$, that is, if $x \in S$ and $x$ is not counted in $\bar{N} = N(\bar{c}_1\bar{c}_2\bar{c}_3 \cdots \bar{c}_t)$.
EXAMPLE 8.11
Looking back to Example 8.10, we shall find the numbers of systems of two-way roads so that exactly ($E_2$) and at least ($L_2$) two of the villages remain isolated.
The previously calculated results for this example show
\[
E_2 = S_2 - \binom{3}{1}S_3 + \binom{4}{2}S_4 - \binom{5}{3}S_5 = 80 - 3(20) + 6(5) - 10(1) = 40,
\]
\[
L_2 = S_2 - \binom{2}{1}S_3 + \binom{3}{1}S_4 - \binom{4}{1}S_5 = 80 - 2(20) + 3(5) - 4(1) = 51.
\]
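Both formulas are simple enough to evaluate from a list of the $S_i$. A minimal Python sketch follows; the function names `exactly` and `at_least` are ours, and the list `S` is rebuilt from the values $S_i = \binom{5}{i}2^{\binom{5-i}{2}}$ found in Example 8.10.

```python
from math import comb


def exactly(m: int, s: list[int]) -> int:
    """Theorem 8.2: E_m from the list s = [S_0, S_1, ..., S_t]."""
    t = len(s) - 1
    return sum((-1) ** k * comb(m + k, k) * s[m + k] for k in range(t - m + 1))


def at_least(m: int, s: list[int]) -> int:
    """Corollary 8.2: L_m from the list s = [S_0, S_1, ..., S_t], for m >= 1."""
    t = len(s) - 1
    return sum((-1) ** k * comb(m - 1 + k, m - 1) * s[m + k] for k in range(t - m + 1))


# S_0, ..., S_5 for the five villages of Example 8.10: S_i = C(5, i) * 2^C(5 - i, 2).
S = [comb(5, i) * 2 ** comb(5 - i, 2) for i in range(6)]

if __name__ == "__main__":
    print(exactly(2, S), at_least(2, S))
```

As a consistency check, $L_2$ should equal $E_2 + E_3 + E_4 + E_5$, since "at least two" is the disjoint union of the "exactly $j$" cases for $j \ge 2$.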
EXERCISES 8.2
1. For the situation in Examples 8.10 and 8.11 compute $E_i$ for $0 \le i \le 5$ and show that $\sum_{i=0}^{5} E_i = N = |S|$.
2. a) In how many ways can the letters in ARRANGEMENT be arranged so that there are exactly two pairs of consecutive identical letters? at least two pairs of consecutive identical letters?
b) Answer part (a), replacing two with three.
3. In how many ways can one arrange the letters in CORRESPONDENTS so that (a) there is no pair of consecutive identical letters? (b) there are exactly two pairs of consecutive identical letters? (c) there are at least three pairs of consecutive identical letters?
4. Let $A = \{1, 2, 3, \ldots, 10\}$, and $B = \{1, 2, 3, \ldots, 7\}$. How many functions $f: A \to B$ satisfy $|f(A)| = 4$? How many have $|f(A)| \le 4$?
5. In how many ways can one distribute ten distinct prizes among four students with exactly two students getting nothing? How many ways have at least two students getting nothing?
6. Zelma is having a luncheon for herself and nine of the women in her tennis league. On the morning of the luncheon she places name cards at the ten places at her table and then leaves to run a last-minute errand. Her husband, Herbert, comes home from his morning tennis match and unfortunately leaves the back door open. A gust of wind scatters the ten name cards. In how many ways can Herbert replace the ten cards at the places at the table so that exactly four of the ten women will be seated where Zelma had wanted them? In how many ways will at least four of them be seated where they were supposed to be?
7. If 13 cards are dealt from a standard deck of 52, what is the probability that these 13 cards include (a) at least one card from each suit? (b) exactly one void (for example, no clubs)? (c) exactly two voids?
8. The following provides an outline for proving Corollary 8.2. Fill in the needed details.
a) First note that $E_t = L_t = S_t$.
b) What is $E_{t-1}$, and how are $L_t$ and $L_{t-1}$ related?
c) Show that $L_{t-1} = S_{t-1} - \binom{t-1}{t-2}S_t$.
d) For all $1 \le m \le t - 1$, how are $L_m$, $L_{m+1}$, and $E_m$ related?
e) Using the results in steps (a) through (d), establish the corollary by a backward type of induction.
8.3
Derangements: Nothing Is in Its Right Place
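In elementary calculus the Maclaurin series for the exponential function is given by
\[
e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots = \sum_{n=0}^{\infty} \frac{x^n}{n!},
\]
so
\[
e^{-1} = 1 - 1 + \frac{1}{2!} - \frac{1}{3!} + \cdots = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!}.
\]
To five places, $e^{-1} = 0.36788$ and $1 - 1 + (1/2!) - (1/3!) + \cdots - (1/7!) = 0.36786$. Consequently, for all $k \in \mathbf{Z}^+$, if $k \ge 7$, then $\sum_{n=0}^{k} ((-1)^n)/n!$ is a very good approximation to $e^{-1}$.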
We find these ideas helpful in working some of the following examples.
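The quality of the approximation is easy to see numerically. This short Python sketch (the function name `partial_sum` is ours) prints a few partial sums of the series for $e^{-1}$ next to their distance from the value returned by `math.exp`.

```python
from math import exp, factorial


def partial_sum(k: int) -> float:
    """sum_{n=0}^{k} (-1)^n / n!, the k-th partial sum of the series for e^{-1}."""
    return sum((-1) ** n / factorial(n) for n in range(k + 1))


if __name__ == "__main__":
    for k in (3, 5, 7, 10):
        print(k, f"{partial_sum(k):.5f}", f"{abs(partial_sum(k) - exp(-1)):.2e}")
```

Because the series alternates with terms decreasing in size, the error after the $k$th term is at most $1/(k+1)!$.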
EXAMPLE 8.12
While at the racetrack, Ralph bets on each of the ten horses in a race to come in according to how they are favored. In how many ways can they reach the finish line so that he loses all of his bets?
Removing the words horses and racetrack from the problem, we really want to know in how many ways we can arrange the numbers 1, 2, 3, ..., 10 so that 1 is not in first place (its natural position), 2 is not in second place (its natural position), ..., and 10 is not in tenth place (its natural position). These arrangements are called the derangements of 1, 2, 3, ..., 10.
The Principle of Inclusion and Exclusion provides the key to calculating the number of derangements. For each $1 \le i \le 10$, an arrangement of 1, 2, 3, ..., 10 is said to satisfy condition $c_i$ if integer $i$ is in the $i$th place. We obtain the number of derangements, denoted by $d_{10}$, as follows:
\[
\begin{aligned}
d_{10} &= N(\bar{c}_1\bar{c}_2\bar{c}_3 \cdots \bar{c}_{10}) = 10! - \binom{10}{1}9! + \binom{10}{2}8! - \binom{10}{3}7! + \cdots + \binom{10}{10}0! \\
&= 10!\left[1 - \binom{10}{1}(9!/10!) + \binom{10}{2}(8!/10!) - \binom{10}{3}(7!/10!) + \cdots + \binom{10}{10}(0!/10!)\right] \\
&= 10![1 - 1 + (1/2!) - (1/3!) + \cdots + (1/10!)] \approx (10!)(e^{-1}).
\end{aligned}
\]
The sample space here consists of the 10! ways the horses can finish. So the probability that Ralph will lose every bet is approximately $(10!)(e^{-1})/(10!) = e^{-1}$. This probability remains (more or less) the same if the number of horses in the race is 11, 12, .... On the other hand, for $n$ horses, where $n \ge 10$, the probability that our gambler wins at least one of his bets is approximately $1 - e^{-1} \approx 0.63212$.
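The derangement count and its closeness to $n!/e$ can be reproduced in a few lines. The Python sketch below is illustrative; the function names are ours, and the brute-force check is only practical for small $n$.

```python
from itertools import permutations
from math import exp, factorial


def derangements(n: int) -> int:
    """d_n = sum_{i=0}^{n} (-1)^i C(n, i) (n - i)! = sum_i (-1)^i n! / i!."""
    return sum((-1) ** i * (factorial(n) // factorial(i)) for i in range(n + 1))


def derangements_brute_force(n: int) -> int:
    """Count permutations of 1..n with no value in its natural position."""
    return sum(
        1
        for p in permutations(range(1, n + 1))
        if all(value != place for place, value in enumerate(p, start=1))
    )


if __name__ == "__main__":
    d10 = derangements(10)
    print(d10, d10 / factorial(10), exp(-1))
```

The identity $\binom{n}{i}(n-i)! = n!/i!$ keeps every term an exact integer, so no rounding is involved until the final comparison with $e^{-1}$.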
EXAMPLE 8.13
The number of derangements of 1, 2, 3, 4 is
\[
d_4 = 4![1 - 1 + (1/2!) - (1/3!) + (1/4!)] = 4![(1/2!) - (1/3!) + (1/4!)] = (4)(3) - 4 + 1 = 9.
\]
These nine derangements are
2143  3142  4123
2341  3412  4312
2413  3421  4321.
Among the $24 - 9 = 15$ permutations of 1, 2, 3, 4 that are not derangements one finds 1234, 2314, 3241, 1342, 2431, and 2314.
EXAMPLE 8.14
Peggy has seven books to review for the C-H Company, so she hires seven people to review them. She wants two reviews per book, so the first week she gives each person one book to read and then redistributes the books at the start of the second week. In how many ways can she make these two distributions so that she gets two reviews (by different people) of each book?
She can distribute the books in 7! ways the first week. Numbering both the books and the reviewers (for the first week) as 1, 2, ..., 7, for the second distribution she must arrange these numbers so that none of them is in its natural position. This she can do in $d_7$ ways. By the rule of product, she can make the two distributions in $(7!)d_7 \approx (7!)^2(e^{-1})$ ways.
EXERCISES 8.3
1. In how many ways can the integers 1, 2, 3, ..., 10 be arranged in a line so that no even integer is in its natural position?
2. a) List all the derangements of 1, 2, 3, 4, 5 where the first three numbers are 1, 2, and 3, in some order.
b) List all the derangements of 1, 2, 3, 4, 5, 6 where the first three numbers are 1, 2, and 3, in some order.
3. How many derangements are there for 1, 2, 3, 4, 5?
4. How many permutations of 1, 2, 3, 4, 5, 6, 7 are not derangements?
5. a) Let $A = \{1, 2, 3, \ldots, 7\}$. A function $f: A \to A$ is said to have a fixed point if for some $x \in A$, $f(x) = x$. How many one-to-one functions $f: A \to A$ have at least one fixed point?
b) In how many ways can we devise a secret code by assigning to each letter of the alphabet a different letter to represent it?
6. How many derangements of 1, 2, 3, 4, 5, 6, 7, 8 start with (a) 1, 2, 3, and 4, in some order? (b) 5, 6, 7, and 8, in some order?
7. For the positive integers 1, 2, 3, ..., $n - 1$, $n$, there are 11,660 derangements where 1, 2, 3, 4, and 5 appear in the first five positions. What is the value of $n$?
8. Four applicants for a job are to be interviewed for 30 minutes each: 15 minutes with each of supervisors Nancy and Yolanda. (The interviews are in separate rooms, and interviewing starts at 9:00 A.M.) (a) In how many ways can these interviews be scheduled during a one-hour period? (b) One applicant, named Josephine, arrives at 9:00 A.M. What is the probability that she will have her two interviews one after the other? (c) Regina, another applicant, arrives at 9:00 A.M. and hopes to be finished in time to leave by 9:50 A.M. for another appointment. What is the probability that Regina will be able to leave on time?
9. In how many ways can Mrs. Ford distribute ten distinct books to her ten children (one book to each child) and then collect and redistribute the books so that each child has the opportunity to peruse two different books?
10. a) When $n$ balls, numbered 1, 2, 3, ..., $n$ are taken in succession from a container, a rencontre occurs if the $m$th ball withdrawn is numbered $m$, for some $1 \le m \le n$. Find the probability of getting (i) no rencontres; (ii) (exactly) one rencontre; (iii) at least one rencontre; and (iv) $r$ rencontres, where $1 \le r \le n$.
b) Approximate the answers to the questions in part (a).
11. Ten women attend a business luncheon. Each woman checks her coat and attaché case. Upon leaving, each woman is given a coat and case at random. (a) In how many ways can the coats and cases be distributed so that no woman gets either of her possessions? (b) In how many ways can they be distributed so that no woman gets back both of her possessions?
12. Ms. Pezzulo teaches geometry and then biology to a class of 12 advanced students in a classroom that has only 12 desks. In how many ways can she assign the students to these desks so that (a) no student is seated at the same desk for both classes? (b) there are exactly six students each of whom occupies the same desk for both classes?
13. Give a combinatorial argument to verify that for all $n \in \mathbf{Z}^+$,
\[
n! = \binom{n}{0}d_0 + \binom{n}{1}d_1 + \binom{n}{2}d_2 + \cdots + \binom{n}{n}d_n = \sum_{k=0}^{n} \binom{n}{k}d_k.
\]
(For each $1 \le k \le n$, $d_k$ = the number of derangements of 1, 2, 3, ..., $k$; $d_0 = 1$.)
14. a) In how many ways can the integers 1, 2, 3, ..., $n$ be arranged in a line so that none of the patterns 12, 23, 34, ..., $(n - 1)n$ occurs?
b) Show that the result in part (a) equals $d_{n-1} + d_n$. ($d_n$ = the number of derangements of 1, 2, 3, ..., $n$.)
15. Answer part (a) of Exercise 14 if the numbers are arranged in a circle, and, as we count clockwise about the circle, none of the patterns 12, 23, 34, ..., $(n - 1)n$, $n1$ occurs.
16. What is the probability that the gambler in Example 8.12 wins (a) (exactly) five of his bets? (b) at least five of his bets?
8.4
Rook Polynomials
Consider the six-square “chessboard” shown in Fig. 8.6 (Note: The shaded squares are not
part of the chessboard.). In chess a piece called a rook or castle is allowed at one turn to
be moved horizontally or vertically over as many unoccupied spaces as one wishes. Here
a rook in square 3 of the figure could be moved in one turn to squares 1, 2, or 4. A rook at
square 5 could be moved to square 6 or square 2 (even though there is no square between
squares 5 and 2).
For $k \in \mathbf{Z}^+$ we want to determine the number of ways in which $k$ rooks can be placed on the unshaded squares of this chessboard so that no two of them can take each other, that is, no two of them are in the same row or column of the chessboard. This number is denoted by $r_k$, or by $r_k(C)$ if we wish to stress that we are working on a particular chessboard $C$.
For any chessboard, $r_1$ is the number of squares on the board. Here $r_1 = 6$. Two nontaking rooks can be placed at the following pairs of positions: {1, 4}, {1, 5}, {2, 4}, {2, 6}, {3, 5}, {3, 6}, {4, 5}, and {4, 6}, so $r_2 = 8$. Continuing, we find that $r_3 = 2$, using the locations {1, 4, 5} and {2, 4, 6}; $r_k = 0$, for $k \ge 4$.
With $r_0 = 1$, the rook polynomial, $r(C, x)$, for the chessboard in Fig. 8.6 is defined as $r(C, x) = 1 + 6x + 8x^2 + 2x^3$. For each $k \ge 0$, the coefficient of $x^k$ is the number of ways we can place $k$ nontaking rooks on chessboard $C$.
Figure 8.6
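The case-by-case count can be automated for small boards. In the Python sketch below, the coordinates in `BOARD` are an assumption: the figure itself is not available here, so the (row, column) pairs were chosen so that the nontaking pairs agree with the list given in the text. The function name `rook_numbers` is ours.

```python
from itertools import combinations

# Assumed layout for the six-square board: (row, column) pairs chosen so that
# the nontaking pairs match the list in the text ({1,4}, {1,5}, {2,4}, ...).
BOARD = {1: (0, 0), 2: (0, 1), 3: (0, 2), 4: (1, 2), 5: (2, 1), 6: (2, 0)}


def rook_numbers(board: dict[int, tuple[int, int]]) -> list[int]:
    """Return [r_0, r_1, ..., r_n] by testing every k-subset of squares."""
    cells = list(board.values())
    counts = []
    for k in range(len(cells) + 1):
        r_k = 0
        for chosen in combinations(cells, k):
            rows = {r for r, _ in chosen}
            cols = {c for _, c in chosen}
            if len(rows) == k and len(cols) == k:   # no shared row or column
                r_k += 1
        counts.append(r_k)
    return counts


if __name__ == "__main__":
    print(rook_numbers(BOARD))
```

The returned list is the coefficient list of the rook polynomial, padded with zeros for the values of $k$ at which no placement exists.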
What we have done here (using a case-by-case analysis) soon proves tedious. As the size
of the board increases, we have to consider cases wherein numbers such as $r_4$ and $r_5$ are
nonzero. Consequently, we shall now make some observations that will allow us to make
use of small boards and somehow break up a large board into smaller subboards.
The chessboard $C$ in Fig. 8.7 is made up of 11 unshaded squares. We note that $C$ consists of a $2 \times 2$ subboard $C_1$ located in the upper left corner and a seven-square subboard $C_2$ located in the lower right corner. These subboards are disjoint because they have no squares in the same row or column of $C$.
Figure 8.7
Calculating as we did for our first chessboard, here we find
\[
r(C_1, x) = 1 + 4x + 2x^2, \qquad r(C_2, x) = 1 + 7x + 10x^2 + 2x^3,
\]
\[
r(C, x) = 1 + 11x + 40x^2 + 56x^3 + 28x^4 + 4x^5 = r(C_1, x) \cdot r(C_2, x).
\]
Hence $r(C, x) = r(C_1, x) \cdot r(C_2, x)$. But did this occur by luck or is something happening here that we should examine more closely? For example, to obtain $r_3$ for $C$, we need to know in how many ways three nontaking rooks can be placed on board $C$. These fall into three cases:
a) All three rooks are on subboard $C_2$ (and none is on $C_1$): $(2)(1) = 2$ ways.
b) Two rooks are on subboard $C_2$ and one is on $C_1$: $(10)(4) = 40$ ways.
c) One rook is on subboard $C_2$ and two are on $C_1$: $(7)(2) = 14$ ways.
Consequently, three nontaking rooks can be placed on board $C$ in $(2)(1) + (10)(4) + (7)(2) = 56$ ways. Here we see that 56 arises just as the coefficient of $x^3$ does in the product $r(C_1, x) \cdot r(C_2, x)$.
In general, if $C$ is a chessboard made up of pairwise disjoint subboards $C_1, C_2, \ldots, C_n$, then $r(C, x) = r(C_1, x)\,r(C_2, x) \cdots r(C_n, x)$.
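The product rule amounts to multiplying coefficient lists. A minimal Python sketch, using the two polynomials quoted for the subboards of Fig. 8.7 (the helper name `poly_mul` is ours):

```python
def poly_mul(p: list[int], q: list[int]) -> list[int]:
    """Multiply two polynomials given as coefficient lists [a_0, a_1, ...]."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


R_C1 = [1, 4, 2]        # r(C_1, x) = 1 + 4x + 2x^2
R_C2 = [1, 7, 10, 2]    # r(C_2, x) = 1 + 7x + 10x^2 + 2x^3

if __name__ == "__main__":
    print(poly_mul(R_C1, R_C2))
```

The coefficient of $x^3$ in the product collects exactly the three cases listed in the text: $r_0(C_1)r_3(C_2) + r_1(C_1)r_2(C_2) + r_2(C_1)r_1(C_2)$.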
The last result for this section demonstrates the type of principle we have seen in other
results in combinatorial and discrete mathematics: Given a large chessboard, break it into
smaller subboards whose rook polynomials can be determined by inspection.
Figure 8.8 (parts (a), (b), and (c))
Consider chessboard $C$ in Fig. 8.8(a). For $k \ge 1$, suppose we wish to place $k$ nontaking rooks on $C$. For each square of $C$, such as the one designated by ($*$), there are two possibilities to examine.
a) Place one rook on the designated square. Then we remove, as possible locations for the other $k - 1$ rooks, all other squares of $C$ in the same row or column as the designated square. We use $C_s$ to denote the remaining smaller subboard [seen in Fig. 8.8(b)].
b) We do not use the designated square at all. The $k$ rooks are placed on the subboard $C_e$ [$C$ with the one designated square eliminated, as shown in Fig. 8.8(c)].
Since these two cases are all-inclusive and mutually disjoint,
\[
r_k(C) = r_{k-1}(C_s) + r_k(C_e).
\]
From this we see that
\[
r_k(C)x^k = r_{k-1}(C_s)x^k + r_k(C_e)x^k. \tag{1}
\]
If $n$ is the number of squares in the chessboard (here $n$ is 8), then Eq. (1) is valid for all $1 \le k \le n$, and we write
\[
\sum_{k=1}^{n} r_k(C)x^k = \sum_{k=1}^{n} r_{k-1}(C_s)x^k + \sum_{k=1}^{n} r_k(C_e)x^k. \tag{2}
\]
For Eq. (2) we realize that the summations may stop before $k = n$. We have seen cases, as in Fig. 8.6, where $r_n$ and some prior $r_k$'s are 0. The summations start at $k = 1$, for otherwise we could find ourselves with the term $r_{-1}(C_s)x^0$ in the first summand on the right-hand side of Eq. (2).
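Equation (2) may be rewritten as
\[
\sum_{k=1}^{n} r_k(C)x^k = x\sum_{k=1}^{n} r_{k-1}(C_s)x^{k-1} + \sum_{k=1}^{n} r_k(C_e)x^k \tag{3}
\]
or
\[
1 + \sum_{k=1}^{n} r_k(C)x^k = x \cdot r(C_s, x) + \sum_{k=1}^{n} r_k(C_e)x^k + 1,
\]
from which it follows that
\[
r(C, x) = x \cdot r(C_s, x) + r(C_e, x). \tag{4}
\]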
We now use this final equation to determine the rook polynomial for the chessboard shown in part (a) of Fig. 8.8. Each time the idea in Eq. (4) is used, we mark the special square we are using with ($*$). Parentheses are placed about each chessboard to denote the rook polynomial of the board.
[Board diagrams for the successive applications of Eq. (4); only the algebraic lines are recoverable.]
\[
\begin{aligned}
r(C, x) &= \cdots = x^2(1 + 2x) + 2x(1 + 4x + 2x^2) + x(1 + 3x + x^2) + x(1 + 2x) + (1 + 4x + 2x^2) \\
&= 3x + 12x^2 + 7x^3 + x(1 + 2x) + (1 + 4x + 2x^2) = 1 + 8x + 16x^2 + 7x^3.
\end{aligned}
\]
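Equation (4) gives a recursive procedure that works on any small board. The Python sketch below represents a board as a set of (row, column) cells, which is our own encoding; since the layout of Fig. 8.8 is not available here, the sketch is exercised on a full $2 \times 2$ board instead, for which the text quotes $1 + 4x + 2x^2$.

```python
def rook_poly(cells: frozenset) -> list[int]:
    """Coefficient list of r(C, x) via r(C, x) = x * r(C_s, x) + r(C_e, x)."""
    if not cells:
        return [1]                                   # empty board: r_0 = 1
    row, col = next(iter(cells))                     # the designated square (*)
    c_e = cells - {(row, col)}                       # drop only that square
    c_s = frozenset((r, c) for r, c in cells if r != row and c != col)
    with_rook = [0] + rook_poly(c_s)                 # multiply by x
    without = rook_poly(c_e)
    size = max(len(with_rook), len(without))
    with_rook += [0] * (size - len(with_rook))
    without += [0] * (size - len(without))
    total = [a + b for a, b in zip(with_rook, without)]
    while len(total) > 1 and total[-1] == 0:         # trim trailing zeros
        total.pop()
    return total


FULL_2X2 = frozenset((r, c) for r in range(2) for c in range(2))

if __name__ == "__main__":
    print(rook_poly(FULL_2X2))
```

The recursion branches twice per square, so its running time is exponential in the number of squares; splitting a board into disjoint subboards first, as the text recommends, keeps each piece small.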
8.5
Arrangements with Forbidden Positions
The rook polynomials of the previous section seem interesting on their own. Now we shall
find them useful in solving the following problems.
EXAMPLE 8.15
In making seating arrangements for their son’s wedding reception, Grace and Nick are down to four relatives, denoted $R_i$, for $1 \le i \le 4$, who do not get along with one another. There is a single open seat at each of the five tables $T_j$, where $1 \le j \le 5$. Because of family differences,
a) $R_1$ will not sit at $T_1$ or $T_2$.
b) $R_2$ will not sit at $T_2$.
c) $R_3$ will not sit at $T_3$ or $T_4$.
d) $R_4$ will not sit at $T_4$ or $T_5$.
This situation is represented in Fig. 8.9. The number of ways we can seat these four
people at four different tables, and satisfy conditions (a) through (d), is the number of ways
four nontaking rooks can be placed on the chessboard made up of the unshaded squares.
However, since there are only seven shaded squares, as opposed to thirteen unshaded ones,
it would be easier to work with the shaded chessboard.
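Figure 8.9 [board with rows $R_1$ through $R_4$ and columns $T_1$ through $T_5$; the shaded squares are the forbidden positions in conditions (a) through (d)]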
We start with the conditions that are required for us to apply the Principle of Inclusion and Exclusion: For each $1 \le i \le 4$, let $c_i$ be the condition where a seating assignment of these four people (at different tables) is made with relative $R_i$ in a forbidden (shaded) position. As usual, $|S|$ denotes the total number of ways we can place the four relatives, one to a table. Then $|S| = N = S_0 = 5!$.
To determine $S_1$ we consider each of the following:
• $N(c_1) = 4! + 4!$, for there are 4! ways to seat $R_2$, $R_3$, and $R_4$ if $R_1$ is in forbidden position $T_1$ and another 4! ways if $R_1$ is at table $T_2$, his or her other forbidden position.
• $N(c_2) = 4!$, for after placing $R_2$ at forbidden table $T_2$, we must place $R_1$, $R_3$, and $R_4$ at $T_1$, $T_3$, $T_4$, and $T_5$, one person to a table.
• $N(c_3) = 4! + 4!$, one summand for $R_3$ being in forbidden position $T_3$, and the other summand for $R_3$ being in the forbidden position $T_4$.
• $N(c_4) = 4! + 4!$, each of the two summands arising when $R_4$ is placed at each of the two forbidden positions $T_4$ and $T_5$.
Hence $S_1 = 7(4!)$.
Turning to $S_2$ we have these considerations:
• $N(c_1c_2) = 3!$, because after we place $R_1$ at $T_1$ and $R_2$ at $T_2$, there are three tables ($T_3$, $T_4$, and $T_5$) where $R_3$ and $R_4$ can be seated.
• $N(c_1c_3) = 3! + 3! + 3! + 3!$, because there are four cases where $R_1$ and $R_3$ are located at forbidden positions:
i) $R_1$ at $T_1$; $R_3$ at $T_3$
ii) $R_1$ at $T_2$; $R_3$ at $T_3$
iii) $R_1$ at $T_1$; $R_3$ at $T_4$
iv) $R_1$ at $T_2$; $R_3$ at $T_4$.
In a similar manner we find that $N(c_1c_4) = 4(3!)$, $N(c_2c_3) = 2(3!)$, $N(c_2c_4) = 2(3!)$, and $N(c_3c_4) = 3(3!)$. Consequently, $S_2 = 16(3!)$.
Before continuing, we make a few observations about $S_1$ and $S_2$. For $S_1$ we have $7(4!) = 7(5 - 1)!$, where 7 is the number of shaded squares in Fig. 8.9. Also, $S_2 = 16(3!) = 16(5 - 2)!$, where 16 is the number of ways two nontaking rooks can be placed on the shaded chessboard.
In general, for all $0 \le i \le 4$, $S_i = r_i(5 - i)!$, where $r_i$ is the number of ways in which it is possible to place i nontaking rooks on the shaded chessboard shown in Fig. 8.9.
Consequently, to expedite the solution of this problem, we turn to r(C, x), the rook polynomial of this shaded chessboard. Using the decomposition of C into the disjoint subboards in the upper left and lower right corners, we find that
$$r(C, x) = (1 + 3x + x^2)(1 + 4x + 3x^2) = 1 + 7x + 16x^2 + 13x^3 + 3x^4,$$
so
$$N(\bar{c}_1\bar{c}_2\bar{c}_3\bar{c}_4) = S_0 - S_1 + S_2 - S_3 + S_4 = 5! - 7(4!) + 16(3!) - 13(2!) + 3(1!) = \sum_{i=0}^{4} (-1)^i r_i (5 - i)! = 25.$$
Grace and Nick can breathe a sigh of relief. There are 25 ways in which they can seat
these last four relatives at the reception and avoid any squabbling.
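As a cross-check on Example 8.15, the alternating sum can be compared with a direct enumeration of all seatings. The Python sketch below is illustrative only; the names `FORBIDDEN`, `ROOK_COEFFS`, `count_by_rook_polynomial`, and `count_by_brute_force` are hypothetical, and the forbidden squares are read off conditions (a) through (d). The division by `(tables - people)!` is a small generalization for boards with more than one spare column; here it equals 1.

```python
from itertools import permutations
from math import factorial

# Forbidden (relative, table) pairs from conditions (a)-(d) of Example 8.15.
FORBIDDEN = {(1, 1), (1, 2), (2, 2), (3, 3), (3, 4), (4, 4), (4, 5)}

# Coefficients r_0..r_4 of r(C, x) = 1 + 7x + 16x^2 + 13x^3 + 3x^4.
ROOK_COEFFS = [1, 7, 16, 13, 3]


def count_by_rook_polynomial(coeffs, people, tables):
    """Alternating sum of r_i * (tables - i)! / (tables - people)!."""
    slack = factorial(tables - people)
    return sum((-1) ** i * r * (factorial(tables - i) // slack)
               for i, r in enumerate(coeffs))


def count_by_brute_force(forbidden, people, tables):
    """Count assignments of people to distinct tables with no forbidden pair."""
    total = 0
    for seats in permutations(range(1, tables + 1), people):
        if all((person, table) not in forbidden
               for person, table in enumerate(seats, start=1)):
            total += 1
    return total


print(count_by_rook_polynomial(ROOK_COEFFS, 4, 5))   # 25
print(count_by_brute_force(FORBIDDEN, 4, 5))         # 25
```

The enumeration visits only 5 · 4 · 3 · 2 = 120 seatings here; the rook polynomial earns its keep on larger boards, where listing every arrangement is no longer practical.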
The next example demonstrates how a bit of rearranging of our chessboard can help in
our calculations.
EXAMPLE 8.16
We have a pair of dice; one is red, the other green. We roll these dice six times. What is the probability that we obtain all six values on both the red die and the green die if we know that the ordered pairs (1, 2), (2, 1), (2, 5), (3, 4), (4, 1), (4, 5), and (6, 6) did not occur? [Here an ordered pair (a, b) indicates a on the red die and b on the green.]
Recognizing this problem as one dealing with permutations and forbidden positions, we construct the chessboard shown in Fig. 8.10(a), where the row labels represent the outcome on the red die, the column labels the outcome on the green die, and the shaded squares constitute the forbidden positions. In this figure the shaded squares are scattered. Relabeling the rows and columns, we can redraw the chessboard as shown in Fig. 8.10(b), where we have taken shaded squares in the same row (or column) of the board shown in part (a) and made them adjacent. In Fig. 8.10(b), the chessboard C (of seven shaded squares) is the union of four pairwise disjoint subboards, and so
$$r(C, x) = (1 + 4x + 2x^2)(1 + x)^3 = 1 + 7x + 17x^2 + 19x^3 + 10x^4 + 2x^5.$$
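Figure 8.10 [(a) 6 × 6 board with rows (red die) and columns (green die) labeled 1, 2, 3, 4, 5, 6; (b) the same board with rows relabeled 1, 2, 4, 3, 5, 6 and columns relabeled 1, 5, 3, 4, 2, 6; the seven forbidden squares are shaded]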
For each $1 \le i \le 6$, define $c_i$ as the condition where, having rolled the dice six times, we find that all six values occur on both the red die and the green die, but i on the red die is paired with one of the forbidden numbers on the green die. [Note that $N(c_5) = 0$.] Then the number of (ordered) sequences of the six rolls of the dice for the event we are interested in is
$$(6!)\,N(\bar{c}_1\bar{c}_2\cdots\bar{c}_6) = (6!)\sum_{i=0}^{6}(-1)^i S_i = (6!)\sum_{i=0}^{6}(-1)^i r_i(6 - i)!$$
$$= 6![6! - 7(5!) + 17(4!) - 19(3!) + 10(2!) - 2(1!) + 0(0!)] = 6![192] = 138{,}240.$$
Since the sample space consists of all sequences of six ordered pairs selected with repetition from the 29 unshaded squares of the chessboard, the probability of this event is $138{,}240/(29)^6 \approx 0.00023$.
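The arithmetic of Example 8.16 can be retraced by multiplying the rook polynomials of the four disjoint subboards and then forming the alternating sum. This Python sketch is illustrative; the helper `poly_mul` and the variable names are assumptions introduced here, with polynomials stored as low-to-high coefficient lists.

```python
from math import factorial


def poly_mul(p, q):
    """Multiply two polynomials given as low-to-high coefficient lists."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result


# Rook polynomials of the four pairwise disjoint subboards in Fig. 8.10(b):
# one full 2 x 2 block and three single squares.
subboards = [[1, 4, 2], [1, 1], [1, 1], [1, 1]]

rook = [1]
for sub in subboards:
    rook = poly_mul(rook, sub)

# Pairings of the six red values with the six green values
# that avoid every shaded square.
allowed = sum((-1) ** i * r * factorial(6 - i) for i, r in enumerate(rook))
sequences = factorial(6) * allowed       # order the six rolls
probability = sequences / 29 ** 6

print(rook)                    # [1, 7, 17, 19, 10, 2]
print(allowed, sequences)      # 192 138240
print(round(probability, 5))   # 0.00023
```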
Our last example provides a unifying idea for what we have done in this section.
EXAMPLE 8.17
Let A = {1, 2, 3, 4} and B = {u, v, w, x, y, z}. How many one-to-one functions $f: A \to B$ satisfy none of the following conditions:
$c_1$: f(1) = u or v
$c_2$: f(2) = w
$c_3$: f(3) = w or x
$c_4$: f(4) = x, y, or z
As in our two prior examples, we construct a chessboard, as shown in Fig. 8.11. Here we are really interested in the chessboard C made up of the eight shaded squares (which comprise two disjoint subboards). Now
$$r(C, x) = (1 + 2x)(1 + 6x + 9x^2 + 2x^3) = 1 + 8x + 21x^2 + 20x^3 + 4x^4.$$
So
$$N(\bar{c}_1\bar{c}_2\bar{c}_3\bar{c}_4) = S_0 - S_1 + S_2 - S_3 + S_4 = (6!/2!) - 8(5!/2!) + 21(4!/2!) - 20(3!/2!) + 4(2!/2!) = \sum_{i=0}^{4} (-1)^i r_i (6 - i)!/2! = 76$$
and there are 76 one-to-one functions $f: A \to B$ where none of the conditions $c_1$, $c_2$, $c_3$, $c_4$ is satisfied.
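Figure 8.11 [board with rows 1, 2, 3, 4; the eight shaded squares are the forbidden values in conditions $c_1$ through $c_4$]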
Even more so, look back at $N(\bar{c}_1\bar{c}_2\bar{c}_3\bar{c}_4)$ in Example 8.15. Disregarding the vocabulary of the “relatives” and “tables,” we realize that we are counting the number of one-to-one functions $g: \{R_1, R_2, R_3, R_4\} \to \{T_1, T_2, T_3, T_4, T_5\}$ where none of the conditions $c_1$, $c_2$, $c_3$, $c_4$ is satisfied.
Finally, for A = {1, 2, 3, 4, 5, 6, 7, 8}, suppose we want to count the number of one-to-one functions $h: A \to A$ where $h(i) \ne i$ for all $i \in A$. Here the rook polynomial would be
$$r(C, x) = (1 + x)^8 = \sum_{k=0}^{8} \binom{8}{k} x^k,$$
and we find that the number of such one-to-one functions h is
$$\binom{8}{0}8! - \binom{8}{1}7! + \binom{8}{2}6! - \cdots + \binom{8}{8}0! = 8!\left[1 - 1 + \frac{1}{2!} - \frac{1}{3!} + \cdots + \frac{1}{8!}\right] = d_8,$$
the number of derangements of 1, 2, 3, ..., 8.
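A short numerical check of the derangement count: with the eight forbidden squares on the main diagonal, any k of them form a nontaking placement, so $r_k = \binom{8}{k}$. The Python sketch below (function names are hypothetical) evaluates the alternating sum and compares it with a direct enumeration of all 8! permutations.

```python
from itertools import permutations
from math import comb, factorial


def derangements_by_rook_polynomial(n):
    """Alternating sum of C(n, k) * (n - k)!, using r_k = C(n, k)."""
    return sum((-1) ** k * comb(n, k) * factorial(n - k)
               for k in range(n + 1))


def derangements_by_enumeration(n):
    """Count permutations h of 0..n-1 with h(i) != i for every i."""
    return sum(all(h_i != i for i, h_i in enumerate(h))
               for h in permutations(range(n)))


print(derangements_by_rook_polynomial(8))   # 14833
print(derangements_by_enumeration(8))       # 14833
```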
EXERCISES 8.4 AND 8.5
1. Verify directly the rook polynomials for (a) the unshaded chessboards in Figs. 8.7 and 8.8(a), and (b) the shaded chessboards in Figs. 8.9 and 8.10(b).
2. Construct or describe a smallest (least number of squares) chessboard for which $r_{10} \ne 0$.
3. a) Find the rook polynomial for the standard 8 × 8 chessboard.
b) Answer part (a) with 8 replaced by n, for $n \in \mathbf{Z}^+$.
4. Find the rook polynomials for the shaded chessboards in Fig. 8.12.
Figure 8.12 [shaded chessboards]
5. a) Find the rook polynomials for the shaded chessboards in Fig. 8.13.
b) Generalize the chessboard (and rook polynomial) for Fig. 8.13().
Figure 8.13 [shaded chessboards, panels (i), (ii), (iii), and (iv)]
6. a) Let C be a chessboard that has m rows and n columns, with $m \le n$ (for a total of mn squares). For $0 \le k \le m$, in how many ways can we arrange k (identical) nontaking rooks on C?
b) For the chessboard C in part (a), determine the rook polynomial r(C, x).
7. Professor Ruth has five graders to correct programs in her courses in Java, C++, SQL, Perl, and VHDL. Graders Jeanne and Charles both dislike SQL. Sandra wants to avoid C++ and VHDL. Paul detests Java and C++, and Todd refuses to work in SQL and Perl. In how many ways can Professor Ruth assign each grader to correct programs in one language, cover all five languages, and keep everyone content?
8. Why do we have 6! in the term $(6!)N(\bar{c}_1\bar{c}_2\cdots\bar{c}_6)$ for the solution of Example 8.16?
9. Five professors named Al, Violet, Lynn, Jack, and Mary Lou are to be assigned to teach one class each from among calculus I, calculus II, calculus III, statistics, and combinatorics. Al will not teach calculus II or combinatorics, Lynn cannot stand statistics, Violet and Mary Lou both refuse to teach calculus I or calculus III, and Jack detests calculus II.
a) In how many ways can the head of the mathematics department assign each of these professors one of these five courses and still keep peace in the department?
b) For the assignments in part (a), what is the probability that Violet will get to teach combinatorics?
10. A pair of dice, one red and the other green, is rolled six times. We know that the ordered pairs (1, 1), (1, 5), (2, 4), (3, 6), (4, 2), (4, 4), (5, 1), and (5, 5) did not come up. What is the probability that every value came up on both the red die and the green one?
11. A computer dating service wants to match each of four women with one of six men. According to the information these applicants provided when they joined the service, we can draw the following conclusions.
• Woman 1 would not be compatible with man 1, 3, or 6.
• Woman 2 would not be compatible with man 2 or 4.
• Woman 3 would not be compatible with man 3 or 6.
• Woman 4 would not be compatible with man 4 or 5.
In how many ways can the service successfully match each of the four women with a compatible partner?
12. For A = {1, 2, 3, 4, 5} and B = {u, v, w, x, y, z}, determine the number of one-to-one functions $f: A \to B$ where $f(1) \ne v, w$; $f(2) \ne u, w$; $f(3) \ne x$; and $f(4) \ne v, x, y$.
8.6
Summary and Historical Review
In the first and third chapters of this text we were concerned with enumeration problems
in which we had to be careful of situations wherein arrangements or selections were over-
counted. This situation became even more involved in Chapter 5 when we tried to count
the number of onto functions for two finite sets.
With Venn diagrams to lead the way, in this chapter we obtained a pattern called the
Principle of Inclusion and Exclusion. Using this principle, we restated each problem in terms
of conditions and subsets. Using enumeration formulas on permutations and combinations
that were developed earlier, we solved some simpler subproblems and let the principle
manage our concern about overcounting. As a result, we were able to solve a variety of
problems, some dealing with number theory and one with graph theory. We also proved the
formula conjectured earlier in Section 5.3 for the number of onto functions for two finite
sets.
This principle has an interesting history, being found in different manuscripts under such
names as the “Sieve Method” or the “Principle of Cross Classification.” A set-theoretic
version of the principle, which concerned itself with set unions and intersections, is found
in Doctrine of Chances (1718), a text on probability theory by Abraham DeMoivre (1667-1754). Somewhat earlier, in 1708, Pierre Rémond de Montmort (1678-1719) used the idea behind the principle in his solution of the problem generally known as le problème des rencontres (matches). (In this old French card game the 52 cards in a first deck are arranged
face up in a row — perhaps on a table. Then the 52 cards of a second deck are dealt, with
one new card being placed on each of the 52 cards previously arranged on the table top.
The score for the game is determined by counting the resulting matches, where both the
suit and the face value for each of the two cards must match.)
Credit for the way we developed and dealt with the Principle of Inclusion and Exclusion
belongs to James Joseph Sylvester (1814-1897). (This colorful English-born mathemati-
cian also made major contributions in the theory of equations; the theory of matrices and
determinants; and invariant theory, which he founded with Arthur Cayley (1821-1895).
In addition Sylvester founded the American Journal of Mathematics, the first American
journal established for mathematical research.) The importance of the inclusion-exclusion
technique was not generally appreciated, however, until somewhat later, when the publica-
tion Choice and Chance by W. A. Whitworth [10] made mathematicians more aware of its
potential and use.
James Joseph Sylvester (1814-1897)
For more on the application of this principle, examine Chapter 4 of C. L. Liu [4], Chapter 2 of H. J. Ryser [8], or Chapter 8 of A. Tucker [9]. More number-theoretic results related to the principle, including the Möbius inversion formula, can be found in Chapter 2 of M. Hall [1], Chapter X of C. L. Liu [5], and Chapter 16 of G. H. Hardy and E. M. Wright [3]. An extension of this formula is given in the article by G. C. Rota [7].
The article by D. Hanson, K. Seyffarth, and J. H. Weston [2] provides an interesting generalization of the derangement problem discussed in Section 8.3. The ideas behind the rook polynomials and their applications were developed in the late 1930s and during the 1940s and 1950s. Additional material on this topic is found in Chapters 7 and 8 of J. Riordan [6].
REFERENCES
1. Hall, Marshall, Jr. Combinatorial Theory. Waltham, Mass.: Blaisdell, 1967.
2. Hanson, Denis, Seyffarth, Karen, and Weston, J. Harley. “Matchings, Derangements, Rencontres.” Mathematics Magazine 56, no. 4 (September 1983): pp. 224-229.
3. Hardy, Godfrey Harold, and Wright, Edward Maitland. An Introduction to the Theory of Numbers, 5th ed. Oxford: Oxford University Press, 1979.
4. Liu, C. L. Introduction to Combinatorial Mathematics. New York: McGraw-Hill, 1968.
5. Liu, C. L. Topics in Combinatorial Mathematics. Mathematical Association of America, 1972.
6. Riordan, John. An Introduction to Combinatorial Analysis. Princeton, N.J.: Princeton University Press, 1980. (Originally published in 1958 by John Wiley & Sons.)
7. Rota, Gian Carlo. “On the Foundations of Combinatorial Theory, I. Theory of Möbius Functions.” Zeitschrift für Wahrscheinlichkeitstheorie 2 (1964): pp. 340-368.
8. Ryser, Herbert J. Combinatorial Mathematics. Carus Mathematical Monograph, No. 14. Published by the Mathematical Association of America, distributed by John Wiley & Sons, New York, 1963.
9. Tucker, Alan. Applied Combinatorics, 4th ed. New York: Wiley, 2002.
10. Whitworth, William Allen. Choice and Chance. Originally published at Cambridge in 1867. Reprint of the 5th ed. (1901), Hafner, New York, 1965.
SUPPLEMENTARY EXERCISES
1. Determine how many $n \in \mathbf{Z}^+$ satisfy $n \le 500$ and are not divisible by 2, 3, 5, 6, 8, or 10.
2. How many integers n are such that $0 \le n < 1{,}000{,}000$ and the sum of the digits in n is less than or equal to 37?
3. At next week's church bazaar, Joseph and his cousin Jeffrey must arrange six baseballs, six footballs, six soccer balls, and six volleyballs on the four shelves in the sports booth sponsored by their Boy Scout troop. In how many ways can they do this so that there are at least two, but no more than seven, balls on each shelf? (Here all six balls for any one of the four sports are identical in appearance.)
4. Find the number of positive integers n where $1 \le n \le 1000$ and n is not a perfect square, cube, or fourth power.
5. In how many ways can we arrange the integers 1, 2, 3, ..., 8 in a line so that there are no occurrences of the patterns 12, 23, ..., 78, 81?
6. a) If we have k different colors available, in how many ways can we paint the walls of a pentagonal room if adjacent walls are to be painted with different colors?
b) What is the smallest value of k for which such a coloring is possible?
7. Ten students take a physics test in a certain room. When the test is over the students take a break and then return to the room to discuss their answers to the test questions. If there are 14 chairs in this room, in how many ways can the students seat themselves after the break so that no one is in the same chair he, or she, occupied during the test?
8. Using the result of Theorem 8.2, prove that the number of ways we can place s different objects in n distinct containers with m containers each containing exactly r of the objects is
$$\frac{(-1)^m\, n!\, s!}{m!} \sum_{i=m}^{n} \frac{(-1)^i (n - i)^{s - ir}}{(i - m)!\,(n - i)!\,(s - ir)!\,(r!)^i}.$$
9. If an arrangement of the letters in SURREPTITIOUS is selected at random, what is the probability that it contains (a) (exactly) three pairs of consecutive identical letters? (b) at most three pairs of consecutive identical letters?
10. In how many ways can four w's, four x's, four y's, and four z's be arranged so that there is no consecutive quadruple of the same letter?
11. a) Given n distinct objects, in how many ways can we select r of these objects so that each selection includes some particular m of the n objects? (Here $m \le r \le n$.)
b) Using the Principle of Inclusion and Exclusion, prove that for $m \le r \le n$, [displayed identity not recoverable from the source].
12. a) Let $\lambda \in \mathbf{Z}^+$. If we have $\lambda$ different colors available, in how many ways can we color the vertices of the graph shown in Fig. 8.14(a) so that no adjacent vertices share the same color? This result in $\lambda$ is called the chromatic polynomial of the graph, and the smallest value of $\lambda$ for which the value of this polynomial is positive is called the chromatic number of the graph. What is the chromatic number of this graph? (We shall pursue this idea further in Chapter 11.)
b) If there are six colors available, in how many ways can the rooms $R_i$, $1 \le i \le 5$, shown in Fig. 8.14(b) be painted so that rooms with a common doorway, $D_j$, $1 \le j \le 5$, are painted with different colors?
Figure 8.14 [panels (a) and (b)]
13. Find the number of ways to arrange the letters in LAPTOP so that none of the letters L, A, T, O is in its original position and the letter P is not in the third or sixth position.
14. For $n \in \mathbf{Z}^+$ prove that if $\phi(n) = n - 1$ then n is prime.
15. Let $D_{18}$ denote the set of positive divisors of 18. For $d \in D_{18}$ let $S_d = \{n \mid 0 < n \le 18 \text{ and } \gcd(n, 18) = d\}$. (a) Show that the collection $S_d$, $d \in D_{18}$, provides a partition of {1, 2, 3, 4, ..., 17, 18}. (b) Note that $|S_1| = 6 = \phi(18)$ and $|S_2| = 6 = \phi(9)$. For each $d \in D_{18}$, express $|S_d|$ in terms of Euler's phi function.
16. For $m \in \mathbf{Z}^+$ let $D_m = \{d \in \mathbf{Z}^+ \mid d \text{ divides } m\}$. For $d \in D_m$ let $S_d = \{n \mid 0 < n \le m \text{ and } \gcd(n, m) = d\}$. (a) Show that the collection $S_d$, $d \in D_m$, provides a partition of {1, 2, 3, 4, ..., m − 1, m}. (b) Determine $|S_d|$ for each $d \in D_m$.
17. If $n \in \mathbf{Z}^+$, prove that (a) $\phi(2n) = 2\phi(n)$ when n is even; and (b) $\phi(2n) = \phi(n)$ when n is odd.
18. Let $a, b, c \in \mathbf{Z}^+$ with $c = \gcd(a, b)$. Prove that $\phi(ab)\phi(c) = \phi(a)\phi(b)c$.
19. Caitlyn has 48 different books: 12 each in mathematics, chemistry, physics, and computer science. These books are arranged on four shelves in her office with all books on any one subject on its own shelf. When her office is cleaned, the 48 books are taken down and then replaced on the shelves, once again with all 12 books on any one subject on its own shelf. In how many ways can this be done so that (a) no subject is on its original shelf? (b) one subject is on its original shelf? (c) no subject is on its original shelf and no book is in its original position? [For example, the book originally in the third (from the left) position on the first shelf must not be replaced on the first shelf and must not be in the third (from the left) position on the shelf where it is placed.]
9
Generating Functions
In this chapter and the next, we continue our study of enumeration, introducing at this time the important concept of the generating function.
The problem of making selections, with repetitions allowed, was studied in Chapter 1. There we sought, for example, the number of integer solutions to the equation $c_1 + c_2 + c_3 + c_4 = 25$ where $c_i \ge 0$ for all $1 \le i \le 4$. With the Principle of Inclusion and Exclusion, in Chapter 8, we were able to solve a more restricted version of the problem, such as $c_1 + c_2 + c_3 + c_4 = 25$ with $0 \le c_i < 10$ for all $1 \le i \le 4$. If, in addition, we wanted $c_2$ to be even and $c_3$ to be a multiple of 3, we could apply the results of Chapters 1 and 8 to several subcases.
The power of the generating function rests upon its ability not only to solve the kinds of
problems we have considered so far but also to aid us in new situations where additional
restrictions may be involved.
9.1
Introductory Examples
Instead of defining a generating function at this point, we shall examine some examples
that motivate the idea.
EXAMPLE 9.1
While shopping one Saturday, Mildred buys 12 oranges for her children, Grace, Mary, and Frank. In how many ways can she distribute the oranges so that Grace gets at least four, and Mary and Frank get at least two, but Frank gets no more than five? Table 9.1 lists all the possible distributions. We see that we have all the integer solutions to the equation $c_1 + c_2 + c_3 = 12$ where $4 \le c_1$, $2 \le c_2$, and $2 \le c_3 \le 5$.
Table 9.1
G  M  F      G  M  F
4  3  5      6  2  4
4  4  4      6  3  3
4  5  3      6  4  2
4  6  2      7  2  3
5  2  5      7  3  2
5  3  4      8  2  2
5  4  3
5  5  2
Considering the first two cases in this table, we find the solutions 4 + 3 + 5 = 12 and 4 + 4 + 4 = 12. Now where in our prior algebraic experiences did anything like this happen? When multiplying polynomials we add the powers of the variable, and here, when we multiply the three polynomials,
$$(x^4 + x^5 + x^6 + x^7 + x^8)(x^2 + x^3 + x^4 + x^5 + x^6)(x^2 + x^3 + x^4 + x^5),$$
two of the ways to obtain $x^{12}$ are as follows:
1) From the product $x^4x^3x^5$, where $x^4$ is taken from $(x^4 + x^5 + x^6 + x^7 + x^8)$, $x^3$ from $(x^2 + x^3 + x^4 + x^5 + x^6)$, and $x^5$ from $(x^2 + x^3 + x^4 + x^5)$.
2) From the product $x^4x^4x^4$, where the first $x^4$ is found in the first polynomial, the second $x^4$ in the second polynomial, and the third $x^4$ in the third polynomial.
Examining the product
$$(x^4 + x^5 + x^6 + x^7 + x^8)(x^2 + x^3 + x^4 + x^5 + x^6)(x^2 + x^3 + x^4 + x^5)$$
more closely, we realize that we obtain the product $x^ix^jx^k$ for every triple (i, j, k) that appears in Table 9.1. Consequently, the coefficient of $x^{12}$ in
$$f(x) = (x^4 + x^5 + x^6 + x^7 + x^8)(x^2 + x^3 + x^4 + x^5 + x^6)(x^2 + x^3 + x^4 + x^5)$$
counts the number of distributions (namely, 14) that we seek. The function f(x) is called a generating function for the distributions.
But where did the factors in this product come from?
The factor $x^4 + x^5 + x^6 + x^7 + x^8$, for example, indicates that we can give Grace 4 or 5 or 6 or 7 or 8 of the oranges. Once again we make use of the interplay between the exclusive or and ordinary addition. The coefficient of each power of x is 1 because, considering the oranges as identical objects, there is only one way to give Grace four oranges, one way to give her five oranges, and so on. Since Mary and Frank must each receive at least two oranges, the other terms $(x^2 + x^3 + x^4 + x^5 + x^6)$ and $(x^2 + x^3 + x^4 + x^5)$ start with $x^2$, and for Frank we stop at $x^5$ so that he doesn’t receive more than five oranges. (Why does the term for Mary stop at $x^6$?)
Most of us are reasonably convinced now that the coefficient of $x^{12}$ in f(x) yields the answer. Some, however, may be a bit skeptical about this new idea. It seems that we could list the cases in Table 9.1 faster than we could multiply out the three factors in f(x) or calculate the coefficient of $x^{12}$ in f(x). At present that may seem true. But as we progress to problems with more unknowns and larger quantities to distribute, the generating function will more than demonstrate its worth. (The reader may realize that the rook polynomials of Chapter 8 are examples of generating functions.) For now we consider two more examples.
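For readers who would rather let a machine multiply out f(x), the following Python sketch expands the three factors of Example 9.1 and reads off the coefficient of $x^{12}$. It is an illustration under stated assumptions: the helpers `poly_mul` and `powers` are names introduced here, and a polynomial is stored as a list whose index k holds the coefficient of $x^k$.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as low-to-high coefficient lists."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result


def powers(lo, hi):
    """Coefficient list of x^lo + x^(lo+1) + ... + x^hi."""
    return [0] * lo + [1] * (hi - lo + 1)


grace = powers(4, 8)    # Grace receives 4 to 8 oranges
mary = powers(2, 6)     # Mary receives 2 to 6 oranges
frank = powers(2, 5)    # Frank receives 2 to 5 oranges

f = poly_mul(poly_mul(grace, mary), frank)
print(f[12])            # 14, the number of rows in Table 9.1
```

The same two helpers handle any problem of this shape: one factor per recipient, with the allowed amounts as exponents.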
EXAMPLE 9.2
If there is an unlimited number (or at least 24 of each color) of red, green, white, and black jelly beans, in how many ways can Douglas select 24 of these candies so that he has an even number of white beans and at least six black ones?
The polynomials associated with the jelly bean colors are as follows:
• red (green): $1 + x + x^2 + \cdots + x^{24}$, where the leading 1 is for $1x^0$, because one possibility for the red (and green) jelly beans is that none of that color is selected
• white: $(1 + x^2 + x^4 + x^6 + \cdots + x^{24})$
• black: $(x^6 + x^7 + x^8 + \cdots + x^{24})$
So the answer to the problem is the coefficient of $x^{24}$ in the generating function
$$f(x) = (1 + x + x^2 + \cdots + x^{24})^2 (1 + x^2 + x^4 + \cdots + x^{24})(x^6 + x^7 + \cdots + x^{24}).$$
One such selection is five red, three green, eight white, and eight black jelly beans. This arises from $x^5$ in the first factor, $x^3$ in the second factor, and $x^8$ in the last two factors.
One more example before closing this section!
EXAMPLE 9.3
How many integer solutions are there for the equation $c_1 + c_2 + c_3 + c_4 = 25$ if $0 \le c_i$ for all $1 \le i \le 4$?
We can alternatively ask in how many ways 25 (identical) pennies can be distributed among four children.
For each child the possibilities can be described by the polynomial $1 + x + x^2 + x^3 + \cdots + x^{25}$. Then the answer to this problem is the coefficient of $x^{25}$ in the generating function
$$f(x) = (1 + x + x^2 + \cdots + x^{25})^4.$$
The answer can also be obtained as the coefficient of $x^{25}$ in the generating function
$$g(x) = (1 + x + x^2 + x^3 + \cdots + x^{25} + x^{26} + \cdots)^4,$$
if we rephrase the question in terms of distributing, from a large (or unlimited) number of pennies, 25 pennies among four children. [Whereas f(x) is a polynomial, g(x) is a power series in x.] Note that the terms $x^k$, for all $k \ge 26$, are never used. So why bother with them? Because there will be times when it is easier to compute with a power series than with a polynomial.
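Because the terms $x^k$ with $k \ge 26$ never contribute to the coefficient of $x^{25}$, a computation may simply discard them. The Python sketch below (the helper `mul_truncated` is a hypothetical name) multiplies coefficient lists while keeping only powers up to a chosen degree, and compares the result for Example 9.3 with the count $\binom{28}{3}$ of selections with repetition from Chapter 1.

```python
from math import comb


def mul_truncated(p, q, degree):
    """Product of two coefficient lists, keeping only powers up to `degree`."""
    result = [0] * (degree + 1)
    for i, a in enumerate(p[:degree + 1]):
        if a == 0:
            continue
        for j, b in enumerate(q[:degree + 1 - i]):
            result[i + j] += a * b
    return result


N = 25
child = [1] * (N + 1)          # 1 + x + x^2 + ... + x^25, one factor per child

f = [1]
for _ in range(4):             # four children
    f = mul_truncated(f, child, N)

print(f[N], comb(28, 3))       # 3276 3276
```

Truncating after every multiplication keeps each intermediate list at 26 entries, which is why the power series g(x) is no harder to handle than the polynomial f(x).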
EXERCISES 9.1
1. For each of the following, determine a generating function and indicate the coefficient in the function that is needed to solve the problem. (Give both the polynomial and power series forms of the generating function, wherever appropriate.)
Find the number of integer solutions for the following equations:
a) $c_1 + c_2 + c_3 + c_4 = 20$, $0 \le c_i \le 7$ for all $1 \le i \le 4$
b) $c_1 + c_2 + c_3 + c_4 = 20$, $0 \le c_i$ for all $1 \le i \le 4$, with $c_2$ and $c_3$ even
c) $c_1 + c_2 + c_3 + c_4 + c_5 = 30$, $2 \le c_1 \le 4$ and $3 \le c_i \le 8$ for all $2 \le i \le 5$
d) $c_1 + c_2 + c_3 + c_4 + c_5 = 30$, $0 \le c_i$ for all $1 \le i \le 5$, with $c_2$ even and $c_3$ odd
2. Determine the generating function for the number of ways to distribute 35 pennies (from an unlimited supply) among five children if (a) there are no restrictions; (b) each child gets at least 1¢; (c) each child gets at least 2¢; (d) the oldest child gets at least 10¢; and (e) the two youngest children must each get at least 10¢.
3. a) Find the generating function for the number of ways to select 10 candy bars from large supplies of six different kinds.
b) Find the generating function for the number of ways to select, with repetitions allowed, r objects from a collection of n distinct objects.
4. a) Explain why the generating function for the number of ways to have n cents in pennies and nickels is $(1 + x + x^2 + x^3 + \cdots)(1 + x^5 + x^{10} + \cdots)$.
b) Find the generating function for the number of ways to have n cents in pennies, nickels, and dimes.
5. Find the generating function for the number of integer solutions to the equation $c_1 + c_2 + c_3 + c_4 = 20$ where $-3 \le c_1$, $-3 \le c_2$, $-5 \le c_3 \le 5$, and $0 \le c_4$.
6. For S = {a, b, c}, consider the function
$$f(x) = (1 + ax)(1 + bx)(1 + cx) = 1 + ax + bx + cx + abx^2 + acx^2 + bcx^2 + abcx^3.$$
Here, in f(x)
• The coefficient of $x^0$ is 1, for the subset ∅ of S.
• The coefficient of $x^1$ is a + b + c, for the subsets {a}, {b}, and {c} of S.
• The coefficient of $x^2$ is ab + ac + bc, for the subsets {a, b}, {a, c}, and {b, c} of S.
• The coefficient of $x^3$ is abc, for the subset {a, b, c} = S.
Consequently, f(x) is the generating function for the subsets of S. For when we calculate f(1), we obtain a sum wherein each of the eight summands corresponds with a subset of S; the summand 1 corresponds with ∅. [If we go one step further and set a = b = c = 1 in f(x), then f(1) = 8, the number of subsets of S.]
a) Give the generating function for the subsets of S = {a, b, c, ..., r, s, t}.
b) Answer part (a) for selections wherein each element can be rejected or selected as many as three times.
9.2
Definition and Examples:
Calculational Techniques
In this section we shall examine a number of formulas and examples dealing with power
series. These will be used to obtain the coefficients of particular terms in a generating
function.
We start with the following concept.
Definition 9.1 Let $a_0, a_1, a_2, \ldots$ be a sequence of real numbers. The function
$$f(x) = a_0 + a_1x + a_2x^2 + \cdots = \sum_{i=0}^{\infty} a_ix^i$$
is called the generating function for the given sequence.
Where could this idea have come from?
EXAMPLE 9.4
For any $n \in \mathbf{Z}^+$,
$$(1 + x)^n = \binom{n}{0} + \binom{n}{1}x + \binom{n}{2}x^2 + \cdots + \binom{n}{n}x^n,$$
so $(1 + x)^n$ is the generating function for the sequence
$$\binom{n}{0}, \binom{n}{1}, \binom{n}{2}, \ldots, \binom{n}{n}, 0, 0, 0, \ldots.$$
EXAMPLE 9.5
a) For $n \in \mathbf{Z}^+$,
$$(1 - x^{n+1}) = (1 - x)(1 + x + x^2 + x^3 + \cdots + x^n).$$
So
$$\frac{1 - x^{n+1}}{1 - x} = 1 + x + x^2 + \cdots + x^n,$$
and $(1 - x^{n+1})/(1 - x)$ is the generating function for the sequence 1, 1, 1, ..., 1, 0, 0, 0, ..., where the first n + 1 terms are 1.
b) Extending the idea in part (a), we find that
$$1 = (1 - x)(1 + x + x^2 + x^3 + x^4 + \cdots),$$
so
$$\frac{1}{1 - x}$$
is the generating function for the sequence 1, 1, 1, 1, .... [Note that $1/(1 - x) = 1 + x + x^2 + x^3 + \cdots$ is valid for all real x where $|x| < 1$; it is for this set of values that the geometric series $1 + x + x^2 + x^3 + \cdots$ converges. In our work with generating functions we shall be primarily concerned with the coefficients of the powers of x. However, later in Example 9.18, we shall use this and two other related series to evaluate infinite sums for values within the set of values for which each such infinite series converges.]
c) With
$$\frac{1}{1 - x} = 1 + x + x^2 + x^3 + \cdots = \sum_{i=0}^{\infty} x^i,$$
taking the derivative yields
$$\frac{d}{dx}\,\frac{1}{1 - x} = \frac{d}{dx}(1 - x)^{-1} = (-1)(1 - x)^{-2}(-1) = \frac{1}{(1 - x)^2}$$
$$= \frac{d}{dx}(1 + x + x^2 + x^3 + \cdots) = 1 + 2x + 3x^2 + 4x^3 + \cdots.$$
Consequently,
$$\frac{1}{(1 - x)^2}$$
is the generating function for the sequence 1, 2, 3, 4, ..., while
$$\frac{x}{(1 - x)^2} = 0 + 1x + 2x^2 + 3x^3 + 4x^4 + \cdots$$
is the generating function for the sequence 0, 1, 2, 3, ....
d) Continuing from part (c),
$$\frac{d}{dx}\,\frac{x}{(1 - x)^2} = \frac{d}{dx}(0 + x + 2x^2 + 3x^3 + \cdots),$$
or
$$\frac{x + 1}{(1 - x)^3} = 1 + 2^2x + 3^2x^2 + 4^2x^3 + \cdots.$$
Hence,
$$\frac{x + 1}{(1 - x)^3}$$
generates $1^2, 2^2, 3^2, \ldots$, and
$$\frac{x(x + 1)}{(1 - x)^3}$$
generates $0^2, 1^2, 2^2, 3^2, \ldots$.
e) Now let us take one more look at the results in parts (b), (c), (d), along with some extensions. But this time we have a change in the notation:
$$f_0(x) = \frac{1}{1 - x} = 1 + x + x^2 + x^3 + \cdots$$
$$f_1(x) = x\frac{d}{dx}f_0(x) = \frac{x}{(1 - x)^2} = 0 + x + 2x^2 + 3x^3 + \cdots$$
$$f_2(x) = x\frac{d}{dx}f_1(x) = \frac{x^2 + x}{(1 - x)^3} = 0^2 + 1^2x + 2^2x^2 + 3^2x^3 + \cdots$$
$$f_3(x) = x\frac{d}{dx}f_2(x) = \frac{x^3 + 4x^2 + x}{(1 - x)^4} = 0^3 + 1^3x + 2^3x^2 + 3^3x^3 + \cdots$$
$$f_4(x) = x\frac{d}{dx}f_3(x) = \frac{x^4 + 11x^3 + 11x^2 + x}{(1 - x)^5} = 0^4 + 1^4x + 2^4x^2 + 3^4x^3 + \cdots$$
Now look at the output for the Maple code in Fig. 9.1. Here we find the numerators for $f_0(x), f_1(x), \ldots, f_4(x)$, along with those for $f_5(x)$ and $f_6(x)$ [where the denominators are $(1 - x)^6$ and $(1 - x)^7$, respectively]. The coefficients for these numerators are exactly the Eulerian numbers we introduced in Example 4.21. We choose not to pursue this here, but the interested reader, who wants to examine this connection further, should look into reference [4].
> f||0(x) := 1/(1-x);
                    f0(x) := 1/(1 - x)
> for i from 1 to 6 do
    f||i(x) := simplify(x*diff(f||(i-1)(x), x)):
    print(sort(expand((-1)^(i+1)*numer(f||i(x))))):
  od:
                    x
                    x^2 + x
                    x^3 + 4x^2 + x
                    x^4 + 11x^3 + 11x^2 + x
                    x^5 + 26x^4 + 66x^3 + 26x^2 + x
                    x^6 + 57x^5 + 302x^4 + 302x^3 + 57x^2 + x
Figure 9.1
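The numerators in Fig. 9.1 can also be generated without a computer algebra system. Writing $f_k(x) = P_k(x)/(1 - x)^{k+1}$ and applying the quotient rule to $x\,\frac{d}{dx}f_k(x)$ gives $P_{k+1}(x) = x[(1 - x)P_k'(x) + (k + 1)P_k(x)]$ with $P_0(x) = 1$. The Python sketch below implements that recurrence on coefficient lists; this recurrence and the function names are a derivation made here for illustration, not the method used in the Maple session.

```python
def next_numerator(p, k):
    """Given P_k as a low-to-high coefficient list, return P_{k+1}.

    Uses P_{k+1}(x) = x * [(1 - x) * P_k'(x) + (k + 1) * P_k(x)].
    """
    derivative = [i * c for i, c in enumerate(p)][1:]      # P_k'
    inner = [0] * (len(p) + 1)
    for i, c in enumerate(derivative):
        inner[i] += c              # contribution of P_k'
        inner[i + 1] -= c          # contribution of -x * P_k'
    for i, c in enumerate(p):
        inner[i] += (k + 1) * c    # contribution of (k + 1) * P_k
    result = [0] + inner           # multiply by x
    while len(result) > 1 and result[-1] == 0:
        result.pop()               # drop trailing zero coefficients
    return result


def numerators(count):
    """Return [P_0, P_1, ..., P_count]."""
    polys = [[1]]
    for k in range(count):
        polys.append(next_numerator(polys[-1], k))
    return polys


for p in numerators(6)[1:]:
    print(p)
# The last line printed is [0, 1, 57, 302, 302, 57, 1],
# i.e. x + 57x^2 + 302x^3 + 302x^4 + 57x^5 + x^6.
```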
EXAMPLE 9.6 a) Rewriting the result in part (b) of Example 9.5, we have
$$\frac{1}{1-y} = 1 + y + y^2 + y^3 + \cdots.$$
Upon substituting $2x$ for $y$, we then learn that
$$\frac{1}{1-2x} = 1 + (2x) + (2x)^2 + (2x)^3 + \cdots = 1 + 2x + 2^2x^2 + 2^3x^3 + \cdots,$$
so $1/(1-2x)$ is the generating function for the sequence $1\ (= 2^0)$, $2\ (= 2^1)$, $2^2$, $2^3, \ldots$. In fact, for each $a \in \mathbf{R}$, it follows that $1/(1-ax) = 1 + (ax) + (ax)^2 + (ax)^3 + \cdots = 1 + ax + a^2x^2 + a^3x^3 + \cdots$, so $1/(1-ax)$ is the generating function for the sequence $1\ (= a^0)$, $a\ (= a^1)$, $a^2$, $a^3, \ldots$. [Here we want $0^0 = 1$ for the case where $a = 0$.]
b) Again, from part (b) of Example 9.5, we know that the generating function for the sequence 1, 1, 1, 1, ... is $f(x) = 1/(1-x)$. Therefore the function
$$g(x) = f(x) - x^2 = \frac{1}{1-x} - x^2$$
is the generating function for the sequence 1, 1, 0, 1, 1, 1, ..., while the function
$$h(x) = f(x) + 2x^3 = \frac{1}{1-x} + 2x^3$$
generates the sequence 1, 1, 1, 3, 1, 1, ....
c) Finally, can we use the results of Example 9.5 to find a generating function for the sequence 0, 2, 6, 12, 20, 30, 42, ...?
Here we observe that
$$a_0 = 0 = 0^2 + 0, \quad a_1 = 2 = 1^2 + 1, \quad a_2 = 6 = 2^2 + 2, \quad a_3 = 12 = 3^2 + 3, \quad a_4 = 20 = 4^2 + 4, \ldots.$$
In general, we have $a_n = n^2 + n$, for each $n \ge 0$.
Using the results from parts (c) and (d) of Example 9.5, we now find that
$$\frac{x(x+1)}{(1-x)^3} + \frac{x}{(1-x)^2} = \frac{x(x+1) + x(1-x)}{(1-x)^3} = \frac{2x}{(1-x)^3}$$
is the generating function for the given sequence. (The solution here depends upon our ability to recognize each $a_n$ as the sum of $n^2$ and $n$. If we do not see this, we may be unable to answer the given question. Consequently, in Example 10.6 of the next chapter, we shall examine another technique to help us recognize the formula for $a_n$.)
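One way to check a claimed generating function is to expand it by long division and compare coefficients. The sketch below assumes polynomials are stored as coefficient lists (lowest degree first) and that the denominator has a nonzero constant term; the function name `series` is illustrative. It is applied to $2x/(1-x)^3$, using $(1-x)^3 = 1 - 3x + 3x^2 - x^3$.

```python
from fractions import Fraction


def series(num, den, terms):
    """First `terms` power-series coefficients of num(x)/den(x).

    `num` and `den` are coefficient lists, lowest degree first, with den[0] != 0.
    The coefficients c_n satisfy sum_k den[k] * c_{n-k} = num[n].
    """
    coeffs = []
    for n in range(terms):
        total = Fraction(num[n]) if n < len(num) else Fraction(0)
        for k in range(1, min(n, len(den) - 1) + 1):
            total -= den[k] * coeffs[n - k]
        coeffs.append(total / den[0])
    return coeffs


# 2x / (1 - x)^3 should generate a_n = n^2 + n.
expansion = series([0, 2], [1, -3, 3, -1], 7)
print([int(c) for c in expansion])
```

The same helper can be pointed at the other closed forms of Examples 9.5 and 9.6, for instance `series([0, 1, 1], [1, -3, 3, -1], 5)` for $x(x+1)/(1-x)^3$.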
For each $n \in \mathbf{Z}^+$, the binomial theorem tells us that $(1+x)^n = \binom{n}{0} + \binom{n}{1}x + \binom{n}{2}x^2 + \cdots + \binom{n}{n}x^n$. We want to extend this idea to cases where (a) $n < 0$ and (b) $n$ is not necessarily an integer.
With $n, r \in \mathbf{Z}^+$ and $n \ge r > 0$, we have
$$\binom{n}{r} = \frac{n!}{r!(n-r)!} = \frac{n(n-1)(n-2)\cdots(n-r+1)}{r!}.$$
If $n \in \mathbf{R}$, we use
$$\frac{n(n-1)(n-2)\cdots(n-r+1)}{r!}$$
as the definition of $\binom{n}{r}$.
Then, for example, if $n \in \mathbf{Z}^+$, we have
$$\binom{-n}{r} = \frac{(-n)(-n-1)(-n-2)\cdots(-n-r+1)}{r!} = \frac{(-1)^r\, n(n+1)(n+2)\cdots(n+r-1)}{r!} = \frac{(-1)^r (n+r-1)!}{(n-1)!\,r!} = (-1)^r\binom{n+r-1}{r}.$$
Finally, for each real number $n$, we define $\binom{n}{0} = 1$.
EXAMPLE 9.7 For $n \in \mathbf{Z}^+$, the Maclaurin series expansion for $(1+x)^{-n}$ is given by
$$(1+x)^{-n} = 1 + (-n)x + \frac{(-n)(-n-1)}{2!}x^2 + \frac{(-n)(-n-1)(-n-2)}{3!}x^3 + \cdots$$
$$= 1 + \sum_{r=1}^{\infty} \frac{(-n)(-n-1)(-n-2)\cdots(-n-r+1)}{r!}x^r = \sum_{r=0}^{\infty} (-1)^r\binom{n+r-1}{r}x^r.$$
Hence $(1+x)^{-n} = \binom{-n}{0} + \binom{-n}{1}x + \binom{-n}{2}x^2 + \cdots = \sum_{r=0}^{\infty}\binom{-n}{r}x^r$. This generalizes the binomial theorem of Chapter 1 and shows us that $(1+x)^{-n}$ is the generating function for the sequence $\binom{-n}{0}, \binom{-n}{1}, \binom{-n}{2}, \binom{-n}{3}, \ldots$.
EXAMPLE 9.8 Find the coefficient of $x^5$ in $(1-2x)^{-7}$.
With $y = -2x$, use the result in Example 9.7 to write $(1-2x)^{-7} = (1+y)^{-7} = \sum_{r=0}^{\infty}\binom{-7}{r}y^r = \sum_{r=0}^{\infty}\binom{-7}{r}(-2x)^r$. Consequently, the coefficient of $x^5$ is $\binom{-7}{5}(-2)^5 = (-1)^5\binom{7+5-1}{5}(-32) = (32)\binom{11}{5} = 14{,}784$.
EXAMPLE 9.9 For each real number $n$, the Maclaurin series expansion for $(1+x)^n$ is
$$1 + nx + \frac{n(n-1)}{2!}x^2 + \frac{n(n-1)(n-2)}{3!}x^3 + \cdots = 1 + \sum_{r=1}^{\infty}\frac{n(n-1)(n-2)\cdots(n-r+1)}{r!}x^r.$$
Therefore,
$$(1+3x)^{-1/3} = 1 + \sum_{r=1}^{\infty}\frac{(-1/3)(-4/3)(-7/3)\cdots((-3r+2)/3)}{r!}(3x)^r = 1 + \sum_{r=1}^{\infty}\frac{(-1)(-4)(-7)\cdots(-3r+2)}{r!}x^r,$$
and $(1+3x)^{-1/3}$ generates the sequence $1$, $-1$, $(-1)(-4)/2!$, $(-1)(-4)(-7)/3!$, ..., $(-1)(-4)(-7)\cdots(-3r+2)/r!$, ....
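The extended binomial coefficient used in Examples 9.7 through 9.9 is easy to evaluate exactly. The following sketch assumes rational arithmetic through Python's `fractions.Fraction`; the helper name `gen_binom` is illustrative, and a truly irrational $n$ would need a different representation.

```python
from fractions import Fraction
from math import comb, factorial


def gen_binom(n, r):
    """Extended binomial coefficient n(n-1)...(n-r+1)/r! for rational n, integer r >= 0."""
    product = Fraction(1)
    for j in range(r):
        product *= Fraction(n) - j
    return product / factorial(r)


# Example 9.8: coefficient of x^5 in (1 - 2x)^(-7).
print(gen_binom(-7, 5) * (-2) ** 5)

# Identity from the text: C(-n, r) = (-1)^r C(n + r - 1, r), here with n = 7, r = 5.
print(gen_binom(-7, 5), (-1) ** 5 * comb(7 + 5 - 1, 5))

# Example 9.9: the first few terms generated by (1 + 3x)^(-1/3).
print([gen_binom(Fraction(-1, 3), r) * 3 ** r for r in range(4)])
```

When $r = 0$ the loop body never runs, so the function returns 1, in agreement with the convention $\binom{n}{0} = 1$.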
EXAMPLE 9.10 Determine the coefficient of $x^{15}$ in $f(x) = (x^2 + x^3 + x^4 + \cdots)^4$.
Since $(x^2 + x^3 + x^4 + \cdots) = x^2(1 + x + x^2 + \cdots) = x^2/(1-x)$, the coefficient of $x^{15}$ in $f(x)$ is the coefficient of $x^{15}$ in $(x^2/(1-x))^4 = x^8/(1-x)^4$. Hence the coefficient sought is that of $x^7$ in $(1-x)^{-4}$, namely, $\binom{-4}{7}(-1)^7 = (-1)^7\binom{4+7-1}{7}(-1)^7 = \binom{10}{7} = 120$.
In general, for $n \in \mathbf{Z}^+$, the coefficient of $x^n$ in $f(x)$ is 0 when $0 \le n \le 7$. For all $n \ge 8$, the coefficient of $x^n$ in $f(x)$ is the coefficient of $x^{n-8}$ in $(1-x)^{-4}$, which is $\binom{-4}{n-8}(-1)^{n-8} = \binom{n-5}{n-8}$.
Before continuing, we collect the identities shown in Table 9.2 (on page 424) for future
reference.
The next two examples show how generating functions can be applied to derive some
of our earlier results.
EXAMPLE 9.11 In how many ways can we select, with repetitions allowed, $r$ objects from $n$ distinct objects?
For each of the $n$ distinct objects, the geometric series $1 + x + x^2 + x^3 + \cdots$ represents the possible choices for that object (namely none, one, two, ...). Considering all of the $n$ distinct objects, the generating function is
$$f(x) = (1 + x + x^2 + x^3 + \cdots)^n,$$
and the required answer is the coefficient of $x^r$ in $f(x)$. Now from identities 5 and 8 in Table 9.2 we have
$$(1 + x + x^2 + x^3 + \cdots)^n = \left(\frac{1}{1-x}\right)^n = \frac{1}{(1-x)^n} = \sum_{i=0}^{\infty}\binom{n+i-1}{i}x^i,$$
so the coefficient of $x^r$ is
$$\binom{n+r-1}{r},$$
the result we found in Chapter 1.
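The generating-function argument can be checked numerically. In the sketch below, multiplying a truncated power series by $1/(1-x)$ is carried out as a running (prefix) sum of its coefficients, so $(1 + x + x^2 + \cdots)^n$ is built by $n$ such passes; the function name `selections_coefficient` is illustrative.

```python
from math import comb


def selections_coefficient(n, r):
    """Coefficient of x^r in (1 + x + x^2 + ...)^n, computed up to degree r."""
    poly = [1] + [0] * r              # the constant series 1
    for _ in range(n):
        # Multiplying by 1/(1 - x) replaces each coefficient by a prefix sum.
        for k in range(1, r + 1):
            poly[k] += poly[k - 1]
    return poly[r]


for n, r in [(3, 4), (4, 6), (5, 10)]:
    print(n, r, selections_coefficient(n, r), comb(n + r - 1, r))
```

For each pair the two printed counts should agree, which is the content of identity 8 in Table 9.2.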
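Table 9.2
For all $m, n \in \mathbf{Z}^+$, $a \in \mathbf{R}$,
1) $(1+x)^n = \binom{n}{0} + \binom{n}{1}x + \binom{n}{2}x^2 + \cdots + \binom{n}{n}x^n$
2) $(1+ax)^n = \binom{n}{0} + \binom{n}{1}ax + \binom{n}{2}a^2x^2 + \cdots + \binom{n}{n}a^nx^n$
3) $(1+x^m)^n = \binom{n}{0} + \binom{n}{1}x^m + \binom{n}{2}x^{2m} + \cdots + \binom{n}{n}x^{nm}$
4) $(1-x^{n+1})/(1-x) = 1 + x + x^2 + \cdots + x^n$
5) $1/(1-x) = 1 + x + x^2 + x^3 + \cdots = \sum_{i=0}^{\infty} x^i$
6) $1/(1-ax) = 1 + (ax) + (ax)^2 + (ax)^3 + \cdots = \sum_{i=0}^{\infty}(ax)^i = \sum_{i=0}^{\infty}a^ix^i = 1 + ax + a^2x^2 + a^3x^3 + \cdots$
7) $1/(1+x)^n = \binom{-n}{0} + \binom{-n}{1}x + \binom{-n}{2}x^2 + \cdots = \sum_{i=0}^{\infty}\binom{-n}{i}x^i = 1 + (-1)\binom{n+1-1}{1}x + (-1)^2\binom{n+2-1}{2}x^2 + \cdots = \sum_{i=0}^{\infty}(-1)^i\binom{n+i-1}{i}x^i$
8) $1/(1-x)^n = \binom{-n}{0} + \binom{-n}{1}(-x) + \binom{-n}{2}(-x)^2 + \cdots = \sum_{i=0}^{\infty}\binom{-n}{i}(-x)^i = 1 + (-1)\binom{n+1-1}{1}(-x) + (-1)^2\binom{n+2-1}{2}(-x)^2 + \cdots = \sum_{i=0}^{\infty}\binom{n+i-1}{i}x^i$
If $f(x) = \sum_{i=0}^{\infty}a_ix^i$, $g(x) = \sum_{i=0}^{\infty}b_ix^i$, and $h(x) = f(x)g(x)$, then $h(x) = \sum_{i=0}^{\infty}c_ix^i$, where for all $k \ge 0$,
$$c_k = a_0b_k + a_1b_{k-1} + \cdots + a_{k-1}b_1 + a_kb_0 = \sum_{j=0}^{k}a_jb_{k-j}.$$

EXAMPLE 9.12 Once again we consider the problem of counting the compositions of a positive integer $n$, this time using generating functions.
Start with
$$\frac{x}{1-x} = x + x^2 + x^3 + x^4 + \cdots,$$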
where, for example, the coefficient of $x^4$ is 1, for the one-summand composition of 4, namely, 4. To obtain the number of compositions of $n$ where there are two summands, we need the coefficient of $x^n$ in $(x + x^2 + x^3 + x^4 + \cdots)^2 = [x/(1-x)]^2 = x^2/(1-x)^2$. Here, for instance, we obtain $x^4$ in $(x + x^2 + x^3 + x^4 + \cdots)^2$ from the products $x^1 \cdot x^3$, $x^2 \cdot x^2$, and $x^3 \cdot x^1$. So the coefficient of $x^4$ in $x^2/(1-x)^2$ is 3, for the three two-summand compositions $1+3$, $2+2$, and $3+1$ (of 4). Continuing with the three-summand compositions, we now examine $(x + x^2 + x^3 + x^4 + \cdots)^3 = [x/(1-x)]^3 = x^3/(1-x)^3$. Once again we look at the ways $x^4$ comes about, namely, from the products $x^1 \cdot x^1 \cdot x^2$, $x^1 \cdot x^2 \cdot x^1$, and $x^2 \cdot x^1 \cdot x^1$. So here the coefficient of $x^4$ is 3, which accounts for the compositions $1+1+2$, $1+2+1$, and $2+1+1$ (of 4). Finally, the coefficient of $x^4$ in $(x + x^2 + x^3 + x^4 + \cdots)^4 = [x/(1-x)]^4 = x^4/(1-x)^4$ is 1, for the one four-summand composition $1+1+1+1$ (of 4).
The results in the previous paragraph tell us that the coefficient of $x^4$ in $\sum_{i=1}^{4}[x/(1-x)]^i$ is $1 + 3 + 3 + 1 = 8\ (= 2^3)$, the number of compositions of 4. In fact, this is also the coefficient of $x^4$ in $\sum_{i=1}^{\infty}[x/(1-x)]^i$. Generalizing the situation, we find that the number of compositions of a positive integer $n$ is the coefficient of $x^n$ in the generating function $f(x) = \sum_{i=1}^{\infty}[x/(1-x)]^i$. But if we set $y = x/(1-x)$, it then follows that
$$f(x) = \sum_{i=1}^{\infty}y^i = y\sum_{i=0}^{\infty}y^i = y\left(\frac{1}{1-y}\right) = \left(\frac{x}{1-x}\right)\left(\frac{1}{1-\frac{x}{1-x}}\right) = \left(\frac{x}{1-x}\right)\left(\frac{1-x}{1-2x}\right)$$
$$= \frac{x}{1-2x} = x[1 + (2x) + (2x)^2 + (2x)^3 + \cdots] = 2^0x + 2^1x^2 + 2^2x^3 + 2^3x^4 + \cdots.$$
So the number of compositions of a positive integer $n$ is the coefficient of $x^n$ in $f(x)$, and this is $2^{n-1}$ (as we found earlier in Examples 1.37, 3.11, and 4.12).
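For small $n$ the count $2^{n-1}$, and the breakdown by number of summands, can be confirmed by brute force. This is a minimal sketch; the generator name `compositions` is illustrative, and the recursion simply chooses the first summand and then composes what is left.

```python
from collections import Counter


def compositions(n):
    """Yield every composition of n as a tuple of positive summands."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest


all_of_4 = list(compositions(4))
print(len(all_of_4))                          # total number of compositions of 4
print(Counter(len(c) for c in all_of_4))      # how many have 1, 2, 3, 4 summands
print([sum(1 for _ in compositions(n)) for n in range(1, 9)])
```

The counts by number of summands correspond to the coefficients of $x^4$ in $[x/(1-x)]^i$ for $i = 1, 2, 3, 4$.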
EXAMPLE 9.13 Before we look at any specific compositions, let us start by examining identity 4 in Table 9.2. When $x$ is replaced by 2 in this identity, the result tells us that for all $n \in \mathbf{Z}^+$, $1 + 2 + 2^2 + \cdots + 2^n = (1 - 2^{n+1})/(1-2) = 2^{n+1} - 1$. [This result was also established by the Principle of Mathematical Induction in part (a) of Exercise 2 for Section 4.1.] All well and good, but where would one ever use such a formula? In Table 9.3 we find the special compositions of 6 and 7 that read the same left to right as right to left. These are the palindromes of 6 and 7. We find that for 7 there are $1 + (1 + 2 + 4) = 1 + (1 + 2^1 + 2^2) = 1 + (2^3 - 1) = 2^3$ palindromes. There is one palindrome with one summand, namely, 7. There is also one palindrome where the center summand is 5 and where we place the one composition of 1 on either side of this summand.
Table 9.3
    Palindromes of 6            Count     Palindromes of 7              Count
    1) 6                         (1)      1) 7                           (1)
    2) 1+4+1                     (1)      2) 1+5+1                       (1)
    3) 2+2+2                     (2)      3) 2+3+2                       (2)
    4) 1+1+2+1+1                          4) 1+1+3+1+1
    5) 3+3                       (4)      5) 3+1+3                       (4)
    6) 1+2+2+1                            6) 1+2+1+2+1
    7) 2+1+1+2                            7) 2+1+1+1+2
    8) 1+1+1+1+1+1                        8) 1+1+1+1+1+1+1
For the center summand 3 we place one of the two compositions of 2 on the right (of 3) and then match it on the left, with the same composition, in reverse order. This procedure provides the third and fourth palindromes of 7 in the table. Finally, when the center summand is 1, we put a given composition of 3 on the right of this 1 and match it on the left with the same composition, in reverse order. There are $2^{3-1} = 4$ compositions of 3, so this procedure results in the last four palindromes of 7 in the table.
The situation is similar for the palindromes of 6 except for the case where, instead of 0 as the center summand, a plus sign appears in the center. Here we obtain the last $2^{3-1} = 4$ palindromes of 6 in the table, one for each composition of 3. Summarizing for $n = 6$ we have
    i)   Center summand 6            1 palindrome
    ii)  Center summand 4            1 (= 2^(1-1)) palindrome
    iii) Center summand 2            2 (= 2^(2-1)) palindromes
    iv)  Plus sign at the center     4 (= 2^(3-1)) palindromes
So there are $1 + (1 + 2^1 + 2^2) = 1 + (2^3 - 1) = 2^3$ palindromes for 6.
Now we look at the general situation. For $n = 1$ there is one palindrome. If $n = 2k + 1$, for $k \in \mathbf{Z}^+$, then there is one palindrome with center summand $n$. For $1 \le t \le k$, there are $2^{t-1}$ palindromes of $n$ with center summand $n - 2t$. (One palindrome for each of the $2^{t-1}$ compositions of $t$.) Hence the total number of palindromes of $n$ is $1 + (1 + 2^1 + 2^2 + \cdots + 2^{k-1}) = 1 + (2^k - 1) = 2^k = 2^{(n-1)/2}$. Now consider $n$ even, say $n = 2k$, for $k \in \mathbf{Z}^+$. Here there is also one palindrome with center summand $n$ and, for $1 \le s \le k - 1$, there are $2^{s-1}$ palindromes of $n$ with center summand $n - 2s$. (One palindrome for each of the $2^{s-1}$ compositions of $s$.) In addition, there are $2^{k-1}$ palindromes where a plus sign is at the center. (One palindrome for each of the $2^{k-1}$ compositions of $k$.) In total, $n$ has $1 + (1 + 2^1 + 2^2 + \cdots + 2^{k-2} + 2^{k-1}) = 1 + (2^k - 1) = 2^k = 2^{n/2}$ palindromes.
The preceding results can be simplified. Observe that for $n \in \mathbf{Z}^+$, $n$ has $2^{\lfloor n/2 \rfloor}$ palindromes.
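The count $2^{\lfloor n/2 \rfloor}$ can be checked by listing all compositions and keeping those that read the same in both directions. The sketch below is a brute-force check only, practical for small $n$; the helper names are illustrative.

```python
def compositions(n):
    """Yield every composition of n as a tuple of positive summands."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest


def palindromes(n):
    """Return the compositions of n that equal their own reversal."""
    return [c for c in compositions(n) if c == c[::-1]]


for n in (6, 7):
    found = palindromes(n)
    print(n, len(found), found)

print([len(palindromes(n)) for n in range(1, 11)])
print([2 ** (n // 2) for n in range(1, 11)])
```

The lists for 6 and 7 should contain the same compositions as Table 9.3, possibly in a different order.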
Having dealt with compositions (once again) and palindromes, we continue at this point
with some additional examples dealing with generating functions.
EXAMPLE 9.14 In how many ways can a police captain distribute 24 rifle shells to four police officers so that each officer gets at least three shells, but not more than eight?
The choices for the number of shells each officer receives are given by $x^3 + x^4 + \cdots + x^8$. There are four officers, so the resulting generating function is
$$f(x) = (x^3 + x^4 + \cdots + x^8)^4.$$
We seek the coefficient of $x^{24}$ in $f(x)$. With $(x^3 + x^4 + \cdots + x^8)^4 = x^{12}(1 + x + x^2 + \cdots + x^5)^4 = x^{12}((1-x^6)/(1-x))^4$, the answer is the coefficient of $x^{12}$ in
$$(1-x^6)^4(1-x)^{-4} = \left[1 - \binom{4}{1}x^6 + \binom{4}{2}x^{12} - \binom{4}{3}x^{18} + x^{24}\right]\left[\binom{-4}{0} + \binom{-4}{1}(-x) + \binom{-4}{2}(-x)^2 + \cdots\right],$$
which is $\left[\binom{-4}{12}(-1)^{12} - \binom{4}{1}\binom{-4}{6}(-1)^6 + \binom{4}{2}\binom{-4}{0}\right] = \left[\binom{15}{12} - \binom{4}{1}\binom{9}{6} + \binom{4}{2}\right] = 125$.
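Because $f(x)$ is a polynomial here, the coefficient can also be found by multiplying it out directly. The sketch below assumes coefficient lists with the lowest degree first; `poly_mul` is an illustrative helper, not something from the text.

```python
from math import comb


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


officer = [0, 0, 0, 1, 1, 1, 1, 1, 1]   # x^3 + x^4 + ... + x^8
f = [1]
for _ in range(4):                       # four officers
    f = poly_mul(f, officer)

print(f[24])                                         # coefficient of x^24
print(comb(15, 12) - comb(4, 1) * comb(9, 6) + comb(4, 2))
```

Both printed values should be 125, the second being the closed-form expression from the example.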
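EXAMPLE 9.15 Verify that for all $n \in \mathbf{Z}^+$, $\binom{2n}{n} = \sum_{i=0}^{n}\binom{n}{i}^2$.
Since $(1+x)^{2n} = [(1+x)^n]^2$, by comparison of coefficients (of like powers of $x$), the coefficient of $x^n$ in $(1+x)^{2n}$, which is $\binom{2n}{n}$, must equal the coefficient of $x^n$ in $\left[\binom{n}{0} + \binom{n}{1}x + \binom{n}{2}x^2 + \cdots + \binom{n}{n}x^n\right]^2$, and this is $\binom{n}{0}\binom{n}{n} + \binom{n}{1}\binom{n}{n-1} + \binom{n}{2}\binom{n}{n-2} + \cdots + \binom{n}{n}\binom{n}{0}$. With $\binom{n}{r} = \binom{n}{n-r}$, for all $0 \le r \le n$, the result follows.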
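EXAMPLE 9.16 Determine the coefficient of $x^8$ in
$$\frac{1}{(x-3)(x-2)^2}.$$
Since $1/(x-a) = (-1/a)(1/(1-(x/a))) = (-1/a)[1 + (x/a) + (x/a)^2 + \cdots]$ for any $a \ne 0$, we could solve this problem by finding the coefficient of $x^8$ in $1/[(x-3)(x-2)^2]$ expressed as $(-1/3)[1 + (x/3) + (x/3)^2 + \cdots](1/4)\left[\binom{-2}{0} + \binom{-2}{1}(-x/2) + \binom{-2}{2}(-x/2)^2 + \cdots\right]$.
An alternative technique uses the partial fraction decomposition:
$$\frac{1}{(x-3)(x-2)^2} = \frac{A}{x-3} + \frac{B}{x-2} + \frac{C}{(x-2)^2}.$$
This decomposition implies that
$$1 = A(x-2)^2 + B(x-2)(x-3) + C(x-3),$$
or
$$0 \cdot x^2 + 0 \cdot x + 1 = 1 = (A+B)x^2 + (-4A - 5B + C)x + (4A + 6B - 3C).$$
By comparing coefficients (for $x^2$, $x$, and 1, respectively), we find that $A + B = 0$, $-4A - 5B + C = 0$, and $4A + 6B - 3C = 1$. Solving these equations yields $A = 1$, $B = -1$, and $C = -1$. Hence
$$\frac{1}{(x-3)(x-2)^2} = \frac{1}{x-3} - \frac{1}{x-2} - \frac{1}{(x-2)^2}$$
$$= \left(\frac{-1}{3}\right)\frac{1}{1-(x/3)} + \left(\frac{1}{2}\right)\frac{1}{1-(x/2)} + \left(\frac{-1}{4}\right)\frac{1}{(1-(x/2))^2}$$
$$= \left(\frac{-1}{3}\right)\sum_{i=0}^{\infty}\left(\frac{x}{3}\right)^i + \left(\frac{1}{2}\right)\sum_{i=0}^{\infty}\left(\frac{x}{2}\right)^i + \left(\frac{-1}{4}\right)\left[\binom{-2}{0} + \binom{-2}{1}\left(\frac{-x}{2}\right) + \binom{-2}{2}\left(\frac{-x}{2}\right)^2 + \cdots\right].$$
The coefficient of $x^8$ is $(-1/3)(1/3)^8 + (1/2)(1/2)^8 + (-1/4)\binom{-2}{8}(-1/2)^8 = -[(1/3)^9 + 7(1/2)^{10}]$.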
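Both routes described in this example can be compared with exact rational arithmetic. The sketch below assumes the expansions $1/(x-3) = \sum_i -(1/3)^{i+1}x^i$ and $1/(x-2)^2 = \sum_i (1/4)(i+1)(1/2)^i x^i$, which follow from identities 6 and 8 in Table 9.2; the function names are illustrative.

```python
from fractions import Fraction


def coefficient_by_product(k):
    """Coefficient of x^k in 1/((x-3)(x-2)^2) via the product of two series."""
    # 1/(x - 3) = sum_i -(1/3)^(i+1) x^i
    first = [-(Fraction(1, 3) ** (i + 1)) for i in range(k + 1)]
    # 1/(x - 2)^2 = (1/4)(1 - x/2)^(-2) = sum_i (1/4)(i + 1)(1/2)^i x^i
    second = [Fraction(1, 4) * (i + 1) * Fraction(1, 2) ** i for i in range(k + 1)]
    return sum(first[i] * second[k - i] for i in range(k + 1))


def coefficient_by_partial_fractions(k):
    """Same coefficient from 1/(x-3) - 1/(x-2) - 1/(x-2)^2."""
    return (-(Fraction(1, 3) ** (k + 1))
            + Fraction(1, 2) ** (k + 1)
            - Fraction(1, 4) * (k + 1) * Fraction(1, 2) ** k)


print(coefficient_by_product(8))
print(coefficient_by_partial_fractions(8))
print(-(Fraction(1, 3) ** 9 + 7 * Fraction(1, 2) ** 10))
```

All three printed fractions should be identical, which confirms the values $A = 1$, $B = -1$, $C = -1$ found above.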
EXAMPLE 9.17 Use generating functions to determine how many four-element subsets of $S = \{1, 2, 3, \ldots, 15\}$ contain no consecutive integers.
a) Consider one such subset (say $\{1, 3, 7, 10\}$), and write $1 \le 1 < 3 < 7 < 10 \le 15$. We see that this set of inequalities determines the differences $1 - 1 = 0$, $3 - 1 = 2$, $7 - 3 = 4$, $10 - 7 = 3$, and $15 - 10 = 5$, and these differences sum to 14. Considering another such subset, say $\{2, 5, 11, 15\}$, we write $1 \le 2 < 5 < 11 < 15 \le 15$; these inequalities yield the differences 1, 3, 6, 4, and 0, which also sum to 14.
Turning things around, we find that the nonnegative integers 0, 2, 3, 2, and 7 sum to 14 and they are the differences that arise from the inequalities $1 \le 1 < 3 < 6 < 8 \le 15$ (for the subset $\{1, 3, 6, 8\}$).
These examples suggest a one-to-one correspondence between the four-element subsets to be counted and the integer solutions to $c_1 + c_2 + c_3 + c_4 + c_5 = 14$ where $0 \le c_1, c_5$, and $2 \le c_2, c_3, c_4$. (Note: Here $c_2, c_3, c_4 \ge 2$ guarantee that there are no consecutive integers in the subset.) The answer is the coefficient of $x^{14}$ in
$$f(x) = (1 + x + x^2 + x^3 + \cdots)(x^2 + x^3 + x^4 + \cdots)^3(1 + x + x^2 + x^3 + \cdots) = x^6(1-x)^{-5}.$$
This then is the coefficient of $x^8$ in $(1-x)^{-5}$, which is $\binom{-5}{8}(-1)^8 = \binom{5+8-1}{8} = \binom{12}{8} = 495$.
b) Another way to look at the problem is as follows.
For the subset $\{1, 3, 7, 10\}$, we examine the strict inequalities $0 < 1 < 3 < 7 < 10 < 16$ and consider how many integers there are strictly between each successive pair of these numbers. Here we get 0, 1, 3, 2, and 5: 0 because there is no integer between 0 and 1, 1 for the integer 2 between 1 and 3, 3 for the integers 4, 5, 6 between 3 and 7, and so on. These five integers sum to 11. When we do the same thing for the subset $\{2, 5, 11, 15\}$, the strict inequalities $0 < 2 < 5 < 11 < 15 < 16$ yield the results 1, 2, 5, 3, and 0, which also sum to 11.
On the other hand, we find that the nonnegative integers 0, 1, 2, 1, and 7 add up to 11 and they arise as the numbers of distinct integers between the integers in the five successive strict inequalities $0 < 1 < 3 < 6 < 8 < 16$. These correspond to the subset $\{1, 3, 6, 8\}$.
These results suggest a one-to-one correspondence between the desired subsets and the integer solutions to $b_1 + b_2 + b_3 + b_4 + b_5 = 11$, where $0 \le b_1, b_5$ and $1 \le b_2, b_3, b_4$. (Note: In this case, $b_2, b_3, b_4 \ge 1$ guarantee that there are no consecutive integers in the subset.) The number of these solutions is the coefficient of $x^{11}$ in
$$g(x) = (1 + x + x^2 + \cdots)(x + x^2 + x^3 + \cdots)^3(1 + x + x^2 + \cdots) = x^3(1-x)^{-5}.$$
The answer is $\binom{-5}{8}(-1)^8 = 495$, as above. (The reader may now wish to look back at Supplementary Exercise 15 in Chapter 3.)
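The answer can be confirmed by direct enumeration, which also makes the correspondence in part (a) concrete. This sketch uses `itertools.combinations`; the helper names `has_no_consecutive` and `differences` are illustrative.

```python
from itertools import combinations
from math import comb


def has_no_consecutive(subset):
    """True when no two elements of the sorted tuple differ by exactly 1."""
    return all(b - a >= 2 for a, b in zip(subset, subset[1:]))


def differences(subset, low=1, high=15):
    """Differences c_1, ..., c_5 from low <= s_1 < s_2 < s_3 < s_4 <= high."""
    points = (low,) + tuple(subset) + (high,)
    return [b - a for a, b in zip(points, points[1:])]


good = [s for s in combinations(range(1, 16), 4) if has_no_consecutive(s)]
print(len(good), comb(12, 8))
print(differences((1, 3, 7, 10)), differences((2, 5, 11, 15)))
```

Every subset in `good` yields five differences that sum to 14, with the three middle ones at least 2.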
Our next example takes us back to the optional material in Chapter 3 where we first encountered the idea of the sample space. But now that we know about generating functions we will be able to deal with a sample space that is discrete but not finite, that is, a countably infinite† sample space.
EXAMPLE 9.18‡ a) Suppose that Brianna takes an actuarial examination until she passes it. Further, suppose the probability that Brianna passes the examination on any given attempt is 0.8 and that the result of each attempt, after the first, is independent of any previous attempt. If we let P denote "pass" and F denote "fail", for any given attempt, then here our sample space may be expressed as $\mathcal{S} = \{P, FP, FFP, FFFP, \ldots\}$, where, for example, $Pr(FFP)$, the probability Brianna fails the exam twice before she passes it, is given by $(0.2)^2(0.8)$. In addition, the sum of the probabilities for the outcomes in $\mathcal{S}$ is
$$(0.8) + (0.2)(0.8) + (0.2)^2(0.8) + (0.2)^3(0.8) + \cdots = \sum_{i=0}^{\infty}(0.2)^i(0.8) = (0.8)\sum_{i=0}^{\infty}(0.2)^i = (0.8)\left(\frac{1}{1-0.2}\right) = (0.8)\left(\frac{1}{0.8}\right) = 1,$$
as it should be, for according to the second axiom of probability (in Section 3.5) we expect $Pr(\mathcal{S}) = 1$. [Note that $\sum_{i=0}^{\infty}(0.2)^i = \frac{1}{1-0.2}$ follows from the result in part (b) of Example 9.5. The given geometric series converges to $\frac{1}{1-0.2}$ because $|0.2| < 1$.]
b) Now suppose we want to know the probability Brianna passes the exam on an even-numbered attempt. That is, we want $Pr(A)$ where $A$ is the event $\{FP, FFFP, \ldots\}$. At this point let us introduce the discrete random variable $Y$ where $Y$ counts the number of attempts up to and including the one where Brianna passes the exam. Then the probability distribution for $Y$ is given by $Pr(Y = y) = (0.2)^{y-1}(0.8)$, $y \ge 1$. So $Pr(A)$ can be determined as follows:
$$Pr(A) = \sum_{i=1}^{\infty}Pr(Y = 2i) = \sum_{i=1}^{\infty}(0.2)^{2i-1}(0.8) = (0.8)\sum_{i=1}^{\infty}(0.2)^{2i-1} = 0.8[(0.2) + (0.2)^3 + (0.2)^5 + \cdots]$$
$$= (0.8)(0.2)[1 + (0.2)^2 + (0.2)^4 + \cdots] = (0.8)(0.2)\frac{1}{1-(0.2)^2} = \frac{0.16}{0.96} = \frac{1}{6}.$$
And once again we have used the result in part (b) of Example 9.5, this time with $x = (0.2)^2$, where $|(0.2)^2| = |0.04| < 1$.
†The reader can learn more about countably infinite sets from the material in Appendix 3.
‡This example uses material from the optional sections of Chapter 3. It may be skipped without any loss of continuity.
c) Continuing with $Y$, now we'd like to find $E(Y)$, the number of times Brianna expects to take the actuarial exam before she passes it. To determine $E(Y)$ we'll start with the formula $1/(1-t) = 1 + t + t^2 + t^3 + \cdots$ and go one step further. Taking the derivative of both sides, we find [as in Example 9.5(c)] that
$$\frac{1}{(1-t)^2} = (-1)(1-t)^{-2}(-1) = \frac{d}{dt}(1-t)^{-1} = \frac{d}{dt}\left[\frac{1}{1-t}\right] = \frac{d}{dt}\left[1 + t + t^2 + t^3 + \cdots\right] = 1 + 2t + 3t^2 + 4t^3 + \cdots,$$
where this series likewise converges† for $|t| < 1$. Therefore,
$$E(Y) = \sum_{y=1}^{\infty}y\,Pr(Y = y) = \sum_{y=1}^{\infty}y(0.2)^{y-1}(0.8) = (0.8)\sum_{y=1}^{\infty}y(0.2)^{y-1}$$
$$= (0.8)[1 + 2(0.2) + 3(0.2)^2 + 4(0.2)^3 + \cdots] = (0.8)\frac{1}{(1-0.2)^2} = \frac{1}{0.8} = \frac{5}{4}.$$
So Brianna expects to take the exam 1.25 times before she passes it.
d) Finally, to determine $Var(Y)$ we first want to find $E(Y^2)$. To do so we first multiply the result in part (c) by $t$ and find [as in Example 9.5(c)] that
$$\frac{t}{(1-t)^2} = t + 2t^2 + 3t^3 + 4t^4 + \cdots.$$
Differentiating both sides of this equation now gives us
$$\frac{(1-t)^2(1) - t(2)(1-t)(-1)}{(1-t)^4} = \frac{1+t}{(1-t)^3} = \frac{d}{dt}\left[\frac{t}{(1-t)^2}\right] = 1^2 + 2^2t + 3^2t^2 + 4^2t^3 + \cdots,$$
and this series is also convergent‡ for $|t| < 1$. So now we have
$$E(Y^2) = \sum_{y=1}^{\infty}y^2\,Pr(Y = y) = \sum_{y=1}^{\infty}y^2(0.2)^{y-1}(0.8) = (0.8)\sum_{y=1}^{\infty}y^2(0.2)^{y-1}$$
$$= (0.8)[1^2 + 2^2(0.2) + 3^2(0.2)^2 + 4^2(0.2)^3 + \cdots] = (0.8)\left[\frac{1+0.2}{(1-0.2)^3}\right] = \frac{1.2}{(0.8)^2} = \frac{15}{8}.$$
†Using the Ratio Test from calculus, one finds that
$$\lim_{n\to\infty}\left|\frac{(n+1)t^n}{nt^{n-1}}\right| = |t|\lim_{n\to\infty}\frac{n+1}{n} = |t|\lim_{n\to\infty}\left(1 + \frac{1}{n}\right) = |t|(1) = |t|.$$
When $t = \pm 1$, $\lim_{n\to\infty}nt^{n-1} \ne 0$, so the series does not converge for $t = \pm 1$. Consequently, this infinite series converges for $|t| < 1$.
‡Once again we use the Ratio Test from calculus. Here
$$\lim_{n\to\infty}\left|\frac{(n+1)^2t^n}{n^2t^{n-1}}\right| = |t|\lim_{n\to\infty}\frac{(n+1)^2}{n^2} = |t|\lim_{n\to\infty}\left(1 + \frac{1}{n}\right)^2 = |t|(1)^2 = |t|.$$
When $t = \pm 1$, $\lim_{n\to\infty}n^2t^{n-1} \ne 0$, so the series does not converge for $t = \pm 1$. Consequently, this infinite series converges for $|t| < 1$.
Consequently,
$$Var(Y) = E(Y^2) - [E(Y)]^2 = \frac{15}{8} - \left(\frac{5}{4}\right)^2 = \frac{5}{16}.$$
The preceding example introduced us to a new discrete random variable — namely, the
geometric random variable. In this situation we perform a Bernoulli trial until we are
successful (for the first time). As with the binomial random variable the outcome of each
trial, after the first, is independent of the outcome for any previous trial. Further, the proba-
bility of success for each Bernoulli trial is p, and the probability of failure is g = 1 — p.
If we let the random variable Y count the number of trials until we are finally successful,
then Y is a discrete random variable with probability distribution given by
$$Pr(Y = y) = q^{y-1}p, \quad y = 1, 2, 3, \ldots.$$
In addition, we find that
$$E(Y) = \frac{1}{p} \quad \text{and} \quad Var(Y) = \frac{q}{p^2}.$$
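These closed forms can be compared with partial sums of the defining series. The sketch below is a numerical check only: it assumes floating-point arithmetic and truncates each series after a fixed number of terms (the parameter `terms` and the name `geometric_summary` are illustrative choices), which is harmless here because $q^{y-1}$ shrinks geometrically.

```python
def geometric_summary(p, terms=500):
    """Approximate E(Y), Var(Y), and Pr(Y is even) for a geometric random variable."""
    q = 1.0 - p
    mean = second_moment = even = 0.0
    for y in range(1, terms + 1):
        prob = q ** (y - 1) * p          # Pr(Y = y)
        mean += y * prob
        second_moment += y * y * prob
        if y % 2 == 0:
            even += prob
    return mean, second_moment - mean ** 2, even


mean, variance, even = geometric_summary(0.8)
print(mean, 1 / 0.8)             # E(Y) against 1/p
print(variance, 0.2 / 0.8 ** 2)  # Var(Y) against q/p^2
print(even, 1 / 6)               # Pr(A) from Example 9.18(b)
```

With $p = 0.8$ the three pairs should agree to within rounding error, matching $5/4$, $5/16$, and $1/6$.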
The following example uses the last identity in Table 9.2. (This identity was used earlier in
Examples 9.14 and 9.15 — but rather implicitly.)
EXAMPLE 9.19 Let $f(x) = x/(1-x)^2$. This is the generating function for the sequence $a_0, a_1, a_2, \ldots$, where $a_k = k$ for all $k \in \mathbf{N}$. The function $g(x) = x(x+1)/(1-x)^3$ generates the sequence $b_0, b_1, b_2, \ldots$, for $b_k = k^2$, $k \in \mathbf{N}$.
The function $h(x) = f(x)g(x)$ consequently gives us $a_0b_0 + (a_0b_1 + a_1b_0)x + (a_0b_2 + a_1b_1 + a_2b_0)x^2 + \cdots$, so $h(x)$ is the generating function for the sequence $c_0, c_1, c_2, \ldots$, where for each $k \in \mathbf{N}$,
$$c_k = a_0b_k + a_1b_{k-1} + a_2b_{k-2} + \cdots + a_{k-2}b_2 + a_{k-1}b_1 + a_kb_0.$$
Here, for example, we find that
$$c_0 = 0 \cdot 0^2 = 0$$
$$c_1 = 0 \cdot 1^2 + 1 \cdot 0^2 = 0$$
$$c_2 = 0 \cdot 2^2 + 1 \cdot 1^2 + 2 \cdot 0^2 = 1$$
$$c_3 = 0 \cdot 3^2 + 1 \cdot 2^2 + 2 \cdot 1^2 + 3 \cdot 0^2 = 6$$
and, in general, $c_k = \sum_{i=0}^{k}i(k-i)^2$. (We shall simplify this summation formula in the Section Exercises.)
Whenever a sequence $c_0, c_1, c_2, \ldots$ arises from two generating functions $f(x)$ [for $a_0, a_1, a_2, \ldots$] and $g(x)$ [for $b_0, b_1, b_2, \ldots$], as in this example, the sequence $c_0, c_1, c_2, \ldots$ is called the convolution of the sequences $a_0, a_1, a_2, \ldots$ and $b_0, b_1, b_2, \ldots$.
Our last example provides one more instance of the convolution of sequences.
EXAMPLE 9.20 For $f(x) = 1/(1-x) = 1 + x + x^2 + x^3 + \cdots$ and $g(x) = 1/(1+x) = 1 - x + x^2 - x^3 + \cdots$, we find that
$$f(x)g(x) = \frac{1}{(1-x)(1+x)} = \frac{1}{1-x^2} = 1 + x^2 + x^4 + x^6 + \cdots.$$
Consequently, the sequence 1, 0, 1, 0, 1, 0, ... is the convolution of the sequences 1, 1, 1, 1, 1, 1, ... and 1, $-1$, 1, $-1$, 1, $-1$, ....
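The convolution formula $c_k = \sum_{j=0}^{k}a_jb_{k-j}$ translates directly into code. The sketch below works with finite prefixes of the sequences, so only the first $\min(\text{len}(a), \text{len}(b))$ terms of the convolution are returned; the function name `convolution` is illustrative.

```python
def convolution(a, b):
    """First terms of the convolution of two sequences given as finite prefixes."""
    length = min(len(a), len(b))
    return [sum(a[j] * b[k - j] for j in range(k + 1)) for k in range(length)]


# Example 9.19: a_k = k and b_k = k^2.
a = [k for k in range(6)]
b = [k * k for k in range(6)]
print(convolution(a, b))

# Example 9.20: 1, 1, 1, ... convolved with 1, -1, 1, -1, ...
ones = [1] * 6
alternating = [(-1) ** k for k in range(6)]
print(convolution(ones, alternating))
```

The first output begins 0, 0, 1, 6, as computed by hand in Example 9.19, and the second is the sequence 1, 0, 1, 0, 1, 0 of Example 9.20.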
8. Forn € Z*, find in (1 + x + x7)(1 + x)" the coefficient of
EXERCISES 9.2 (a) x’; (b) x®; and (c) x” forO <r <n+2,reZ.
1. Find generating functions for the following sequences. 9. Find the coefficient of x'> in each of the following.
[For example, in the case of the sequence 0, 1, 3, 9, 27,..., a) x3(i — 2x)!
the answer required is x/(1 — 3x), not }°™,, 3'x'*! or simply b) (3 — 5x)/(1 — x)?
O+x4+3x?
+ 9x3 4---,]
c) (1 +x)*/ — x)
a) (0). (3). @). ++.G) 10. In how many ways can two dozen identical robots be as-
b) (i), 2), 3G). ---. 8G) signed to four assembly lines with (a) at least three robots as-
ec) 1,-1,1,-1,1,—-1,... signed to each line? (b) at least three, but no more than nine,
d) 0, 0, 0, 6, —6, 6, —6, 6,... robots assigned to each line?
e) 1,0,1,0,1,0,1,... 11. Inhow many ways can 3000 identical envelopes be divided,
f) 0,0, 1,a,a*,a,...,a
40
in packages of 25, among four student groups so that each group
gets at least 150, but not more than 1000, of the envelopes?
2. Determine the sequence generated by each of the following
generating functions. 12. Two cases of soft drinks, 24 bottles of one type and 24 of an-
other, are distributed among five surveyors who are conducting
a) f(x) = Qx — 3) b) f(x) =x4/0 — x) taste tests. In how many ways can the 48 bottles be distributed
ce) f(x) =x°/ — x’) d) f(x) = 1/(. + 3x) so that each surveyor gets (a) at least two bottles of each type?
e) f(x) = 1/3 —-~x) (b) at least two bottles of one particular type and at least three
f) f(x) = 1/0 — x) + 3x? - 11 of the other?
3. In each of the following, the function f(x) is the generating 13. If a fair die is rolled 12 times, what is the probability that
function for the sequence ap, a), a2, ..., whereas the sequence the sum of the rolls is 30?
by, 6), bx, ... is generated by the function g(x). Express g(x) 14. Carol is collecting money from her cousins to have a party
in terms of f(x). for her aunt. If eight of the cousins promise to give $2, $3, $4,
a) b, = 3 or $5 each, and two others each give $5 or $10, what is the
by, =G,,n
EN, n #3 probability that Carol will collect exactly $40?
b) b; = 15. In how many ways can Traci select n marbles from a large
bo =7 supply of blue, red, and yellow marbles (all of the same size) if
b, =a,,n
EN, n #3,7 the selection must include an even number of blue ones?
c) b) = 1 16. How can Mary split up 12 hamburgers and 16 hot dogs
b, =3 among her sons Richard, Peter, Christopher, and James in such
b, = 2a,,nEN,n
41,3 a way that James gets at least one hamburger and three hot dogs,
d) b,) = 1
and each of his brothers gets at least two hamburgers but at most
five hot dogs?
b, =3
by =7 17. Verify that(1 — x — x? — x3 ~ x4 — x5 — x°)~! is the gen-
b, = 2a, +5,n€N,n#1,3,7 erating function for the number of ways the sum, where n €N,
4. Determine the constant (that is, the coefficient of x°) in can be obtained when a single die is rolled an arbitrary number
(3x? — (2/x)). of times.
5. a) Find the coefficient of x’ in 18. Show that (1 — 4x)~'/? generates the sequence (*"), n € N.
(txtx74x3 4... 19. a) If a computer generates a random composition of 8,
what is the probability the composition is a palindrome?
b) Find the coefficient of x’ in
b) Answer the question in part (a) after replacing 8 by n, a
(tx+x?+x°+--.)"forn eZ. fixed positive integer.
6. Find the coefficient of x°° in (x7 + x8 +x? +---)®. 20. a) How many palindromes of 11 start with 1? with 2? with
7, Find the coefficient of x7? in (x? + x3 +444 x54 x°P. 3? with 4?
b) How many palindromes of 12 start with 1? with 2? with c) Find the subset of S$ that determines the differences
3? with 4? a, b,c, d, and e, where 0 < a, e, and2 < b, c, d.
21. Let n be a (fixed) positive integer, with n > 2. If 1<t< 30. In how many ways can we select seven nonconsecutive
[n/2|, how many palindromes of n start with 7? integers from {1, 2, 3,..., SO}?
22. Let n € Z*, n odd. Can a palindrome of n have an even
31. Use the following summation formulas to simplify the ex-
number of summands?
pression for c, in Example 9.19:
23. Letn € Z*, n even. How many palindromes of n have an
even number of summands? How many have an odd number of k(k+ 1)
2 4
summands?
24. Determine the number of palindromes of n, where all sum- k k
k(k + 1)(2k + 1
mands are even, for (a) n = 10; (b) nm = 12; and (c) n even. ies pre-e and
25. Shay rolls a fair die until she gets a 6. If the random vari-
k : 2
able Y counts the number of times Shay rolls the die until she
yr=ypr=Seey
gets her first 6, determine (a) the probability distribution for Y;
(b) E(Y); and (c) ay.
32. a) Find the first four terms cy, c), C2, and c3 of the convo-
26. Referring back to the preceding exercise, what is the prob- lutions for each of the following pairs of sequences.
ability Shay rolls her first 6 on an even-numbered roll?
i) a, = 1, b, = 1, forallneN
27. Leroy has a biased coin where Pr(H) = 2 and Pr(T) = &. li) a, = 1, b, = 2", foralln eN
Assuming that each toss, after the first, is independent of any iii) dg = 4) = 4, =a3=1; a, = O, neNn,
previous outcome, if Leroy tosses the coin until he gets a tail,
n#0,1, 2,3; 5, = 1, forallaeN
what is the probability he tosses it an odd number of times?
b) Find a general formula for c,, in each of the results of
28. If Y is a geometric random variable with E(Y) = i,
part (a).
determine (a) Pr(Y = 3); (b) Pr(Y > 3); (c) Pr(¥ > 5);
(d) Pr(¥ > 5|Y > 3); (e) Pr(Y > 6|¥ > 4); and (f) oy. 33. Find a formula for the convolution of each of the following
pairs of sequences.
29. Consider part (a) of Example 9.17.
a) a, =1,0<n<4,a, =0,foralln>5;
a) Determine the differences for the inequalities that re-
b, =n, forallneN
sult from the subset {3, 6, 8, 15} of S, and verify that those
differences add to the correct sum. b) a, = (—1)", b, = (—1)", for alln EN
b) Find the subset of S$ that determines the differences 2,
2, 3, 7, and 0.
9.3
Partitions of Integers
In number theory, we are confronted with partitioning a positive integer $n$ into positive summands and seeking the number of such partitions, without regard to order. This number is denoted by $p(n)$. For example,
    p(1) = 1:   1
    p(2) = 2:   2 = 1 + 1
    p(3) = 3:   3 = 2 + 1 = 1 + 1 + 1
    p(4) = 5:   4 = 3 + 1 = 2 + 2 = 2 + 1 + 1 = 1 + 1 + 1 + 1
    p(5) = 7:   5 = 4 + 1 = 3 + 2 = 3 + 1 + 1 = 2 + 2 + 1
                  = 2 + 1 + 1 + 1 = 1 + 1 + 1 + 1 + 1
We should like to obtain $p(n)$ for a given $n$ without having to list all the partitions. We need a tool to keep track of the numbers of 1's, 2's, ..., $n$'s that are used as summands for $n$.
If $n \in \mathbf{Z}^+$, the number of 1's we can use is 0 or 1 or 2 or .... The power series $1 + x + x^2 + x^3 + x^4 + \cdots$ keeps account of this for us. In like manner, $1 + x^2 + x^4 + x^6 + \cdots$ keeps track of the number of 2's in the partition of $n$, while $1 + x^3 + x^6 + x^9 + \cdots$ accounts for the number of 3's. Therefore, in order to determine $p(10)$, for instance, we want the coefficient of $x^{10}$ in
$$f(x) = (1 + x + x^2 + x^3 + \cdots)(1 + x^2 + x^4 + x^6 + \cdots)(1 + x^3 + x^6 + x^9 + \cdots)\cdots(1 + x^{10} + x^{20} + \cdots)$$
or in
$$g(x) = (1 + x + x^2 + x^3 + \cdots + x^{10})(1 + x^2 + x^4 + \cdots + x^{10})(1 + x^3 + x^6 + x^9)\cdots(1 + x^{10}).$$
We prefer to work with $f(x)$, because it can be written in the more compact form
$$f(x) = \frac{1}{(1-x)}\frac{1}{(1-x^2)}\frac{1}{(1-x^3)}\cdots\frac{1}{(1-x^{10})} = \prod_{i=1}^{10}\frac{1}{(1-x^i)}.$$
If this product is extended beyond $i = 10$, we get $P(x) = \prod_{i=1}^{\infty}[1/(1-x^i)]$, which generates the sequence $p(0), p(1), p(2), p(3), \ldots$, where we define $p(0) = 1$.
Unfortunately, it is impossible to actually calculate the infinite number of terms in the product $P(x)$. If we consider only $\prod_{i=1}^{r}[1/(1-x^i)]$ for some fixed $r$, then the coefficient of $x^n$ here is the number of partitions of $n$ into summands that do not exceed $r$.
Despite the difficulty in calculating $p(n)$ from $P(x)$ for large values of $n$, the idea of the generating function will be useful in studying certain kinds of partitions.
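The truncated product $\prod_{i=1}^{r}[1/(1-x^i)]$ is straightforward to expand numerically. In the sketch below, multiplying a truncated series by $1/(1-x^i)$ is done in place by adding to each coefficient the one $i$ places earlier; the function name `partition_counts` and the optional `max_part` parameter are illustrative.

```python
def partition_counts(limit, max_part=None):
    """Coefficients of x^0..x^limit in the product of 1/(1 - x^i) for i = 1..r.

    With max_part=None, r = limit, which already gives p(0), ..., p(limit).
    """
    r = limit if max_part is None else max_part
    coeffs = [1] + [0] * limit
    for part in range(1, r + 1):
        # In-place multiplication by 1/(1 - x^part) = 1 + x^part + x^(2*part) + ...
        for n in range(part, limit + 1):
            coeffs[n] += coeffs[n - part]
    return coeffs


print(partition_counts(10))                  # p(0), p(1), ..., p(10)
print(partition_counts(6, max_part=3)[6])    # partitions of 6 into summands <= 3
```

The first list starts 1, 1, 2, 3, 5, 7, in agreement with the values of $p(1)$ through $p(5)$ listed at the start of this section.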
EXAMPLE 9.21 Find the generating function for the number of ways an advertising agent can purchase $n$ minutes ($n \in \mathbf{Z}^+$) of air time if time slots for commercials come in blocks of 30, 60, or 120 seconds.
Let 30 seconds represent one time unit. Then the answer is the number of integer solutions to the equation $a + 2b + 4c = 2n$ with $0 \le a, b, c$.
The associated generating function is
$$f(x) = (1 + x + x^2 + \cdots)(1 + x^2 + x^4 + \cdots)(1 + x^4 + x^8 + \cdots) = \frac{1}{1-x}\cdot\frac{1}{1-x^2}\cdot\frac{1}{1-x^4},$$
and the coefficient of $x^{2n}$ is the number of partitions of $2n$ into 1's, 2's, and 4's, the answer to the problem.
EXAMPLE 9.22 Find the generating function for $p_d(n)$, the number of partitions of a positive integer $n$ into distinct summands.
Before we start, let us consider the 11 partitions of 6:
     1) 1+1+1+1+1+1          2) 1+1+1+1+2
     3) 1+1+1+3              4) 1+1+4
     5) 1+1+2+2              6) 1+5
     7) 1+2+3                8) 2+2+2
     9) 2+4                 10) 3+3
    11) 6
Partitions (6), (7), (9), and (11) have distinct summands, so $p_d(6) = 4$.
In calculating $p_d(n)$, for each $k \in \mathbf{Z}^+$ there are two choices: Either $k$ is not used as one of the summands of $n$, or it is. This can be accounted for by the polynomial $1 + x^k$, and consequently, the generating function for these partitions is
$$P_d(x) = (1+x)(1+x^2)(1+x^3)\cdots = \prod_{i=1}^{\infty}(1+x^i).$$
For each $n \in \mathbf{Z}^+$, $p_d(n)$ is the coefficient of $x^n$ in $(1+x)(1+x^2)\cdots(1+x^n)$. [We define $p_d(0) = 1$.] When $n = 6$, the coefficient of $x^6$ in $(1+x)(1+x^2)\cdots(1+x^6)$ is 4.
EXAMPLE 9.23 Considering the partitions in Example 9.22, we see that there are four partitions of 6 into odd summands: namely, (1), (3), (6), and (10). We also have $p_d(6) = 4$. Is this a coincidence?
Let $p_o(n)$ denote the number of partitions of $n$ into odd summands, when $n \ge 1$. We define $p_o(0) = 1$. The generating function for the sequence $p_o(0), p_o(1), p_o(2), \ldots$ is given by
$$P_o(x) = (1 + x + x^2 + x^3 + \cdots)(1 + x^3 + x^6 + \cdots)(1 + x^5 + x^{10} + \cdots)(1 + x^7 + x^{14} + \cdots)\cdots = \frac{1}{1-x}\cdot\frac{1}{1-x^3}\cdot\frac{1}{1-x^5}\cdot\frac{1}{1-x^7}\cdots.$$
Now because
$$1 + x = \frac{1-x^2}{1-x}, \quad 1 + x^2 = \frac{1-x^4}{1-x^2}, \quad 1 + x^3 = \frac{1-x^6}{1-x^3}, \quad \ldots,$$
we have
$$P_d(x) = (1+x)(1+x^2)(1+x^3)(1+x^4)\cdots = \frac{1-x^2}{1-x}\cdot\frac{1-x^4}{1-x^2}\cdot\frac{1-x^6}{1-x^3}\cdot\frac{1-x^8}{1-x^4}\cdots = \frac{1}{1-x}\cdot\frac{1}{1-x^3}\cdots = P_o(x).$$
From the equality of the generating functions, $p_d(n) = p_o(n)$, for all $n \ge 0$.
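The equality $p_d(n) = p_o(n)$ can be spot-checked by expanding both products up to a chosen degree. This is a sketch with illustrative function names; the descending inner loop in `distinct_counts` is what restricts each summand to at most one use, since it multiplies by $1 + x^i$ rather than $1/(1-x^i)$.

```python
def distinct_counts(limit):
    """Coefficients of x^0..x^limit in the product of (1 + x^i), i = 1..limit."""
    coeffs = [1] + [0] * limit
    for part in range(1, limit + 1):
        # Descending order: each summand is used at most once.
        for n in range(limit, part - 1, -1):
            coeffs[n] += coeffs[n - part]
    return coeffs


def odd_counts(limit):
    """Coefficients of x^0..x^limit in the product of 1/(1 - x^i) over odd i."""
    coeffs = [1] + [0] * limit
    for part in range(1, limit + 1, 2):
        # Ascending order: each odd summand may repeat any number of times.
        for n in range(part, limit + 1):
            coeffs[n] += coeffs[n - part]
    return coeffs


print(distinct_counts(12))
print(odd_counts(12))
```

The two printed lists should coincide, and the entry at index 6 is the value 4 found in Examples 9.22 and 9.23.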
Once again we shall permit only odd summands, but in this example each such (odd)
EXAMPLE 9.24
summand must occur an odd number of times — or not at all. Here, for example, there is
one such partition of the integer 1— namely, 1— but there are no such partitions of the
integer 2. For the integer 3 we have two of these partitions: 3 and 1 + 1 + 1. When we
examine the possibilities for the integer 4, we find the one partition 3 + 1.
The generating function for the partitions described here is given by
f@)= (tx 40404 - JF ex? $xP $-- JU F 0 4x5 4x5 4..)--.
OO oO
_ I] 1+ So PH DeHD
k=0 i=0
The g generating g function is not g givenb y
(xtrPHrePte-jortxP tub $e. jad t xb 4x5 4...)... (x)
If it were, then the product could not contain any terms where x would appear to a finite
power. The situation given by equation (*) would occur if we were to believe that every
odd positive integer must appear as a summand at least once. And in such a “partition” the
number of summands and the sum itself would both be infinite. Consequently, whether or
not it is stated, we must realize that each odd summand may not appear at all — and this
condition is accounted for by the (first) summand, 1 = x, that appears in each factor of
9.3. Partitions of Integers 435
f (x). In fact, for all but a finite number of odd summands, this is the case. Of course, when
an odd summand does appear in a partition, it does so an odd number of times.
We close this section with an idea called the Ferrers graph. This graph uses rows of dots
to represent a partition of an integer where the number of dots per row does not increase as
we go from any row to the one below it.
In Fig. 9.2 we find the Ferrers graphs for two partitions of 14: (a) 4 + 3 + 3 + 2 + 1 + 1
and (b) 6+ 4+ 3+ 1. The graph in part (b) is said to be the transposition of the graph in
part (a), and vice versa, because one graph can be obtained from the other by interchanging
rows and columns.
(a)              (b)
•  •  •  •       •  •  •  •  •  •
•  •  •          •  •  •  •
•  •  •          •  •  •
•  •             •
•
•
Figure 9.2
These graphs often suggest results about partitions. Here we see a partition of 14 into
summands, where 4 is the largest summand, and a second partition of 14 into exactly
four summands. There is a one-to-one correspondence between a Ferrers graph and its
transposition, so this example demonstrates a particular instance of the general result: The
number of partitions of an integer n into m summands is equal to the number of partitions
of n into summands where m is the largest summand.
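The transposition of a Ferrers graph is easy to compute: column $c$ of the original graph contains one dot for every row longer than $c$. The sketch below assumes a partition is stored as a non-increasing list of positive integers; the name `transpose` is illustrative.

```python
def transpose(partition):
    """Return the partition whose Ferrers graph is the transposition
    of the graph of the given partition (a non-increasing list)."""
    if not partition:
        return []
    # Column c (0-based) has one dot for every row with more than c dots.
    return [sum(1 for row in partition if row > c) for c in range(partition[0])]


a = [4, 3, 3, 2, 1, 1]      # partition (a) of 14 in Fig. 9.2
b = transpose(a)            # should be partition (b)
print(b, sum(b))
print(len(a) == b[0], a[0] == len(b))
```

The last line illustrates the correspondence stated above: the number of summands of one partition is the largest summand of the other.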
EXERCISES 9.3
1. Find all partitions of 7.
2. Determine the generating function for the sequence $a_0, a_1, a_2, \ldots$, where $a_n$ is the number of partitions of the nonnegative integer n into (a) even summands; (b) distinct even summands; and (c) distinct odd summands.
3. In $f(x) = [1/(1 - x)][1/(1 - x^2)][1/(1 - x^3)]$, the coefficient of $x^6$ is 7. Interpret this result in terms of partitions of 6.
4. Find the generating function for the number of integer solutions of
a) $2w + 3x + 5y + 7z = n$, $0 \le w, x, y, z$
b) $2w + 3x + 5y + 7z = n$, $0 \le w$, $4 \le x, y$, $5 \le z$
5. Find the generating function for the number of partitions of the nonnegative integer n into summands where (a) each summand must appear an even number of times; and (b) each summand must be even.
6. What is the generating function for the number of partitions of $n \in \mathbf{N}$ into summands that (a) cannot occur more than five times; and (b) cannot exceed 12 and cannot occur more than five times?
7. Show that the number of partitions of a positive integer n where no summand appears more than twice equals the number of partitions of n where no summand is divisible by 3.
8. Show that the number of partitions of $n \in \mathbf{Z}^+$ where no summand is divisible by 4 equals the number of partitions of n where no even summand is repeated (although odd summands may or may not be repeated).
9. Using a Ferrers graph, show that the number of partitions of an integer n into summands not exceeding m is equal to the number of partitions of n into at most m summands.
10. Using a Ferrers graph, show that the number of partitions of n is equal to the number of partitions of 2n into n summands.
9.4
The Exponential Generating Function
The type of generating function we have been dealing with is often referred to as the ordinary
generating function for a given sequence. This function arose in selection problems, where
order was not relevant. However, turning now to problems of arrangement, where order is
crucial, we seek a comparable tool. To find such a tool, we return to the binomial theorem.
For each $n \in \mathbf{Z}^+$, $(1 + x)^n = \binom{n}{0} + \binom{n}{1}x + \binom{n}{2}x^2 + \cdots + \binom{n}{n}x^n$, so $(1 + x)^n$ is the (ordinary) generating function for the sequence $\binom{n}{0}, \binom{n}{1}, \binom{n}{2}, \ldots, \binom{n}{n}, 0, 0, \ldots$. When dealing with this idea in Chapter 1, we also wrote $\binom{n}{r} = C(n, r)$ when we wanted to emphasize that $\binom{n}{r}$ represented the number of combinations of n objects taken r at a time, with $0 \le r \le n$. Consequently, $(1 + x)^n$ generates the sequence $C(n, 0), C(n, 1), C(n, 2), \ldots, C(n, n), 0, 0, \ldots$.
Now for all $0 \le r \le n$,
\[ C(n, r) = \frac{n!}{r!\,(n-r)!} = \left(\frac{1}{r!}\right)P(n, r), \]
where $P(n, r)$ denotes the number of permutations of n objects taken r at a time. So
\[ (1 + x)^n = C(n, 0) + C(n, 1)x + C(n, 2)x^2 + C(n, 3)x^3 + \cdots + C(n, n)x^n \]
\[ = P(n, 0) + P(n, 1)x + P(n, 2)\frac{x^2}{2!} + P(n, 3)\frac{x^3}{3!} + \cdots + P(n, n)\frac{x^n}{n!}. \]
Hence, if in $(1 + x)^n$ we consider the coefficient of $x^r/r!$, with $0 \le r \le n$, we obtain
$P(n, r)$. On the basis of this observation, we have the following definition.
Definition 9.2 For a sequence $a_0, a_1, a_2, a_3, \ldots$ of real numbers,
\[ f(x) = a_0 + a_1x + a_2\frac{x^2}{2!} + a_3\frac{x^3}{3!} + \cdots = \sum_{i=0}^{\infty} a_i\frac{x^i}{i!} \]
is called the exponential generating function for the given sequence.
EXAMPLE 9.25
Examining the Maclaurin series expansion for $e^x$, we find
\[ e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots = \sum_{i=0}^{\infty}\frac{x^i}{i!}, \]
so $e^x$ is the exponential generating function for the sequence 1, 1, 1, .... (The function $e^x$
is the ordinary generating function for the sequence 1, 1, 1/2!, 1/3!, 1/4!, ....)
Our next example shows how this idea can help us count certain types of arrangements.
EXAMPLE 9.26
In how many ways can four of the letters in ENGINE be arranged?
In Table 9.4 we list the possible selections of size 4 from the letters E, N, G, I, N, E, along with the number of arrangements those four letters determine.

Table 9.4
E E N N    4!/(2! 2!)        E G N N    4!/2!
E E G N    4!/2!             E I N N    4!/2!
E E I N    4!/2!             G I N N    4!/2!
E E G I    4!/2!             E I G N    4!

We now obtain the answer by means of an exponential generating function. For the letter E we use $[1 + x + (x^2/2!)]$ because there are 0, 1, or 2 E's to arrange. Note that the coefficient of $x^2/2!$ is 1, the number of distinct ways to arrange (only) two E's. In like manner, we have $[1 + x + (x^2/2!)]$ for the arrangements of 0, 1, or 2 N's. The arrangements for each of the letters G and I are represented by $(1 + x)$.
Consequently, we find here that the exponential generating function is
\[ f(x) = [1 + x + (x^2/2!)]^2(1 + x)^2, \]
and we claim that the required answer is the coefficient of $x^4/4!$ in $f(x)$.
In order to motivate our claim, let us consider two of the eight ways in which the term $x^4/4!$ arises in the expansion of
\[ f(x) = [1 + x + (x^2/2!)][1 + x + (x^2/2!)](1 + x)(1 + x). \]
1) From the product $(x^2/2!)(x^2/2!)(1)(1)$, where $(x^2/2!)$ is taken from each of the first two factors (namely, $[1 + x + (x^2/2!)]$) and 1 is taken from each of the last two factors [namely, $(1 + x)$]. Then $(x^2/2!)(x^2/2!)(1)(1) = x^4/(2!\,2!) = (4!/(2!\,2!))(x^4/4!)$, and the coefficient of $x^4/4!$ is $4!/(2!\,2!)$, the number of ways one can arrange the four letters E, E, N, N.
2) From the product $(x^2/2!)(1)(x)(x)$, where $(x^2/2!)$ is taken from the first factor (namely, $[1 + x + (x^2/2!)]$), 1 is taken from the second factor (again, $[1 + x + (x^2/2!)]$), and x is taken from each of the last two factors [namely, $(1 + x)$]. Here $(x^2/2!)(1)(x)(x) = x^4/2! = (4!/2!)(x^4/4!)$, so the coefficient of $x^4/4!$ is $4!/2!$, the number of ways the four letters E, E, G, I can be arranged.
In the complete expansion of $f(x)$, the term involving $x^4$ [and, consequently, $x^4/4!$] is
\[ \left(\frac{x^4}{2!\,2!} + \frac{x^4}{2!} + \frac{x^4}{2!} + \frac{x^4}{2!} + \frac{x^4}{2!} + \frac{x^4}{2!} + \frac{x^4}{2!} + x^4\right) = \left[\frac{4!}{2!\,2!} + \frac{4!}{2!} + \frac{4!}{2!} + \frac{4!}{2!} + \frac{4!}{2!} + \frac{4!}{2!} + \frac{4!}{2!} + 4!\right]\frac{x^4}{4!}, \]
where the coefficient of $x^4/4!$ is the answer (102 arrangements) produced by the eight results in the table.
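The count of 102 can be checked in two independent ways: by expanding the exponential generating function with exact rational arithmetic, and by brute-force enumeration. This is a sketch under the assumption that a polynomial is stored as a list of coefficients of $x^0, x^1, \ldots$; the helper `egf_product` is an illustrative name.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def egf_product(factors, degree):
    """Multiply polynomials (coefficient lists), truncating at x^degree."""
    result = [Fraction(1)] + [Fraction(0)] * degree
    for f in factors:
        new = [Fraction(0)] * (degree + 1)
        for i, a in enumerate(result):
            for j, b in enumerate(f):
                if i + j <= degree:
                    new[i + j] += a * b
        result = new
    return result


two_copies = [Fraction(1), Fraction(1), Fraction(1, 2)]  # 1 + x + x^2/2!  (E, N)
one_copy = [Fraction(1), Fraction(1)]                    # 1 + x           (G, I)

coeffs = egf_product([two_copies, two_copies, one_copy, one_copy], 4)
via_egf = coeffs[4] * factorial(4)        # coefficient of x^4/4!

# Independent check: distinct 4-letter arrangements drawn from ENGINE.
brute = len(set(permutations("ENGINE", 4)))
print(via_egf, brute)
```

Multiplying the coefficient of $x^4$ by $4!$ is what converts the ordinary coefficient into the coefficient of $x^4/4!$.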
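EXAMPLE 9.27
Consider the Maclaurin series expansions of $e^x$ and $e^{-x}$.
\[ e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots, \qquad e^{-x} = 1 - x + \frac{x^2}{2!} - \frac{x^3}{3!} + \frac{x^4}{4!} - \cdots \]
Adding these series together, we find that
\[ e^x + e^{-x} = 2\left(1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \cdots\right), \]
or
\[ \frac{e^x + e^{-x}}{2} = 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \cdots. \]
Subtracting $e^{-x}$ from $e^x$ yields
\[ \frac{e^x - e^{-x}}{2} = x + \frac{x^3}{3!} + \frac{x^5}{5!} + \cdots. \]
These results now help us in the following.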
EXAMPLE 9.28
A ship carries 48 flags, 12 each of the colors red, white, blue, and black. Twelve of these flags are placed on a vertical pole in order to communicate a signal to other ships.
a) How many of these signals use an even number of blue flags and an odd number of black flags?
The exponential generating function
\[ f(x) = \left(1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots\right)^2\left(1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \cdots\right)\left(x + \frac{x^3}{3!} + \frac{x^5}{5!} + \cdots\right) \]
considers all such signals made up of n flags, where $n \ge 1$. The last two factors in $f(x)$ restrict the signals to an even number of blue flags and an odd number of black flags, respectively.
Since
\[ f(x) = (e^x)^2\left(\frac{e^x + e^{-x}}{2}\right)\left(\frac{e^x - e^{-x}}{2}\right) = \left(\frac{1}{4}\right)(e^{2x})(e^{2x} - e^{-2x}) = \frac{1}{4}(e^{4x} - 1) = \frac{1}{4}\left(\sum_{i=0}^{\infty}\frac{(4x)^i}{i!} - 1\right) = \left(\frac{1}{4}\right)\sum_{i=1}^{\infty}\frac{(4x)^i}{i!}, \]
the coefficient of $x^{12}/12!$ in $f(x)$ yields $(1/4)(4^{12}) = 4^{11}$ signals made up of 12 flags with an even number of blue flags and an odd number of black flags.
b) How many of the signals have at least three white flags or no white flags at all? In this situation we use the exponential generating function
\[ g(x) = \left(1 + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots\right)\left(1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots\right)^3 = \left(e^x - x - \frac{x^2}{2!}\right)(e^x)^3 = e^{4x} - xe^{3x} - \left(\frac{1}{2}\right)x^2e^{3x} \]
\[ = \sum_{i=0}^{\infty}\frac{(4x)^i}{i!} - x\sum_{i=0}^{\infty}\frac{(3x)^i}{i!} - \left(\frac{x^2}{2}\right)\sum_{i=0}^{\infty}\frac{(3x)^i}{i!}. \]
Here the factor $\left(1 + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots\right) = e^x - x - \frac{x^2}{2!}$ in $g(x)$ restricts the signals to those that contain three or more of the 12 white flags, or none at all. The answer for the number of signals sought here is the coefficient of $x^{12}/12!$ in $g(x)$. As we consider each summand (involving an infinite summation), we find:
i) $\sum_{i=0}^{\infty}\frac{(4x)^i}{i!}$: Here we have the term $\frac{(4x)^{12}}{12!} = 4^{12}\left(\frac{x^{12}}{12!}\right)$, so the coefficient of $x^{12}/12!$ is $4^{12}$.
ii) $x\left(\sum_{i=0}^{\infty}\frac{(3x)^i}{i!}\right)$: Now we see that in order to get $x^{12}/12!$ we need to consider the term $x[(3x)^{11}/11!] = 3^{11}(x^{12}/11!) = (12)(3^{11})(x^{12}/12!)$, and here the coefficient of $x^{12}/12!$ is $(12)(3^{11})$; and
iii) $(x^2/2)\left(\sum_{i=0}^{\infty}\frac{(3x)^i}{i!}\right)$: For this last summand we observe that $(x^2/2)[(3x)^{10}/10!] = (1/2)(3^{10})(x^{12}/10!) = (1/2)(12)(11)(3^{10})(x^{12}/12!)$, where this time the coefficient of $x^{12}/12!$ is $(1/2)(12)(11)(3^{10})$.
Consequently, the number of 12-flag signals with at least three white flags, or none at all, is
\[ 4^{12} - 12(3^{11}) - (1/2)(12)(11)(3^{10}) = 10{,}754{,}218. \]
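Both answers in this example can be reproduced by multiplying truncated exponential series directly, without passing through the closed forms in $e^x$. The sketch below assumes each color's factor is stored as the list of coefficients $1/k!$ for the allowed counts $k$ (and 0 otherwise); the names `series`, `multiply`, and `count` are illustrative.

```python
from fractions import Fraction
from math import factorial

N = 12  # number of flags on the pole


def series(allowed):
    """Coefficients of the sum of x^k/k! over k in 0..N with allowed(k) true."""
    return [Fraction(1, factorial(k)) if allowed(k) else Fraction(0)
            for k in range(N + 1)]


def multiply(a, b):
    """Product of two coefficient lists, truncated at x^N."""
    out = [Fraction(0)] * (N + 1)
    for i, ai in enumerate(a):
        for j in range(N + 1 - i):
            out[i + j] += ai * b[j]
    return out


def count(*factors):
    """Coefficient of x^N/N! in the product of the given factors."""
    prod = series(lambda k: k == 0)          # the constant series 1
    for f in factors:
        prod = multiply(prod, f)
    return prod[N] * factorial(N)


any_number = series(lambda k: True)
even = series(lambda k: k % 2 == 0)
odd = series(lambda k: k % 2 == 1)
none_or_three_plus = series(lambda k: k == 0 or k >= 3)

part_a = count(any_number, any_number, even, odd)                      # red, white, blue, black
part_b = count(none_or_three_plus, any_number, any_number, any_number)  # white, then the rest
print(part_a, part_b)
```

Using `Fraction` keeps the arithmetic exact, so the final multiplication by $12!$ returns an integer-valued result.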
Our final example is reminiscent of past results.
EXAMPLE 9.29
A company hires 11 new employees, each of whom is to be assigned to one of four subdivisions. Each subdivision will get at least one new employee. In how many ways can these assignments be made?
Calling the subdivisions A, B, C, and D, we can equivalently count the number of 11-letter sequences in which there is at least one occurrence of each of the letters A, B, C, and D. The exponential generating function for these arrangements is
\[ f(x) = \left(x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots\right)^4 = (e^x - 1)^4 = e^{4x} - 4e^{3x} + 6e^{2x} - 4e^x + 1. \]
The answer then is the coefficient of $x^{11}/11!$ in $f(x)$:
\[ 4^{11} - 4(3^{11}) + 6(2^{11}) - 4(1^{11}) = \sum_{i=0}^{4}(-1)^i\binom{4}{i}(4 - i)^{11}. \]
This form of the answer should bring to mind some of the enumeration problems in Chapter 5. Once the vocabulary is set aside, we are counting the number of onto functions $g: X \to Y$ where $|X| = 11$, $|Y| = 4$.
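The alternating sum is straightforward to evaluate, and for small cases it can be compared with a direct enumeration of onto functions. A minimal sketch follows; the function names `onto_count` and `onto_brute` are illustrative, and the brute-force check is run only on a small case because it enumerates all $n^m$ functions.

```python
from itertools import product
from math import comb


def onto_count(m, n):
    """Number of onto functions from an m-set to an n-set, via the
    alternating sum of (-1)^i * C(n, i) * (n - i)^m for i = 0..n."""
    return sum((-1) ** i * comb(n, i) * (n - i) ** m for i in range(n + 1))


def onto_brute(m, n):
    """Same count by listing every function and keeping the onto ones."""
    return sum(1 for f in product(range(n), repeat=m) if len(set(f)) == n)


print(onto_count(11, 4))                 # assignments of 11 employees to 4 subdivisions
print(onto_count(5, 3), onto_brute(5, 3))
```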
EXERCISES 9.4: a) F(x) = 3e"
b) f(x) = 6e* — 3e”
1. Find the exponential generating function for each of the c) ff) =e +x?
following sequences. d) f(x) =e —3x3 45x? 47x
a) 1, —1,1, —1, 1, —1,... e) f(x) = 1/d —x)
b) 1, 2, 27, 2°, 24... f) f(x) =3/U-2x)
+e
c) 1,-a,a*,—a*,a*,..., aeR 3. In each of the following, the function f(x) is the expo-
d) 1, a7, a*,a°,..., acR nential generating function for the sequence do, a), a2,...,
e) a,a3,a5,a’,..., aéeR whereas the function g(x) is the exponential generating func-
5 7 tion for the sequence by, b,, b2,.... Express g(x) in terms of
f) 0, 1, 2(2), 3(2°), 4(2”),... ;
f(x) if
2. Determine the sequence generated by each of the following a) b3 = 3
exponential generating functions. b, =a,,n EN, an #3
b) a, =5",nEN ii) MISSISSIPPI
b,
= —- iii) ISOMORPHISM
b, =a,,nEN,n
#3 b) For section (ii) of part (a), what is the exponential gener-
c) b, =2 ating function if the arrangement must contain at least two
by =4 Ps?
b, = 2a,,n€N,n#1,2 7. Say the company in Example 9.29 hires 25 new employ-
d) b; =2 ees. Give the exponential generating function for the number
by =4 of ways to assign these people to the four subdivisions so that
b, =8 each subdivision receives at least 3, but no more than 10, new
b, = 2a, +3,nEN,n #1, 2,3 people.
4. a) For the ship in Example 9.28, how many signals use at 8. Given the sequences dp, a), @,... and bo, bj, bo, ...,
least one flag of each color? (Solve this with an exponential with exponential generating functions f(x), g(x), respectively,
generating function.) show that if A(x) = f(x)g(x), then A(x) is the exponential
generating function of the sequence Cp, ¢€), C2, ..., wherec, =
b) Restate part (a) in an alternative way that uses the con-
ro (Jai ba+, for each n > 0.
cept of an onto function.
9. If a 20-digit ternary (0, 1, 2) sequence is randomly gener-
c) How many signals are there in Example 9.28, where the
total number of blue and black flags is even? ated, what is the probability that: (a) It has an even number of
1’s? (b) It has an even number of 1’s and an even number of
5. Find the exponential generating function for the sequence 2’s? (c) It has an odd number of 0’s? (d) The total number of 0’s
O!, 1! 2!, 3h... and 1’s is odd? (e) The total number of 0’s and |’s is even?
6. a) Find the exponential generating function for the number 10. How many 20-digit quaternary (0, 1, 2, 3) sequences are
of ways to arrange n letters, n > 0, selected from each of there where: (a) There is at least one 2 and an odd number of
the following words. 0’s? (b) No symbol occurs exactly twice? (c) No symbol occurs
i) HAWAII exactly three times? (d) There are exactly two 3’s or none at all?
9.5
The Summation Operator
This final section introduces a technique that helps us go from the (ordinary) generating function for the sequence $a_0, a_1, a_2, \ldots$ to the generating function for the sequence $a_0, a_0 + a_1, a_0 + a_1 + a_2, \ldots$.
For $f(x) = a_0 + a_1x + a_2x^2 + a_3x^3 + \cdots$, consider the function $f(x)/(1 - x)$:
\[ \frac{f(x)}{1 - x} = f(x)\cdot\frac{1}{1 - x} = [a_0 + a_1x + a_2x^2 + a_3x^3 + \cdots][1 + x + x^2 + x^3 + \cdots] \]
\[ = a_0 + (a_0 + a_1)x + (a_0 + a_1 + a_2)x^2 + (a_0 + a_1 + a_2 + a_3)x^3 + \cdots, \]
so $f(x)/(1 - x)$ generates the sequence of sums $a_0, a_0 + a_1, a_0 + a_1 + a_2, a_0 + a_1 + a_2 + a_3, \ldots$. This is why we refer to $1/(1 - x)$ as the summation operator. Furthermore we see that the sequence $a_0, a_0 + a_1, a_0 + a_1 + a_2, a_0 + a_1 + a_2 + a_3, \ldots$ is the convolution of the sequence $a_0, a_1, a_2, \ldots$ and the sequence $b_0, b_1, b_2, \ldots$, where $b_n = 1$ for all $n \in \mathbf{N}$.
We find this technique handy in the following examples.
EXAMPLE 9.30
a) We know from part (b) of Example 9.5 that $1/(1 - x)$ is the generating function for the sequence 1, 1, 1, .... Consequently, upon applying the summation operator, $1/(1 - x)$, we see that $(1/(1 - x))(1/(1 - x))$ is the generating function for the sequence 1, 1 + 1, 1 + 1 + 1, ...; that is, $1/(1 - x)^2$ is the generating function for the sequence 1, 2, 3, ..., as we found in part (c) of Example 9.5.
b) Now let us start with the polynomial $x + x^2$, the generating function for the sequence 0, 1, 1, 0, 0, 0, .... Applying the summation operator, we have $(x + x^2)(1/(1 - x)) = (x + x^2)/(1 - x)$, the generating function for the sequence 0, 0 + 1, 0 + 1 + 1, 0 + 1 + 1 + 0, ...; that is, the sequence 0, 1, 2, 2, .... A second application of the summation operator tells us that $(x + x^2)/(1 - x)^2$ is the generating function for the sequence 0, 0 + 1, 0 + 1 + 2, 0 + 1 + 2 + 2, ...; that is, the sequence 0, 1, 3, 5, .... A final application of the summation operator tells us that $(x + x^2)/(1 - x)^3$ is the generating function for the sequence 0, 0 + 1, 0 + 1 + 3, 0 + 1 + 3 + 5, ...; that is, the sequence 0, 1, 4, 9, .... This suggests that, for $n \ge 1$, $\sum_{k=1}^{n}(2k - 1) = n^2$. To verify this suggestion, we look at the coefficient of $x^n$ in $(x + x^2)/(1 - x)^3 = x(1 - x)^{-3} + x^2(1 - x)^{-3}$. The coefficient of $x^{n-1}$ in $(1 - x)^{-3}$ [which is the coefficient of $x^n$ in $x(1 - x)^{-3}$] is
\[ \binom{-3}{n-1}(-1)^{n-1} = (-1)^{n-1}\binom{3 + (n-1) - 1}{n-1}(-1)^{n-1} = \binom{n+1}{n-1} = \binom{n+1}{2} = \tfrac{1}{2}(n + 1)(n). \]
The coefficient of $x^{n-2}$ in $(1 - x)^{-3}$ [which is the coefficient of $x^n$ in $x^2(1 - x)^{-3}$] is
\[ \binom{-3}{n-2}(-1)^{n-2} = (-1)^{n-2}\binom{3 + (n-2) - 1}{n-2}(-1)^{n-2} = \binom{n}{n-2} = \binom{n}{2} = \tfrac{1}{2}(n)(n - 1). \]
Consequently, for $n \ge 1$, $\sum_{k=1}^{n}(2k - 1)$ = the coefficient of $x^n$ in $(x + x^2)/(1 - x)^3$ = $\tfrac{1}{2}(n + 1)(n) + \tfrac{1}{2}(n)(n - 1) = \tfrac{1}{2}(n)[(n + 1) + (n - 1)] = n^2$, as we learned earlier in Example 4.7, using the Principle of Mathematical Induction.
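On coefficient lists, the summation operator is just multiplication by $1 + x + x^2 + \cdots$, that is, a convolution with the all-ones sequence. The sketch below replays part (b) on the first six coefficients; `apply_summation_operator` is an illustrative name, and all series are truncated to the length of the input.

```python
def apply_summation_operator(coeffs):
    """Coefficients of f(x) * 1/(1 - x), truncated to len(coeffs) terms.

    Written as an explicit convolution with b_n = 1 to mirror the text;
    the result is the sequence of partial sums of coeffs."""
    n = len(coeffs)
    ones = [1] * n
    return [sum(coeffs[i] * ones[k - i] for i in range(k + 1)) for k in range(n)]


start = [0, 1, 1, 0, 0, 0]                # x + x^2
once = apply_summation_operator(start)    # (x + x^2)/(1 - x)
twice = apply_summation_operator(once)    # (x + x^2)/(1 - x)^2
thrice = apply_summation_operator(twice)  # (x + x^2)/(1 - x)^3
print(once)
print(twice)
print(thrice)
```

The third list consists of perfect squares, which is the pattern the example then proves for all n.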
Our last example provides us with a method for deriving some of the summation formulas we encountered in earlier chapters.
EXAMPLE 9.31
Find a formula to express $0^2 + 1^2 + 2^2 + \cdots + n^2$ as a function of n.
As in Section 9.2, we start with $g(x) = 1/(1 - x) = 1 + x + x^2 + \cdots$. Then
\[ \frac{dg(x)}{dx} = \frac{1}{(1 - x)^2} = 1 + 2x + 3x^2 + 4x^3 + \cdots, \]
so $x/(1 - x)^2$ is the generating function for 0, 1, 2, 3, 4, .... Repeating this technique, we find that
\[ x\frac{d}{dx}\left[x\left(\frac{dg(x)}{dx}\right)\right] = \frac{x(1 + x)}{(1 - x)^3} = x + 2^2x^2 + 3^2x^3 + \cdots, \]
so $x(1 + x)/(1 - x)^3$ generates $0^2, 1^2, 2^2, 3^2, \ldots$. As a consequence of our earlier observations about the summation operator, we find that
\[ \frac{x(1 + x)}{(1 - x)^3}\cdot\frac{1}{(1 - x)} = \frac{x(1 + x)}{(1 - x)^4} \]
is the generating function for $0^2, 0^2 + 1^2, 0^2 + 1^2 + 2^2, 0^2 + 1^2 + 2^2 + 3^2, \ldots$. Hence the coefficient of $x^n$ in $[x(1 + x)]/(1 - x)^4$ is $\sum_{i=0}^{n} i^2$. But the coefficient of $x^n$ in $[x(1 + x)]/(1 - x)^4$ can also be calculated as follows:
\[ \frac{x(1 + x)}{(1 - x)^4} = (x + x^2)(1 - x)^{-4} = (x + x^2)\left[\binom{-4}{0} + \binom{-4}{1}(-x) + \binom{-4}{2}(-x)^2 + \cdots\right], \]
so the coefficient of $x^n$ is
\[ \binom{-4}{n-1}(-1)^{n-1} + \binom{-4}{n-2}(-1)^{n-2} \]
\[ = (-1)^{n-1}\binom{4 + (n-1) - 1}{n-1}(-1)^{n-1} + (-1)^{n-2}\binom{4 + (n-2) - 1}{n-2}(-1)^{n-2} \]
\[ = \binom{n+2}{n-1} + \binom{n+1}{n-2} = \frac{(n + 2)!}{3!\,(n - 1)!} + \frac{(n + 1)!}{3!\,(n - 2)!} \]
\[ = \frac{1}{6}[(n + 2)(n + 1)(n) + (n + 1)(n)(n - 1)] \]
\[ = \frac{1}{6}(n)(n + 1)[(n + 2) + (n - 1)] = \frac{n(n + 1)(2n + 1)}{6}. \]
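The three expressions for the coefficient of $x^n$ (the direct sum, the pair of binomial coefficients, and the closed form) can be compared numerically. This is a sketch for spot-checking only; the function names are illustrative, and it relies on `math.comb(n, k)` returning 0 when $k > n$, which covers the cases $n = 0$ and $n = 1$.

```python
from math import comb


def direct(n):
    """0^2 + 1^2 + ... + n^2 computed term by term."""
    return sum(i * i for i in range(n + 1))


def via_binomials(n):
    """C(n+2, n-1) + C(n+1, n-2), rewritten as C(n+2, 3) + C(n+1, 3)."""
    return comb(n + 2, 3) + comb(n + 1, 3)


def closed_form(n):
    """n(n + 1)(2n + 1)/6; the product is always divisible by 6."""
    return n * (n + 1) * (2 * n + 1) // 6


for n in (0, 1, 2, 3, 10):
    print(n, direct(n), via_binomials(n), closed_form(n))
```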
5. Let f(x) be the generating function for the sequence dp, a1,
EXERCISES 9.5 a), .... For what sequence is (1 — x) f (x) the generating func-
tion?
1. Find the generating function for the sequences (a) 1, 2, 3, 3,
6. Let f(x) = eo a,x' with f(1) = yoo a,, a finite num-
3,...3(b)
1, 2,3, 4,4,4,...;(¢) 1,4, 7, 10, 13,....
ber. Verify that the quotient [ f(x) — f(1)]/(x — 1) is the gen-
2. a) Find the generating function for the sequences (i) 0, 1, 0, erating function for the sequence sp, 5), 52,..., where s, =
0,0,...;@i)0, 1,1, 1, 1,...5 @ii) 0, 1, 2, 3,4, ...;
nt a,ne N.
(iv)0, 1, 3,6, 10,....
7. Find the generating function for the sequence a, @), a2, ...,
b) Use result (iv) from part (a) to find a formula for }°,_, k. where a, = ¥)"_)(1/i), 2 EN.
3. Continue the development of the ideas set forth in Example 8. a) Find the generating function for the sequence 0, 1, 3, 6,
9.31 and derive the formula }°"_, i° = [n(n + 1)/2P. 10, 15,... (where 1, 3,6, 10, 15,... are the triangular
4. If f(x) = 0, a,x”, what is the generating function for the numbers of Example 4.5).
sequence dy, dy + Gy, 4) +42, G2 + a3, ... ? Whatis the gener- b) For € Z*, determine a formula for the sum of the first
ating function for the sequence ag, ay + a, Go + A) + 2, a) + n triangular numbers.
a + a3, ay + a3 +4, ...? What is the generating function for
the sequence 4a7, 407+ ay9,97a 75ay 797,97
420 at ¢5a2 + 83Fo---? 9
9.6
Summary and Historical Review
In the early thirteenth century the Italian mathematician Leonardo of Pisa (c. 1175-1250), in his Liber Abaci, introduced the European world to the Hindu-Arabic notation for numerals and algorithms for arithmetic. In this text he also originated the study of the sequence 0, 1, 1, 2, 3, 5, 8, 13, 21, ..., which can be given recursively by $F_0 = 0$, $F_1 = 1$, and $F_{n+2} = F_{n+1} + F_n$, $n \ge 0$. Since Leonardo was the son of Bonaccio, the sequence has come to be called the Fibonacci numbers. (Filius Bonaccii is the Latin form for "son of Bonaccio.")
If we consider the formula
\[ F_n = \frac{1}{\sqrt{5}}\left[\left(\frac{1 + \sqrt{5}}{2}\right)^n - \left(\frac{1 - \sqrt{5}}{2}\right)^n\right], \qquad n \ge 0, \]
we find $F_0 = 0$, $F_1 = 1$, $F_2 = 1$, $F_3 = 2$, $F_4 = 3$, .... Yes, this formula determines each Fibonacci number as a function of n. (Here we have the solution for the recursive Fibonacci relation. We shall learn more about this in the next chapter.) This formula was not derived, however, until 1718, when Abraham DeMoivre (1667-1754) obtained the result from the generating function
\[ f(x) = \frac{x}{1 - x - x^2} = \frac{1}{\sqrt{5}}\left[\frac{1}{1 - \left(\frac{1 + \sqrt{5}}{2}\right)x} - \frac{1}{1 - \left(\frac{1 - \sqrt{5}}{2}\right)x}\right]. \]
Extending the existing techniques of the generating function, Leonhard Euler (1707-1783) advanced the study of the partitions of integers in his 1748 two-volume opus, Introductio in Analysin Infinitorum. With
\[ P(x) = \frac{1}{1 - x}\cdot\frac{1}{1 - x^2}\cdot\frac{1}{1 - x^3}\cdots = \prod_{i=1}^{\infty}\frac{1}{1 - x^i}, \]
we have the generating function for p(0), p(1), p(2), ..., where p(n) is the number of partitions of n into positive summands and p(0) is defined to be 1.
Leonhard Euler (1707-1783)
In the latter part of the eighteenth century, further developments on generating functions
arose in conjunction with ideas in probability theory, especially with what is now called the
“moment generating function.” These related notions were presented in their first complete
treatment by the great scholar Pierre-Simon de Laplace (1749-1827) in his 1812 publication
Théorie Analytique des Probabilités.
Finally, we mention Norman Macleod Ferrers (1829-1903), after whom the diagram we
called the Ferrers graph is named.
For us the study of the ordinary and exponential generating functions provided a powerful technique that unified ideas found in Chapters 1, 5, and 8. Extending our prior experience with polynomials to power series, and extending the binomial theorem to $(1 + x)^n$ for the cases where n need not be positive or even an integer, we found the necessary tools to compute the coefficients in these generating functions. This was more than worth the effort because the algebraic calculations we performed took into account all of the selection processes we were trying to consider. We also found that we had seen some generating functions in a prior chapter and saw how they arose in the study of partitions.
The concept of a partition of a positive integer now enables us to complete the summaries of our earlier discussions on distributions, as given in Tables 1.11 and 5.13. Here we can now deal with the distributions of m objects into n ($\le m$) containers for the cases where neither the objects nor the containers are distinct. These are covered by the entries in the second and fourth rows of Table 9.5. The notation p(m, n), which appears in the last column for these entries, is used to denote the number of partitions of the positive integer m into exactly n (positive) summands. (This idea will be examined further in Supplementary Exercise 3 of the next chapter.) The types of distributions in the first and third rows of this table were also listed in Table 5.13. We include them here a second time for the sake of comparison and completeness.

Table 9.5
Objects Are Distinct | Containers Are Distinct | Some Container(s) May Be Empty | Number of Distributions
No | Yes | Yes | $\binom{n+m-1}{m}$
No | No | Yes | (1) p(m), for n = m; (2) p(m, 1) + p(m, 2) + ... + p(m, n), for n < m
No | Yes | No | $\binom{n+(m-n)-1}{m-n} = \binom{m-1}{m-n} = \binom{m-1}{n-1}$
No | No | No | p(m, n)

For comparable coverage of the material presented in this chapter, the interested reader
should consult Chapter 2 of C. L. Liu [3] and Chapter 6 of A. Tucker [8]. The text by
J. Riordan [6] has extensive coverage of ordinary and exponential generating functions. An
interesting survey article on generating functions, written by Richard P. Stanley, can be found
in the text edited by G-C. Rota [7]. The text by H. S. Wilf [9] deals with generating functions
and some of the ways they are applied in discrete mathematics. This work also demonstrates
how these functions provide a bridge between discrete mathematics and continuous analysis
(in particular, the theory of functions of a complex variable).
The reader interested in learning more about the theory of partitions should consult
Chapter 10 of I. Niven, H. Zuckerman, and H. Montgomery [5].
Finally, a great deal about the moment generating function and its use in probability
theory can be found in Chapter 3 of H. J. Larson [2] and in Chapter XI of the comprehensive
work by W. Feller [1].
REFERENCES
1. Feller, William. An Introduction to Probability Theory and Its Applications, Vol. 1, 3rd ed. New
York: Wiley, 1968.
2. Larson, Harold J. Introduction to Probability Theory and Statistical Inference, 2nd ed. New
York: Wiley, 1969.
3. Liu, C. L. Introduction to Combinatorial Mathematics. New York: McGraw-Hill, 1968.
4. Neal, David. “The Series $\sum_{n=1}^{\infty} n^m x^n$ and a Pascal-like Triangle.” The College Mathematics
Journal 25, No. 2 (March 1994): pp. 99-101.
5. Niven, Ivan, Zuckerman, Herbert, and Montgomery, Hugh. An Introduction to the Theory of
Numbers, 5th ed. New York: Wiley, 1991.
6. Riordan, John. An Introduction to Combinatorial Analysis. Princeton, N.J.: Princeton University
Press, 1980. (Originally published in 1958 by John Wiley & Sons.)
7. Rota, Gian-Carlo, ed. Studies in Combinatorics, Studies in Mathematics, Vol. 17. Washington,
D.C.: The Mathematical Association of America, 1978.
8. Tucker, Alan. Applied Combinatorics, 4th ed. New York: Wiley, 2002.
9. Wilf, Herbert S. Generatingfunctionology, 2nd ed. San Diego, Calif.: Academic Press, 1994.
9. Simplify the following sum where n € Z*: (7) + 2(3) +
SUPPLEMENTARY EXERCISES 3(2) Hee + n(®). (Hint: You may wish to start with the bi-
nomial theorem.)
10. Determine the generating function for the number of par-
1. Find the generating function for each of the following
titions of n € N where 1 occurs at most once, 2 occurs at most
sequences.
twice, 3 at most thrice, and, in general, k occurs at most k times,
a) 7,8,9,10,... for every k € Z*.
b) 1,a,a7,a°,a*,..., aeéeR
11. In arural area 12 mailboxes are located at a general store.
ce) 1,(Qi+a), +a), +a)y,..., aeR
a) If a newscarrier has 20 identical fliers, in how many
d)2,1t+a,1+a’?,i+a?,..., aeéeR ways can she distribute the fliers so that each mailbox gets
2. Find the coefficient of x®? in at least one flier?
fx) = OO x8 4 xl ge xl 4 xy, b) If the mailboxes are in two rows of six each, what is
the probability that a distribution from part (a) will have 10
3. Sergeant Bueti must distribute 40 bullets (20 for rifles and fliers distributed to the top six boxes and 10 to the bottom
20 for handguns) among four police officers so that each officer Six?
gets at least two, but no more than seven, bullets of each type.
12. Let S be a set containing n distinct objects. Verify that
In how many ways can he do this?
e* /(1 — x)* is the exponential generating function for the num-
4. Find a generating function for the number of ways to parti- ber of ways to choose m of the objects in S, forO < m <n, and
tion a positive integer n into positive-integer summands, where distribute these objects among & distinct containers, with the
each summand appears an odd number of times or not at all. order of the objects in any container relevant for the distri-
5. For n € Z*, show that the number of partitions of 7 in bution.
which no even summand is repeated (an odd summand may or 13. a) For a, d ER, find the generating function for the se-
may not be repeated) is the same as the number of partitions of quence a,a+d,a+2d,a+3d,....
n where no summand occurs more than three times.
b) Forn € Z*, use the result from part (a) to find a formula
6. How many 10-digit telephone numbers use only the digits for the sum of the first n terms of the arithmetic progression
1, 3, 5 and 7, with each digit appearing at least twice or not at a4,a+d,a+2d,at+3d,....
all?
14. a) For the alphabet © = {0, 1}, let a, count the number
7, a) For what sequence of numbers is g(x) = (1 — 2x) §/2 of strings of length n in &*—that is, for n EN, a, =
the exponential generating function? |"|. Determine the generating function for the sequence
b) Find a and B so that (1 — ax)° is the exponential gener- do, 4], 42,....
ating function for the sequence 1,7,7-11,7-11-15,.... b) Answer the question posed in part (a) when |%| = k,
8. For integers n, k > 0 let a fixed positive integer.
15. Let f(x) = dp + a,x + ax? +a3x°+..., the generating
e P, be the number of partitions of n.
function for the sequence do, a, G2, a3,.... Now letn € Z*,
e P, be the number of partitions of 2n +, where n +k is n fixed.
the greatest summand.
a) Find the generating function for the sequence 0, 0, 0,
e P; be the number of partitions of 2n + k into precisely ... 0, Gg, @), G2, 43, ..., where there are n leading zeros.
n+k summands. b) Find the generating function for the sequence a,,, d,41,
Using the concept of the Ferrers graph, prove that P; = P,
and P; = P3, thus concluding that the number of partitions of 16. Suppose that X is a discrete random variable with proba-
2n + k into precisely n + k summands is the same for all k. bility distribution given by
Pr(X =x) = k(4)", x =0,1,2,3,... for the first mile, two miles per hour for the second mile, four
0, otherwise, miles per hour for the third mile, ..., and 2”~' miles per hour
where k is a constant. Determine (a) the value of &; for the nth mile.
(b) Pr (X = 3), Pr (X <3), Pr(X > 3), Pr (X > 2); and a) Whatis the car’s average velocity for the first four miles?
(c) Pr (X > 4|X > 2), Pr (X > 104|X > 102). b) Fora given value of n, what is the car’s average velocity
17. Suppose that Y is a geometric random variable where the for the first 2 miles?
probability of success for each Bernoulli trial is p. If m,n € Z* c) Find the smallest value of n for which the car’s average
with m > n, determine Pr (Y > m|Y¥ > nv). velocity for the first n miles exceeds 10 miles per hour.
18. Atest car is driven a fixed distance ofn miles along a straight
highway. (Here n € Z*.) The car travels at one mile per hour
10
Recurrence
Relations
In earlier sections of the text we saw some recursive definitions and constructions. In Definitions 5.19, 6.7, 6.12, and 7.9, we obtained concepts at level n + 1 (or of size n + 1) from comparable concepts at level n (or of size n), after establishing the concept at a first value of n, such as 0 or 1. When we dealt with the Fibonacci and Lucas numbers in Section 4.2, the results at level n + 1 turned out to depend on those at levels n and n - 1, and for each of these sequences of integers the basis consisted of the first two integers (of the sequence). Now we shall find ourselves in a somewhat similar situation. We shall investigate functions a(n), preferably written as $a_n$ (for $n \ge 0$), where $a_n$ depends on some of the prior terms $a_{n-1}, a_{n-2}, \ldots, a_1, a_0$. This study of what are called either recurrence relations or difference equations is the discrete counterpart to ideas applied in ordinary differential equations.
Our development will not employ any ideas from differential equations but will start
with the notion of a geometric progression. As further ideas are developed, we shall see
some of the many applications that make this topic so important.
10.1
The First-Order Linear
Recurrence Relation
A geometric progression is an infinite sequence of numbers, such as 5, 15, 45, 135, ..., where the division of each term, other than the first, by its immediate predecessor is a constant, called the common ratio. For our sequence this common ratio is 3: 15 = 3(5), 45 = 3(15), and so on. If $a_0, a_1, a_2, \ldots$ is a geometric progression, then $a_1/a_0 = a_2/a_1 = \cdots = a_{n+1}/a_n = \cdots = r$, the common ratio. In our particular geometric progression we have $a_{n+1} = 3a_n$, $n \ge 0$.
The recurrence relation $a_{n+1} = 3a_n$, $n \ge 0$, does not define a unique geometric progression. The sequence 7, 21, 63, 189, ... also satisfies the relation. To pinpoint a particular sequence described by $a_{n+1} = 3a_n$, we need to know one of the terms of that sequence. Hence
\[ a_{n+1} = 3a_n, \quad n \ge 0, \quad a_0 = 5, \]
uniquely defines the sequence 5, 15, 45, ..., whereas
\[ a_{n+1} = 3a_n, \quad n \ge 0, \quad a_1 = 21, \]
identifies 7, 21, 63, ... as the geometric progression under study.
The equation $a_{n+1} = 3a_n$, $n \ge 0$, is a recurrence relation because the value of $a_{n+1}$ (the present consideration) is dependent on $a_n$ (a prior consideration). Since $a_{n+1}$ depends only on its immediate predecessor, the relation is said to be of first order. In particular, this is a first-order linear homogeneous recurrence relation with constant coefficients. (We'll say more about these ideas later.) The general form of such an equation can be written $a_{n+1} = da_n$, $n \ge 0$, where d is a constant.
Values such as $a_0$ or $a_1$, given in addition to the recurrence relations, are called boundary conditions. The expression $a_0 = A$, where A is a constant, is also referred to as an initial condition. Our examples show the importance of the boundary condition in determining the unique solution.
Let us return now to the recurrence relation
$$a_{n+1} = 3a_n, \quad n \ge 0, \quad a_0 = 5.$$
The first four terms of this sequence are
$a_0 = 5$,
$a_1 = 3a_0 = 3(5)$,
$a_2 = 3a_1 = 3(3a_0) = 3^2(5)$, and
$a_3 = 3a_2 = 3(3^2(5)) = 3^3(5)$.
These results suggest that for each $n \ge 0$, $a_n = 5(3^n)$. This is the unique solution of the
given recurrence relation. In this solution, the value of $a_n$ is a function of n and there is no
longer any dependence on prior terms of the sequence, once we define $a_0$. To compute $a_{10}$,
for example, we simply calculate $5(3^{10}) = 295{,}245$; there is no need to start at $a_0$ and build
up to $a_9$ in order to obtain $a_{10}$.
From this example we are directed to the following. (This result can be established by
the Principle of Mathematical Induction.)
The unique solution of the recurrence relation
$$a_{n+1} = da_n, \quad \text{where } n \ge 0, \ d \text{ is a constant, and } a_0 = A,$$
is given by
$$a_n = Ad^n, \quad n \ge 0.$$
Thus the solution $a_n = Ad^n$, $n \ge 0$, defines a discrete function whose domain is the set
N of all nonnegative integers.
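The boxed result can be checked numerically. The following is a minimal sketch, not part of the original text; the function names `iterate_first_order` and `closed_form` are illustrative choices. It builds $a_n$ step by step from the recurrence and compares it with $Ad^n$.

```python
def iterate_first_order(d, A, n):
    """Return a_n by applying a_{k+1} = d * a_k exactly n times, starting at a_0 = A."""
    a = A
    for _ in range(n):
        a = d * a
    return a


def closed_form(d, A, n):
    """Return a_n from the closed form a_n = A * d**n."""
    return A * d**n


# Both computations agree for the sequence 5, 15, 45, ... (d = 3, A = 5).
for n in range(11):
    assert iterate_first_order(3, 5, n) == closed_form(3, 5, n)

print(closed_form(3, 5, 10))  # 295245
```

The loop needs n multiplications to reach $a_n$, whereas the closed form reaches it directly, which is the point made above about not having to build up from $a_0$.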
EXAMPLE 10.1
Solve the recurrence relation $a_n = 7a_{n-1}$, where $n \ge 1$ and $a_2 = 98$.
This is just an alternative form of the relation $a_{n+1} = 7a_n$ for $n \ge 0$ and $a_2 = 98$. Hence
the solution has the form $a_n = a_0(7^n)$. Since $a_2 = 98 = a_0(7^2)$, it follows that $a_0 = 2$, and
$a_n = 2(7^n)$, $n \ge 0$, is the unique solution.
EXAMPLE 10.2
A bank pays 6% (annual) interest on savings, compounding the interest monthly. If Bonnie
deposits \$1000 on the first day of May, how much will this deposit be worth a year later?
The annual interest rate is 6%, so the monthly rate is 6%/12 = 0.5% = 0.005. For
$0 \le n \le 12$, let $p_n$ denote the value of Bonnie's deposit at the end of n months. Then
$p_{n+1} = p_n + 0.005p_n$, where $0.005p_n$ is the interest earned on $p_n$ during month n + 1,
for $0 \le n \le 11$, and $p_0 = \$1000$.
10.1. The First-Order Linear Recurrence Relation 449
The relation $p_{n+1} = (1.005)p_n$, $p_0 = \$1000$, has the solution $p_n = p_0(1.005)^n = \$1000(1.005)^n$. Consequently, at the end of one year, Bonnie's deposit is worth
$\$1000(1.005)^{12} = \$1061.68$.
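As a quick check of this computation, here is a small sketch that applies the monthly recurrence twelve times. The function name `deposit_value` and its parameters are illustrative, and floating-point arithmetic is assumed to be accurate enough for cents.

```python
def deposit_value(principal, annual_rate, months):
    """Apply p_{n+1} = p_n + (annual_rate / 12) * p_n for the given number of months."""
    monthly_rate = annual_rate / 12
    value = principal
    for _ in range(months):
        value += monthly_rate * value
    return value


print(round(deposit_value(1000, 0.06, 12), 2))  # 1061.68
```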
In the next example we find a fifth way to count the number of compositions of a positive
integer. The reader may recall that this situation was examined earlier in Examples 1.37,
3.11, 4.12, and 9.12.
EXAMPLE 10.3
Figure 10.1 provides the compositions of 3 and 4. Here we see that compositions (1')-(4')
of 4 arise from the corresponding compositions of 3 by increasing the last summand (in each
corresponding composition of 3) by 1. The other four compositions of 4, namely, (1'')-(4''),
are obtained from the compositions of 3 by appending "+1" to each of the corresponding
compositions of 3. (The reader may recall seeing such results in Fig. 4.7.)
Compositions of 3:
(1) 3
(2) 1 + 2
(3) 2 + 1
(4) 1 + 1 + 1
Compositions of 4 obtained by increasing the last summand:
(1') 4
(2') 1 + 3
(3') 2 + 2
(4') 1 + 1 + 2
Compositions of 4 obtained by appending "+1":
(1'') 3 + 1
(2'') 1 + 2 + 1
(3'') 2 + 1 + 1
(4'') 1 + 1 + 1 + 1
Figure 10.1
What happens in Fig. 10.1 exemplifies the general situation. So if we let $a_n$ count the
number of compositions of n, for $n \in \mathbf{Z}^+$, we find that
$$a_{n+1} = 2a_n, \quad n \ge 1, \quad a_1 = 1.$$
However, in order to apply the formula for the unique solution (where $n \ge 0$) to this recurrence relation, we let $b_n = a_{n+1}$. Then we have
$$b_{n+1} = 2b_n, \quad n \ge 0, \quad b_0 = 1,$$
so $b_n = b_0(2^n) = 2^n$, and $a_n = b_{n-1} = 2^{n-1}$, $n \ge 1$.
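The doubling argument of this example can be sketched in code. The function below is an illustration only (the name `compositions` is an assumption of this sketch): it builds the compositions of n from those of n - 1 by the two operations described for Fig. 10.1, and the assertions confirm that there are $2^{n-1}$ of them.

```python
def compositions(n):
    """Return all compositions of the positive integer n as tuples of summands."""
    if n == 1:
        return [(1,)]
    smaller = compositions(n - 1)
    # Increase the last summand of each composition of n - 1 by 1 ...
    bumped = [c[:-1] + (c[-1] + 1,) for c in smaller]
    # ... or append "+1" to each composition of n - 1.
    appended = [c + (1,) for c in smaller]
    return bumped + appended


for n in range(1, 10):
    comps = compositions(n)
    assert len(set(comps)) == len(comps) == 2 ** (n - 1)  # no repeats, 2^(n-1) in all
    assert all(sum(c) == n for c in comps)

print(compositions(3))  # [(3,), (1, 2), (2, 1), (1, 1, 1)]
```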
The recurrence relation $a_{n+1} - da_n = 0$ is called linear because each subscripted term
appears to the first power (as do the variables x and y in the equation of a line in the plane). In
a linear relation there are no products such as $a_n a_{n-1}$, which appears in the nonlinear recurrence relation $a_{n+1} - 3a_n a_{n-1} = 0$. However, there are times when a nonlinear recurrence
relation can be transformed into a linear one by a suitable algebraic substitution.
EXAMPLE 10.4
Find $a_{12}$ if $a_{n+1}^2 = 5a_n^2$, where $a_n > 0$ for $n \ge 0$, and $a_0 = 2$.
Although this recurrence relation is not linear in $a_n$, if we let $b_n = a_n^2$, then the new
relation $b_{n+1} = 5b_n$ for $n \ge 0$, and $b_0 = 4$, is a linear relation whose solution is $b_n = 4 \cdot 5^n$.
Therefore, $a_n = 2(\sqrt{5})^n$ for $n \ge 0$, and $a_{12} = 2(\sqrt{5})^{12} = 31{,}250$.
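The substitution can be mirrored in a short computation. This sketch (the name `a_squared` is hypothetical) iterates the linear relation for $b_n = a_n^2$ in exact integer arithmetic and then takes an integer square root, so no rounding is involved.

```python
from math import isqrt


def a_squared(n, a0=2):
    """Return b_n = a_n**2 by iterating b_{k+1} = 5 * b_k from b_0 = a0**2."""
    b = a0 * a0
    for _ in range(n):
        b *= 5
    return b


b12 = a_squared(12)
assert b12 == 4 * 5**12          # agrees with b_n = 4 * 5^n
a12 = isqrt(b12)
assert a12 * a12 == b12          # b_12 is a perfect square, so a_12 is an integer
print(a12)  # 31250
```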
The general first-order linear recurrence relation with constant coefficients has the form
$a_{n+1} + ca_n = f(n)$, $n \ge 0$, where c is a constant and f(n) is a function on the set N of
nonnegative integers.
When $f(n) = 0$ for all $n \in \mathbf{N}$, the relation is called homogeneous; otherwise it is called
nonhomogeneous. So far we have only dealt with homogeneous relations. Now we shall
solve a nonhomogeneous relation. We shall develop specific techniques that work for all
linear homogeneous recurrence relations with constant coefficients. However, many differ-
ent techniques prove useful when we deal with a nonhomogeneous problem, although none
allows us to solve everything that can arise.
EXAMPLE 10.5
Perhaps the most popular, though not the most efficient, method of sorting numeric data
is a technique called the bubble sort. Here the input is a positive integer n and an array
$x_1, x_2, x_3, \ldots, x_n$ of real numbers that are to be sorted into ascending order.
The pseudocode procedure in Fig. 10.2 provides an implementation for an algorithm to
carry out this sorting process. Here the integer variable i is the counter for the outer for
loop, whereas the integer variable j is the counter for the inner for loop. Finally, the real
variable temp is used for storage that is needed when an exchange takes place.
procedure BubbleSort(n: positive integer; x_1, x_2, x_3, ..., x_n: real numbers)
begin
  for i := 1 to n - 1 do
    for j := n downto i + 1 do
      if x_j < x_{j-1} then
        begin {interchange}
          temp := x_{j-1}
          x_{j-1} := x_j
          x_j := temp
        end
end
Figure 10.2
We compare the last entry, $x_n$, in the given array with its immediate predecessor, $x_{n-1}$. If
$x_n < x_{n-1}$, we interchange the values stored in $x_{n-1}$ and $x_n$. In any event we will now have
$x_{n-1} \le x_n$. Then we compare $x_{n-1}$ with its immediate predecessor, $x_{n-2}$. If $x_{n-1} < x_{n-2}$,
we interchange them. We continue the process. After n - 1 such comparisons, the smallest
number in the list is stored in $x_1$. We then repeat this process for the n - 1 numbers now
stored in the (smaller) array $x_2, x_3, \ldots, x_n$. In this way, each time (counted by i) this process
is carried out, the smallest number in the remaining sublist "bubbles up" to the front of that
sublist.
A small example wherein n = 5 and $x_1 = 7$, $x_2 = 9$, $x_3 = 2$, $x_4 = 5$, and $x_5 = 8$ is given
in Fig. 10.3 to show how the bubble sort of Fig. 10.2 places a given sequence in ascending
order. In this figure each pass (each value of i) is shown together with the number of
comparisons and the number of interchanges it requires.
To determine the time-complexity function h(n) when this algorithm is used on an input
(array) of size $n \ge 1$, we count the total number of comparisons made in order to sort the n
given numbers into ascending order.
If $a_n$ denotes the number of comparisons needed to sort n numbers in this way, then we
get the following recurrence relation:
$$a_n = a_{n-1} + (n - 1), \quad n \ge 2, \quad a_1 = 0.$$
i = 1: the list 7, 9, 2, 5, 8 becomes 2, 7, 9, 5, 8. Four comparisons and two interchanges.
i = 2: the list 2, 7, 9, 5, 8 becomes 2, 5, 7, 9, 8. Three comparisons and two interchanges.
i = 3: the list 2, 5, 7, 9, 8 becomes 2, 5, 7, 8, 9. Two comparisons and one interchange.
i = 4: the list 2, 5, 7, 8, 9 is unchanged. One comparison but no interchanges.
Figure 10.3
This arises as follows. Given a list of n numbers, we make n - 1 comparisons to bubble
the smallest number up to the start of the list. The remaining sublist of n - 1 numbers then
requires $a_{n-1}$ comparisons in order to be completely sorted.
This relation is a linear first-order relation with constant coefficients, but the term n - 1
makes it nonhomogeneous. Since we have no technique for attacking such a relation, let us
list some terms and see whether there is a recognizable pattern.
$a_1 = 0$
$a_2 = a_1 + (2 - 1) = 1$
$a_3 = a_2 + (3 - 1) = 1 + 2$
$a_4 = a_3 + (4 - 1) = 1 + 2 + 3$
In general, $a_n = 1 + 2 + \cdots + (n - 1) = [(n - 1)n]/2 = (n^2 - n)/2$.
As a result, the bubble sort determines the time-complexity function $h: \mathbf{Z}^+ \to \mathbf{R}$ given
by $h(n) = a_n = (n^2 - n)/2$. [Here $h(\mathbf{Z}^+) \subset \mathbf{N}$.] Consequently, as a measure of the running
time for the algorithm, we write $h \in O(n^2)$. Hence the bubble sort is said to require $O(n^2)$
comparisons.
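The count $a_n = (n^2 - n)/2$ can be observed directly. The following is a sketch of the procedure of Fig. 10.2 in Python, with two counters added for illustration; the function name and the 0-based indexing are choices of this sketch, not of the figure.

```python
def bubble_sort(values):
    """Sort a copy of values as in Fig. 10.2; return (sorted list, comparisons, interchanges)."""
    x = list(values)
    n = len(x)
    comparisons = interchanges = 0
    for i in range(n - 1):                 # the text's i = 1, ..., n - 1
        for j in range(n - 1, i, -1):      # the text's j = n down to i + 1, shifted to 0-based
            comparisons += 1
            if x[j] < x[j - 1]:
                x[j], x[j - 1] = x[j - 1], x[j]   # interchange
                interchanges += 1
    return x, comparisons, interchanges


print(bubble_sort([7, 9, 2, 5, 8]))  # ([2, 5, 7, 8, 9], 10, 5)
```

For the five numbers of Fig. 10.3 this gives 4 + 3 + 2 + 1 = 10 comparisons, in agreement with $(5^2 - 5)/2$, and the number of comparisons does not depend on the order of the input.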
EXAMPLE 10.6
In part (c) of Example 9.6 we sought the generating function for the sequence 0, 2, 6, 12,
20, 30, 42, ..., and the solution rested upon our ability to recognize that $a_n = n^2 + n$ for
each $n \in \mathbf{N}$. If we fail to see this, perhaps we can examine the given sequence and determine
whether there is some other pattern that will help us.
Here $a_0 = 0$, $a_1 = 2$, $a_2 = 6$, $a_3 = 12$, $a_4 = 20$, $a_5 = 30$, $a_6 = 42$, and
$a_1 - a_0 = 2$, $a_3 - a_2 = 6$, $a_5 - a_4 = 10$,
$a_2 - a_1 = 4$, $a_4 - a_3 = 8$, $a_6 - a_5 = 12$.
These calculations suggest the recurrence relation
$$a_n - a_{n-1} = 2n, \quad n \ge 1, \quad a_0 = 0.$$
To solve this relation, we proceed in a slightly different manner from the method we used
in Example 10.5. Consider the following n equations:
$a_1 - a_0 = 2$
$a_2 - a_1 = 4$
$a_3 - a_2 = 6$
$\vdots$
$a_n - a_{n-1} = 2n$.
When we add these equations, the sum for the left-hand side will contain $a_i$ and $-a_i$ for all
$1 \le i \le n - 1$. So we obtain
$$a_n - a_0 = 2 + 4 + 6 + \cdots + 2n = 2(1 + 2 + 3 + \cdots + n) = 2[n(n + 1)/2] = n^2 + n.$$
Since $a_0 = 0$, it follows that $a_n = n^2 + n$ for all $n \in \mathbf{N}$, as we found earlier in part (c) of
Example 9.6.
At this point we shall examine a recurrence relation with a variable coefficient.
EXAMPLE 10.7
Solve the relation $a_n = n \cdot a_{n-1}$, where $n \ge 1$ and $a_0 = 1$.
Writing the first five terms defined by the relation, we have
$a_0 = 1$
$a_1 = 1 \cdot a_0 = 1$
$a_2 = 2 \cdot a_1 = 2 \cdot 1$
$a_3 = 3 \cdot a_2 = 3 \cdot 2 \cdot 1$
$a_4 = 4 \cdot a_3 = 4 \cdot 3 \cdot 2 \cdot 1$
Therefore, $a_n = n!$ and the solution is the discrete function $a_n$, which counts the number
of permutations of n objects, $n \ge 0$.
While on the subject of permutations, we shall examine a recursive algorithm for generating the permutations of $\{1, 2, 3, \ldots, n - 1, n\}$ from those for $\{1, 2, 3, \ldots, n - 1\}$.†
There is only one permutation of {1}. Examining the permutations of {1, 2},
1 2
2 1
we see that after writing the permutation 1 twice, we intertwine the number 2 about 1 to get
the permutations listed. Writing each of these two permutations three times, we intertwine
the number 3 and obtain
1 2 3
1 3 2
3 1 2
3 2 1
2 3 1
2 1 3
We see here that the first permutation is 123 and that we obtain each of the next two
permutations from its immediate predecessor by interchanging two numbers: 3 and the
integer to its left. When 3 reaches the left side of the permutation, we examine the remaining
numbers and permute them according to the list of permutations we generated for {1, 2}.
(This makes the procedure recursive.) After that we interchange 3 with the integer on its
right until 3 is on the right side of the permutation. We note that if we interchange 1 and 2
in the last permutation, we get 123, the first permutation listed.
Continuing for S = {1, 2, 3, 4}, we first list each of the six permutations of {1, 2, 3} four
times. Starting with the permutation 1234, we intertwine the 4 throughout the remaining
23 permutations as indicated in Table 10.1 (on page 454). The only new idea here develops
as follows. When progressing from permutation (5) to (6) to (7) to (8), we interchange 4
with the integer to its right. At permutation (8), where 4 has reached the right side, we
obtain permutation (9) by keeping the location of 4 fixed and replacing the permutation
132 by 312 from the list of permutations of {1, 2, 3}. After that we continue as for the first
eight permutations until we reach permutation (16), where 4 is again on the right. We then
permute 321 to obtain 231 and continue intertwining 4 until all 24 permutations have been
generated. Once again, if 1 and 2 are interchanged in the last permutation, we obtain the
first permutation in our list.
The chapter references provide more information on recursive procedures for generating
permutations and combinations.
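The intertwining procedure described above can be sketched as a short recursive function. This is one possible reading of the description, not code from the text; the name `intertwined_permutations` is hypothetical. Each permutation of {1, ..., n - 1} is repeated n times while n sweeps through it, alternately from right to left and from left to right.

```python
def intertwined_permutations(n):
    """List the permutations of {1, ..., n} in the order obtained by intertwining n."""
    if n == 1:
        return [(1,)]
    result = []
    for index, perm in enumerate(intertwined_permutations(n - 1)):
        # n moves right-to-left through the 1st, 3rd, 5th, ... permutation of
        # {1, ..., n-1}, and left-to-right through the 2nd, 4th, 6th, ...
        positions = range(n - 1, -1, -1) if index % 2 == 0 else range(n)
        for pos in positions:
            result.append(perm[:pos] + (n,) + perm[pos:])
    return result


for p in intertwined_permutations(3):
    print(*p)
```

For n = 3 this prints the six permutations in the order listed above, and for n = 4 the ninth permutation produced is 3 1 2 4, as described for permutation (9).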
We shall close this first section by returning to an earlier idea: the greatest common
divisor of two positive integers.
EXAMPLE 10.8
Recursive methods are fundamental in the areas of discrete mathematics and the analysis
of algorithms. Such methods arise when we want to solve a given problem by breaking it
down, or referring it, to smaller similar problems. In many programming languages this can
be implemented by the use of recursive functions and procedures, which are permitted to
invoke themselves. This example will provide one such procedure.
†The material from here to the end of this section is a digression that uses the idea of recursion. It does not
deal with methods for solving recurrence relations and may be omitted with no loss of continuity.
Table 10.1
(1) 1 2 3 4
(2) 1 2 4 3
(3) 1 4 2 3
(4) 4 1 2 3
(5) 4 1 3 2
(6) 1 4 3 2
(7) 1 3 4 2
(8) 1 3 2 4
(9) 3 1 2 4
(10) 3 1 4 2
(11) 3 4 1 2
...
(15) 3 2 4 1
(16) 3 2 1 4
(17) 2 3 1 4
...
(22) 2 4 1 3
(23) 2 1 4 3
(24) 2 1 3 4
In computing gcd(333, 84) we obtain the following calculations when we use the Euclidean algorithm (presented in Section 4.4).
$$333 = 3(84) + 81, \quad 0 < 81 < 84 \tag{1}$$
$$84 = 1(81) + 3, \quad 0 < 3 < 81 \tag{2}$$
$$81 = 27(3) + 0. \tag{3}$$
Since 3 is the last nonzero remainder, the Euclidean algorithm tells us that
gcd(333, 84) = 3. However, if we use only the calculations in Eqs. (2) and (3), then we find
that gcd(84, 81) = 3. And Eq. (3) alone implies that gcd(81, 3) = 3 because 3 divides 81.
Consequently,
gcd(333, 84) = gcd(84, 81) = gcd(81, 3) = 3,
where the integers involved in the successive calculations get smaller as we go from Eq. (1)
to Eq. (2) to Eq. (3).
We also observe that
81 = 333 mod 84 and 3 = 84 mod 81.
Therefore it follows that
gcd(333, 84) = gcd(84, 333 mod 84) = gcd(333 mod 84, 84 mod (333 mod 84)).
These results suggest the following recursive method for computing gcd(a, b), where
$a, b \in \mathbf{Z}^+$.
Say we have the input $a, b \in \mathbf{Z}^+$.
Step 1: If $b \mid a$ (or a mod b = 0), then gcd(a, b) = b.
Step 2: If $b \nmid a$, then perform the following tasks in the order specified.
i) Set a = b.
ii) Set b = a mod b, where the value of a for this assignment is the old value
of a.
iii) Return to step (1).
These ideas are used in the pseudocode procedure in Fig. 10.4. (The reader may wish to
compare this procedure with the one given in Fig. 4.11.)
procedure gcd2(a, b: positive integers)
begin
  if a mod b = 0 then
    gcd := b
  else gcd := gcd2(b, a mod b)
end
Figure 10.4
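For readers who want to run the procedure, here is a direct transcription of Fig. 10.4 into Python. It is a sketch that assumes both arguments are positive integers, as the procedure does; Python's `%` operator plays the role of mod.

```python
def gcd2(a, b):
    """Recursive gcd of the positive integers a and b, following Fig. 10.4."""
    if a % b == 0:
        return b
    # Otherwise replace (a, b) by (b, a mod b); the second argument strictly decreases.
    return gcd2(b, a % b)


print(gcd2(333, 84))  # 3
```

The call chain for this input is gcd2(333, 84), gcd2(84, 81), gcd2(81, 3), matching Eqs. (1) to (3).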
1. Find a recurrence relation, with initial condition, that uniquely determines each of the following geometric progressions.
a) 2, 10, 50, 250, ...
b) 6, -18, 54, -162, ...
c) 7, 14/5, 28/25, 56/125, ...
2. Find the unique solution for each of the following recurrence relations.
a) $a_{n+1} - 1.5a_n = 0$, $n \ge 0$
b) $4a_n - 5a_{n-1} = 0$, $n \ge 1$
c) $3a_{n+1} - 4a_n = 0$, $n \ge 0$, $a_1 = 5$
d) $2a_n - 3a_{n-1} = 0$, $n \ge 1$, $a_4 = 81$
3. If $a_n$, $n \ge 0$, is the unique solution of the recurrence relation $a_{n+1} - da_n = 0$, and $a_3 = 153/49$, $a_5 = 1377/2401$, what is d?
4. The number of bacteria in a culture is 1000 (approximately), and this number increases 250% every two hours. Use a recurrence relation to determine the number of bacteria present after one day.
5. If Laura invests \$100 at 6% interest compounded quarterly, how many months must she wait for her money to double? (She cannot withdraw the money before the quarter is up.)
6. Paul invested the stock profits he received 15 years ago in an account that paid 8% interest compounded quarterly. If his account now has \$7218.27 in it, what was his initial investment?
7. Let $x_1, x_2, \ldots, x_{20}$ be a list of distinct real numbers to be sorted by the bubble-sort technique of Example 10.5. (a) After how many comparisons will the 10 smallest numbers of the original list be arranged in ascending order? (b) How many more comparisons are needed to finish this sorting job?
8. For the implementation of the bubble sort given in Fig. 10.2, the outer for loop is executed n - 1 times. This occurs regardless of whether any interchanges take place during the execution of the inner for loop. Consequently, for i = k, where $1 \le k \le n - 2$, if the execution of the inner for loop results in no interchanges, then the list is in ascending order. So the execution of the outer for loop for $k + 1 \le i \le n - 1$ is not needed.
a) For the situation described here, how many unnecessary comparisons are made if the execution of the inner for loop for i = k ($1 \le k \le n - 2$) results in no interchanges?
b) Write an improved version of the bubble sort shown in Fig. 10.2. (Your result should eliminate the unnecessary comparisons discussed at the start of this exercise.)
c) Using the number of comparisons as a measure of its running time, determine the best-case and the worst-case time complexities for the algorithm implemented in part (b).
9. Say the permutations of {1, 2, 3, 4, 5} are generated by the procedure developed after Example 10.7. (a) What is the last permutation in the list? (b) What two permutations precede 25134? (c) What three permutations follow 25134?
10. For $n \ge 1$, a permutation $p_1, p_2, p_3, \ldots, p_n$ of the integers 1, 2, 3, ..., n is called orderly if, for each i = 1, 2, 3, ..., n - 1, there exists a j > i such that $|p_j - p_i| = 1$. [If n = 2, the permutations 1, 2 and 2, 1 are both orderly. When n = 3 we find that 3, 1, 2 is an orderly permutation, while 2, 3, 1 is not. (Why not?)] (a) List all the orderly permutations for 1, 2, 3. (b) List all the orderly permutations for 1, 2, 3, 4. (c) If $p_1, p_2, p_3, p_4, p_5$ is an orderly permutation of 1, 2, 3, 4, 5, what value(s) can $p_1$ be? (d) For $n \ge 1$, let $a_n$ count the number of orderly permutations for 1, 2, 3, ..., n. Find and solve a recurrence relation for $a_n$.
10.2
The Second-Order Linear
Homogeneous Recurrence
Relation with Constant Coefficients
Let $k \in \mathbf{Z}^+$ and $C_0\ (\ne 0), C_1, C_2, \ldots, C_k\ (\ne 0)$ be real numbers. If $a_n$, for $n \ge 0$, is a
discrete function, then
$$C_0 a_n + C_1 a_{n-1} + C_2 a_{n-2} + \cdots + C_k a_{n-k} = f(n), \quad n \ge k,$$
is a linear recurrence relation (with constant coefficients) of order k. When f(n) = 0 for
all $n \ge 0$, the relation is called homogeneous; otherwise, it is called nonhomogeneous.
In this section we shall concentrate on the homogeneous relation of order two:
$$C_0 a_n + C_1 a_{n-1} + C_2 a_{n-2} = 0, \quad n \ge 2.$$
On the basis of our work in Section 10.1, we seek a solution of the form $a_n = cr^n$, where
$c \ne 0$ and $r \ne 0$.
Substituting $a_n = cr^n$ into $C_0 a_n + C_1 a_{n-1} + C_2 a_{n-2} = 0$, we obtain
$$C_0 cr^n + C_1 cr^{n-1} + C_2 cr^{n-2} = 0.$$
With $c, r \ne 0$, this becomes $C_0 r^2 + C_1 r + C_2 = 0$, a quadratic equation which is called
the characteristic equation. The roots $r_1, r_2$ of this equation determine the following three
cases: (a) $r_1, r_2$ are distinct real numbers; (b) $r_1, r_2$ form a complex conjugate pair; or
(c) $r_1, r_2$ are real, but $r_1 = r_2$. In all cases, $r_1$ and $r_2$ are called the characteristic roots.
Case (A): (Distinct Real Roots)
EXAMPLE 10.9
Solve the recurrence relation $a_n + a_{n-1} - 6a_{n-2} = 0$, where $n \ge 2$ and $a_0 = -1$, $a_1 = 8$.
If $a_n = cr^n$ with $c, r \ne 0$, we obtain $cr^n + cr^{n-1} - 6cr^{n-2} = 0$ from which the characteristic equation $r^2 + r - 6 = 0$ follows:
$$0 = r^2 + r - 6 = (r + 3)(r - 2) \Rightarrow r = 2, -3.$$
Since we have two distinct real roots, $a_n = 2^n$ and $a_n = (-3)^n$ are both solutions [as are
$b(2^n)$ and $d(-3)^n$, for arbitrary constants b, d]. They are linearly independent solutions
because one is not a multiple of the other; that is, there is no real constant k such that
$(-3)^n = k(2^n)$ for all $n \in \mathbf{N}$.* We write $a_n = c_1(2^n) + c_2(-3)^n$ for the general solution,
where $c_1, c_2$ are arbitrary constants.
With $a_0 = -1$ and $a_1 = 8$, $c_1$ and $c_2$ are determined as follows:
$$-1 = a_0 = c_1(2^0) + c_2(-3)^0 = c_1 + c_2$$
$$8 = a_1 = c_1(2^1) + c_2(-3)^1 = 2c_1 - 3c_2.$$
Solving this system of equations, one finds $c_1 = 1$, $c_2 = -2$. Therefore, $a_n = 2^n - 2(-3)^n$,
$n \ge 0$, is the unique solution of the given recurrence relation.
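A solution found through the characteristic equation is easy to check against the recurrence itself. The sketch below (function names are illustrative) computes $a_n$ both ways for this example, using the recurrence in the rearranged form $a_n = -a_{n-1} + 6a_{n-2}$.

```python
def by_recurrence(n):
    """a_n from a_n = -a_{n-1} + 6 * a_{n-2}, with a_0 = -1 and a_1 = 8."""
    prev, curr = -1, 8                      # (a_0, a_1)
    for _ in range(n):
        prev, curr = curr, -curr + 6 * prev
    return prev


def by_formula(n):
    """a_n from the solution a_n = 2**n - 2 * (-3)**n."""
    return 2**n - 2 * (-3) ** n


assert all(by_recurrence(n) == by_formula(n) for n in range(25))
print([by_formula(n) for n in range(5)])  # [-1, 8, -14, 62, -146]
```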
The reader should realize that to determine the unique solution of a second-order linear
homogeneous recurrence relation with constant coefficients one needs two initial conditions
(values), that is, the value of $a_n$ for two values of n, very often n = 0 and n = 1, or n = 1
and n = 2.
*We can also call the solutions $a_n = 2^n$ and $a_n = (-3)^n$ linearly independent when the following condition
is satisfied: For $k_1, k_2 \in \mathbf{R}$, if $k_1(2^n) + k_2(-3)^n = 0$ for all $n \in \mathbf{N}$, then $k_1 = k_2 = 0$.
An interesting second-order homogeneous recurrence relation is the Fibonacci relation.
(This was mentioned earlier in Sections 4.2 and 9.6.)
EXAMPLE 10.10
Solve the recurrence relation $F_{n+2} = F_{n+1} + F_n$, where $n \ge 0$ and $F_0 = 0$, $F_1 = 1$.
As in the previous example, let $F_n = cr^n$, for $c, r \ne 0$, $n \ge 0$. Upon substitution we get
$cr^{n+2} = cr^{n+1} + cr^n$. This gives the characteristic equation $r^2 - r - 1 = 0$. The characteristic roots are $r = (1 \pm \sqrt{5})/2$, so the general solution is
$$F_n = c_1\left(\frac{1 + \sqrt{5}}{2}\right)^n + c_2\left(\frac{1 - \sqrt{5}}{2}\right)^n.$$
To solve for $c_1, c_2$, we use the given initial values and write $0 = F_0 = c_1 + c_2$, $1 = F_1 = c_1[(1 + \sqrt{5})/2] + c_2[(1 - \sqrt{5})/2]$. Since $-c_1 = c_2$, we have $2 = c_1(1 + \sqrt{5}) - c_1(1 - \sqrt{5})$ and $c_1 = 1/\sqrt{5}$. The unique solution is given by
$$F_n = \frac{1}{\sqrt{5}}\left[\left(\frac{1 + \sqrt{5}}{2}\right)^n - \left(\frac{1 - \sqrt{5}}{2}\right)^n\right], \quad n \ge 0.$$
When dealing with the Fibonacci numbers one often finds the assignments $\alpha = (1 + \sqrt{5})/2$
and $\beta = (1 - \sqrt{5})/2$, where $\alpha$ is known as the golden ratio. As a result, we find that
$$F_n = \frac{\alpha^n - \beta^n}{\sqrt{5}} = \frac{\alpha^n - \beta^n}{\alpha - \beta}, \quad n \ge 0.$$
[This representation is referred to as the Binet form for $F_n$, as it was first published in 1843
by Jacques Philippe Marie Binet (1786-1856).]
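The Binet form can be compared with the recurrence numerically. This is a sketch using floating-point arithmetic, so it is only trusted here for moderate n (the range below is an assumption of the sketch); the names `fib` and `binet` are illustrative.

```python
from math import sqrt


def fib(n):
    """F_n from F_{n+2} = F_{n+1} + F_n with F_0 = 0, F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def binet(n):
    """Floating-point value of (alpha**n - beta**n) / sqrt(5)."""
    alpha = (1 + sqrt(5)) / 2
    beta = (1 - sqrt(5)) / 2
    return (alpha**n - beta**n) / sqrt(5)


# Rounding removes the small floating-point error for these values of n.
for n in range(40):
    assert round(binet(n)) == fib(n)

print([fib(n) for n in range(10)])  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```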
EXAMPLE 10.11
For $n \ge 0$, let S = {1, 2, 3, ..., n} (when n = 0, $S = \emptyset$), and let $a_n$ denote the number
of subsets of S that contain no consecutive integers. Find and solve a recurrence relation
for $a_n$.
For $0 \le n \le 4$, we have $a_0 = 1$, $a_1 = 2$, $a_2 = 3$, $a_3 = 5$, and $a_4 = 8$. [For example,
$a_3 = 5$ because S = {1, 2, 3} has $\emptyset$, {1}, {2}, {3}, and {1, 3} as subsets with no consecutive
integers (and no other such subsets).] These first five terms are reminiscent of the Fibonacci
sequence. But do things change as we continue?
Let $n \ge 2$ and S = {1, 2, 3, ..., n - 2, n - 1, n}. If $A \subseteq S$ and A is to be counted in
$a_n$, there are two possibilities:
a) $n \in A$: When this happens $(n - 1) \notin A$, and A - {n} would be counted in $a_{n-2}$.
b) $n \notin A$: For this case A would be counted in $a_{n-1}$.
These two cases are exhaustive and mutually disjoint, so we conclude that $a_n = a_{n-1} + a_{n-2}$, where $n \ge 2$ and $a_0 = 1$, $a_1 = 2$, is the recurrence relation for the problem. Now we
could solve for $a_n$, but if we notice that $a_n = F_{n+2}$, $n \ge 0$, then the result of Example 10.10
implies that
$$a_n = \frac{1}{\sqrt{5}}\left[\left(\frac{1 + \sqrt{5}}{2}\right)^{n+2} - \left(\frac{1 - \sqrt{5}}{2}\right)^{n+2}\right], \quad n \ge 0.$$
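The claim $a_n = F_{n+2}$ can be tested by brute force for small n. The following sketch (names are illustrative) lists every subset of {1, ..., n}, keeps those with no two consecutive integers, and compares the count with the Fibonacci numbers.

```python
from itertools import combinations


def count_subsets_no_consecutive(n):
    """Count the subsets of {1, ..., n} that contain no two consecutive integers."""
    count = 0
    for size in range(n + 1):
        for subset in combinations(range(1, n + 1), size):
            # combinations() yields increasing tuples, so neighbours are adjacent entries.
            if all(b - a > 1 for a, b in zip(subset, subset[1:])):
                count += 1
    return count


def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


for n in range(13):
    assert count_subsets_no_consecutive(n) == fib(n + 2)

print([count_subsets_no_consecutive(n) for n in range(5)])  # [1, 2, 3, 5, 8]
```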
EXAMPLE 10.12
Suppose we have a $2 \times n$ chessboard, for $n \in \mathbf{Z}^+$. The case for n = 4 is shown in part (a) of
Fig. 10.5. We wish to cover such a chessboard using $2 \times 1$ (vertical) dominoes, which can
also be used as $1 \times 2$ (horizontal) dominoes. Such dominoes (or tiles) are shown in part (b)
of Fig. 10.5.
(a) (b) (c)
Figure 10.5
For $n \in \mathbf{Z}^+$ we let $b_n$ count the number of ways we can cover (or tile) a $2 \times n$ chessboard
using our $2 \times 1$ and $1 \times 2$ dominoes. Here $b_1 = 1$, for a $2 \times 1$ chessboard necessitates one
$2 \times 1$ (vertical) domino. A $2 \times 2$ chessboard can be covered in two ways, using two $2 \times 1$
(vertical) dominoes or two $1 \times 2$ (horizontal) dominoes, as shown in part (c) of the figure.
Hence $b_2 = 2$. For $n \ge 3$, consider the last (nth) column of a $2 \times n$ chessboard. This column
can be covered in two ways.
i) By one $2 \times 1$ (vertical) domino: Here the remaining $2 \times (n - 1)$ subboard can be
covered in $b_{n-1}$ ways.
ii) By the right squares of two $1 \times 2$ (horizontal) dominoes placed one above the other:
Now the remaining $2 \times (n - 2)$ subboard can be covered in $b_{n-2}$ ways.
Since these two ways have nothing in common and deal with all possibilities, we may write
$$b_n = b_{n-1} + b_{n-2}, \quad n \ge 3, \quad b_1 = 1, \quad b_2 = 2.$$
We find that $b_n = F_{n+1}$, so here is another situation where the Fibonacci numbers arise. The
result from Example 10.10 gives us $b_n = (1/\sqrt{5})[((1 + \sqrt{5})/2)^{n+1} - ((1 - \sqrt{5})/2)^{n+1}]$,
$n \ge 1$.
EXAMPLE 10.13
At this point we examine an interesting application where the number $\alpha = (1 + \sqrt{5})/2$
plays a major role. This application deals with Gabriel Lamé's work in estimating the number of divisions used in the Euclidean algorithm to find gcd(a, b), where $a, b \in \mathbf{Z}^+$ with
$a \ge b \ge 2$. To find this estimate we need the following property of the Fibonacci numbers,
which can be established by the alternative form of the Principle of Mathematical Induction.
(A proof is requested in the Section Exercises.)
Property: For $n \ge 3$, $F_n > \alpha^{n-2}$.
Addressing the problem at hand, namely, estimating the number of divisions when
the Euclidean algorithm is used to find gcd(a, b), we recall the following steps from
Theorem 4.7.
Letting $r_0 = a$ and $r_1 = b$, we have
$$r_0 = q_1 r_1 + r_2, \quad 0 < r_2 < r_1$$
$$r_1 = q_2 r_2 + r_3, \quad 0 < r_3 < r_2$$
$$r_2 = q_3 r_3 + r_4, \quad 0 < r_4 < r_3$$
$$\vdots$$
$$r_{n-2} = q_{n-1} r_{n-1} + r_n, \quad 0 < r_n < r_{n-1}$$
$$r_{n-1} = q_n r_n.$$
So $r_n$, the last nonzero remainder, is gcd(a, b).
From the subscripts on r we see that n divisions have been performed in determining
$r_n = \gcd(a, b)$. In addition, $q_i \ge 1$, for all $1 \le i \le n - 1$, and $q_n \ge 2$ because $r_n < r_{n-1}$.
Examining the n nonzero remainders $r_n, r_{n-1}, r_{n-2}, \ldots, r_2$, and $r_1$ (= b), we learn that
$$r_n > 0, \text{ so } r_n \ge 1 = F_2.$$
$$[(q_n \ge 2) \wedge (r_n \ge 1)] \Rightarrow r_{n-1} = q_n r_n \ge 2 \cdot 1 = 2 = F_3$$
$$r_{n-2} = q_{n-1} r_{n-1} + r_n \ge 1 \cdot r_{n-1} + r_n \ge F_3 + F_2 = F_4$$
$$\vdots$$
$$r_2 = q_3 r_3 + r_4 \ge 1 \cdot r_3 + r_4 \ge F_{n-1} + F_{n-2} = F_n$$
$$b = r_1 = q_2 r_2 + r_3 \ge 1 \cdot r_2 + r_3 \ge F_n + F_{n-1} = F_{n+1}.$$
Therefore, if n divisions are performed by the Euclidean algorithm to determine gcd(a, b),
with $a \ge b \ge 2$, then $b \ge F_{n+1}$. So by virtue of the property introduced earlier, we may write
$b > \alpha^{(n+1)-2} = \alpha^{n-1} = [(1 + \sqrt{5})/2]^{n-1}$. Consequently, we find now that
$$b > \alpha^{n-1} \Rightarrow \log_{10} b > \log_{10}(\alpha^{n-1}) = (n - 1)\log_{10}\alpha > \frac{n - 1}{5},$$
since $\log_{10}\alpha = \log_{10}[(1 + \sqrt{5})/2] \approx 0.208988 > 0.2 = \frac{1}{5}$.
At this point suppose that $10^{k-1} \le b < 10^k$, so that the decimal (base 10) representation
of b has k digits. Then
$$k = \log_{10} 10^k > \log_{10} b > \frac{n - 1}{5}, \quad \text{and} \quad n < 5k + 1.$$
With $n, k \in \mathbf{Z}^+$ we have $n < 5k + 1 \Rightarrow n \le 5k$, and this last inequality now completes a
proof for the following.
Lamé's Theorem: Let $a, b \in \mathbf{Z}^+$ with $a \ge b \ge 2$. Then the number of divisions needed, in
the Euclidean algorithm, to determine gcd(a, b) is at most 5 times the number of decimal
digits in b.
Before closing this example, we learn one more fact from Lamé's Theorem. Since $b \ge 2$,
it follows that $\log_{10} b \ge \log_{10} 2$, so $5\log_{10} b \ge 5\log_{10} 2 = \log_{10} 2^5 = \log_{10} 32 > 1$. From
above we know that $n - 1 < 5\log_{10} b$, so
$$n < 1 + 5\log_{10} b < 5\log_{10} b + 5\log_{10} b = 10\log_{10} b$$
and $n \in O(\log_{10} b)$. [Hence, the number of divisions needed, in the Euclidean algorithm,
to determine gcd(a, b), for $a, b \in \mathbf{Z}^+$ with $a \ge b \ge 2$, is $O(\log_{10} b)$, that is, on the order
of the number of decimal digits in b.]
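Lamé's Theorem lends itself to an empirical check. The sketch below (the function name `euclid_divisions` is hypothetical) counts the divisions performed by an iterative form of the Euclidean algorithm and verifies the bound of 5 times the number of decimal digits of b over a small range of inputs chosen for illustration.

```python
def euclid_divisions(a, b):
    """Return (gcd(a, b), number of divisions performed) for integers a >= b >= 1."""
    divisions = 0
    while b != 0:
        a, b = b, a % b      # one division step: r_{i-1} = q_i * r_i + r_{i+1}
        divisions += 1
    return a, divisions


# Check the bound for every pair with 2 <= b <= a <= 150.
for a in range(2, 151):
    for b in range(2, a + 1):
        _, n = euclid_divisions(a, b)
        assert n <= 5 * len(str(b))

print(euclid_divisions(333, 84))  # (3, 3)
print(euclid_divisions(144, 89))  # (1, 10)
```

The pair (144, 89) consists of consecutive Fibonacci numbers and attains the bound exactly: 10 divisions for a two-digit b.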
Returning to the theme of the section we now examine a recurrence relation in a computer
science application.
EXAMPLE 10.14
In many programming languages one may consider those legal arithmetic expressions,
without parentheses, that are made up of the digits 0, 1, 2, ..., 9 and the binary operation
symbols +, *, /. For example, 3 + 4 and 2 + 3 * 5 are legal arithmetic expressions; 8 + * 9
is not. Here 2 + 3 * 5 = 17, since there is a hierarchy of operations: Multiplication and
division are performed before addition. Operations at the same level are performed in their
order of appearance as the expression is scanned from left to right.
For $n \in \mathbf{Z}^+$, let $a_n$ be the number of these (legal) arithmetic expressions that are made
up of n symbols. Then $a_1 = 10$, since the arithmetic expressions of one symbol are the 10
digits. Next $a_2 = 100$. This accounts for the expressions 00, 01, ..., 09, 10, 11, ..., 99.
(There are no unnecessary leading plus signs.) When $n \ge 3$, we consider two cases in order
to derive a recurrence relation for $a_n$:
1) If x is an arithmetic expression of n - 1 symbols, the last symbol must be a digit.
Adding one more digit to the right of x, we get $10a_{n-1}$ arithmetic expressions of n
symbols where the last two symbols are digits.
2) Now let y be an arithmetic expression of n - 2 symbols. To obtain an arithmetic
expression with n symbols (that is not counted in case 1), we adjoin to the right of y one
of the 29 two-symbol expressions +1, ..., +9, +0, *1, ..., *9, *0, /1, ..., /9.
From these two cases we have $a_n = 10a_{n-1} + 29a_{n-2}$, where $n \ge 3$ and $a_1 = 10$, $a_2 = 100$. Here the characteristic roots are $5 \pm 3\sqrt{6}$ and the solution is $a_n = (5/(3\sqrt{6}))[(5 + 3\sqrt{6})^n - (5 - 3\sqrt{6})^n]$ for $n \ge 1$. (Verify this result.)
Another way to complete the solution of this problem is to use the recurrence relation
$a_n = 10a_{n-1} + 29a_{n-2}$, with $a_2 = 100$ and $a_1 = 10$, to calculate a value for $a_0$, namely,
$a_0 = (a_2 - 10a_1)/29 = 0$. The solution for the recurrence relation
$$a_n = 10a_{n-1} + 29a_{n-2}, \quad n \ge 2, \quad a_0 = 0, \quad a_1 = 10$$
is
$$a_n = \frac{5}{3\sqrt{6}}\left[(5 + 3\sqrt{6})^n - (5 - 3\sqrt{6})^n\right], \quad n \ge 0.$$
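One way to gain confidence in the recurrence is to count the expressions directly for small n. The sketch below does this by exhaustive enumeration. The legality test in `is_legal` is an assumption inferred from the two cases above (an expression starts and ends with a digit, no two operation symbols are adjacent, and "/" is never followed by 0); the function names are illustrative.

```python
from itertools import product

DIGITS = "0123456789"
OPERATORS = "+*/"


def is_legal(expr):
    """Legality test implied by cases 1 and 2 (an assumption of this sketch)."""
    if expr[0] in OPERATORS or expr[-1] in OPERATORS:
        return False
    for left, right in zip(expr, expr[1:]):
        if left in OPERATORS and right in OPERATORS:
            return False                 # two operation symbols in a row
        if left == "/" and right == "0":
            return False                 # "/0" is not among the 29 two-symbol endings
    return True


def count_by_enumeration(n):
    """Count legal expressions of n symbols by testing all 13**n strings."""
    return sum(is_legal(s) for s in product(DIGITS + OPERATORS, repeat=n))


def count_by_recurrence(n):
    """a_n from a_n = 10*a_{n-1} + 29*a_{n-2}, with a_1 = 10 and a_2 = 100."""
    a_prev, a_curr = 10, 100
    if n == 1:
        return a_prev
    for _ in range(n - 2):
        a_prev, a_curr = a_curr, 10 * a_curr + 29 * a_prev
    return a_curr


for n in range(1, 5):
    assert count_by_enumeration(n) == count_by_recurrence(n)

print([count_by_recurrence(n) for n in range(1, 5)])  # [10, 100, 1290, 15800]
```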
A second method for counting palindromes arises in our next example.
EXAMPLE 10.15
In Fig. 10.6 we find the palindromes of 3, 4, 5, and 6, that is, the compositions of 3, 4, 5,
and 6 that read the same left to right as right to left. (We saw this concept earlier in Example
9.13.) Consider first the palindromes of 3 and 5. To build the palindromes of 5 from those
of 3 we do the following:
i) Add 1 to the first and last summands in a palindrome of 3. This is how we get
palindromes (1') and (2') for 5 from the respective palindromes (1) and (2) for
3. [Note: When we have a one summand palindrome n we get the one summand
palindrome n + 2. That is how we build palindrome (1') for 5 from palindrome (1)
for 3.]
ii) Append "1+" to the start and "+1" to the end of each palindrome of 3. This technique
generates the palindromes (1'') and (2'') for 5 from the respective palindromes (1)
and (2) for 3.
Palindromes of 3:
(1) 3
(2) 1 + 1 + 1
Palindromes of 5:
(1') 5
(2') 2 + 1 + 2
(1'') 1 + 3 + 1
(2'') 1 + 1 + 1 + 1 + 1
Palindromes of 4:
(1) 4
(2) 1 + 2 + 1
(3) 2 + 2
(4) 1 + 1 + 1 + 1
Palindromes of 6:
(1') 6
(2') 2 + 2 + 2
(3') 3 + 3
(4') 2 + 1 + 1 + 2
(1'') 1 + 4 + 1
(2'') 1 + 1 + 2 + 1 + 1
(3'') 1 + 2 + 2 + 1
(4'') 1 + 1 + 1 + 1 + 1 + 1
Figure 10.6
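The situation is similar for building the palindromes of 6 from those of 4.
The preceding observations lead us to the following. For $n \in \mathbf{Z}^+$, let $p_n$ count the number
of palindromes of n. Then
$$p_n = 2p_{n-2}, \quad n \ge 3, \quad p_1 = 1, \quad p_2 = 2.$$
Substituting $p_n = cr^n$, for $c, r \ne 0$, $n \ge 1$, into this recurrence relation, the resulting characteristic equation is $r^2 - 2 = 0$. The characteristic roots are $r = \pm\sqrt{2}$, so $p_n = c_1(\sqrt{2})^n + c_2(-\sqrt{2})^n$. From
$$1 = p_1 = c_1(\sqrt{2}) + c_2(-\sqrt{2})$$
$$2 = p_2 = c_1(\sqrt{2})^2 + c_2(-\sqrt{2})^2$$
we find that $c_1 = \frac{1}{2} + \frac{1}{2\sqrt{2}}$, $c_2 = \frac{1}{2} - \frac{1}{2\sqrt{2}}$, so
$$p_n = \left(\frac{1}{2} + \frac{1}{2\sqrt{2}}\right)(\sqrt{2})^n + \left(\frac{1}{2} - \frac{1}{2\sqrt{2}}\right)(-\sqrt{2})^n, \quad n \ge 1.$$
Unfortunately, this does not look like the result found in Example 9.13. After all, that answer
contained no radical terms. However, suppose we consider n even, say n = 2k. Then
$$p_n = \left(\frac{1}{2} + \frac{1}{2\sqrt{2}}\right)(\sqrt{2})^{2k} + \left(\frac{1}{2} - \frac{1}{2\sqrt{2}}\right)(-\sqrt{2})^{2k} = \left[\left(\frac{1}{2} + \frac{1}{2\sqrt{2}}\right) + \left(\frac{1}{2} - \frac{1}{2\sqrt{2}}\right)\right]2^k = 2^k = 2^{n/2}.$$
For n odd, say n = 2k - 1, $k \in \mathbf{Z}^+$, we leave it for the reader to show that $p_n = 2^{k-1} = 2^{(n-1)/2}$.
The preceding results can be expressed by $p_n = 2^{\lfloor n/2 \rfloor}$, $n \ge 1$, as we found in Example
9.13.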
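The formula $p_n = 2^{\lfloor n/2 \rfloor}$ can be confirmed by listing compositions and keeping the palindromes. The following is a brute-force sketch for small n; the function names are illustrative.

```python
def compositions(n):
    """Yield every composition of n as a tuple of positive summands."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest


def count_palindromes(n):
    """Count the compositions of n that read the same in both directions."""
    return sum(1 for c in compositions(n) if c == c[::-1])


for n in range(1, 13):
    assert count_palindromes(n) == 2 ** (n // 2)

print([count_palindromes(n) for n in range(1, 7)])  # [1, 2, 2, 4, 4, 8]
```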
The recurrence relation for the next example will be set up in two ways. In the first part
we shall see how auxiliary variables may be helpful.
EXAMPLE 10.16
Find a recurrence relation for the number of binary sequences of length n that have no
consecutive 0's.
a) For n > 1, let a, be the number of such sequences of length n. Let a) count those
that end in 0, and a!) those that end in 1. Then a, = a + a“.
We derive a recurrence relation for a,, n > 1, by computing a, = 2 and then con-
sidering each sequence x of length n — 1 (> 0) where x contains no consecutive 0’s.
If x ends in 1, then we can append a 0 or a | to it, giving us 2a of the sequences
counted by a,. If the sequence x ends in 0, then only 1 can be appended, resulting in
a, sequences counted by a,,. Since these two cases exhaust all possibilities and have
nothing in common, we have
n n—|
+ N\
The ath position The ath position
can be 0 or 1. can only be 1.
If we consider any sequence y counted in a,_2 we find that the sequence y1 is counted
in a‘. Likewise, if the sequence z! is counted in a, then z is counted in a,_>.
qd)
Consequently, a,_2 = a n_ and
1 1 0) 1
ay = a + [a +a” ] = a + Gn—1 = An—-1 + Gn-2.
Therefore the recurrence relation for this problem is ad, = @,—,| + Gn—2, where n > 3
and a, = 2, a) = 3. (We leave the details of the solution for the reader.)
b) Alternatively, if $n \ge 1$ and $a_n$ counts the number of binary sequences of length $n$ with no consecutive 0's, then $a_1 = 2$ and $a_2 = 3$, and for $n \ge 3$ we consider the binary sequences counted by $a_n$. There are two possibilities for these sequences:
(Case 1: The $n$th symbol is 1) Here we find that the preceding $n - 1$ symbols form a binary sequence with no consecutive 0's. There are $a_{n-1}$ such sequences.
(Case 2: The $n$th symbol is 0) Here each such sequence actually ends in 10 and the first $n - 2$ symbols provide a binary sequence with no consecutive 0's. In this case there are $a_{n-2}$ such sequences.
Since these two cases cover all the possibilities and have no such sequence in common, we may write
$$a_n = a_{n-1} + a_{n-2}, \quad n \ge 3, \quad a_1 = 2, \quad a_2 = 3,$$
as we found in part (a).
In both part (a) and part (b) we can use the recurrence relation and $a_1 = 2$, $a_2 = 3$ to go back and determine a value for $a_0$, namely, $a_0 = a_2 - a_1 = 3 - 2 = 1$. Then we can solve the recurrence relation
$$a_n = a_{n-1} + a_{n-2}, \quad n \ge 2, \quad a_0 = 1, \quad a_1 = 2.$$
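The recurrence in Example 10.16 can be checked against a direct enumeration for small $n$. The following is a minimal sketch; the function names `count_no_consecutive_zeros` and `a` are illustrative choices, not notation from the text.

```python
from itertools import product

def count_no_consecutive_zeros(n):
    """Brute force: count binary sequences of length n that never contain '00'."""
    return sum(1 for bits in product("01", repeat=n) if "00" not in "".join(bits))

def a(n):
    """a_n from a_n = a_{n-1} + a_{n-2}, with a_0 = 1 and a_1 = 2."""
    prev, curr = 1, 2
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, curr = curr, prev + curr
    return curr

# The recurrence and the enumeration agree for every length tried here.
for n in range(1, 11):
    assert a(n) == count_no_consecutive_zeros(n)

print([a(n) for n in range(1, 8)])  # [2, 3, 5, 8, 13, 21, 34]
```

Agreement on finitely many values is only a sanity check; the general argument in parts (a) and (b) is what establishes the relation.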
Before going any further we want to be sure that the reader understands why a general
argument is needed when we develop our recurrence relations. When we are proving a
theorem we do not draw any general conclusions from a few (or even, perhaps, many)
particular instances. The same is true here. The following example should serve to drive
this point home.
EXAMPLE 10.17
We start with $n$ identical pennies and let $a_n$ count the number of ways we can arrange these pennies (contiguous in each row) where each penny above the bottom row touches two pennies in the row below it. (In these arrangements we are not concerned with whether any given penny is heads up or heads down.) In Fig. 10.7 we have the possible arrangements for $1 \le n \le 6$. From this it follows that
$$a_1 = 1, \quad a_2 = 1, \quad a_3 = 2, \quad a_4 = 3, \quad a_5 = 5, \quad \text{and} \quad a_6 = 8.$$
Consequently, these results might suggest that, in general, $a_n = F_n$, the $n$th Fibonacci number. Unfortunately, we have been led astray, as one finds, for example, that
$$a_7 = 12 \ne 13 = F_7, \quad a_8 = 18 \ne 21 = F_8, \quad \text{and} \quad a_9 = 26 \ne 34 = F_9.$$
(The arrangements in this example were studied by F. C. Auluck in reference [2].)
[Figure 10.7: the possible arrangements of pennies for $1 \le n \le 6$]
The last two examples for case (A) show us how to extend the results for second-order
recurrence relations to those of higher order.
EXAMPLE 10.18
Solve the recurrence relation
$$2a_{n+3} = a_{n+2} + 2a_{n+1} - a_n, \quad n \ge 0, \quad a_0 = 0, \quad a_1 = 1, \quad a_2 = 2.$$
Letting $a_n = cr^n$ for $c, r \ne 0$ and $n \ge 0$, we obtain the characteristic equation $2r^3 - r^2 - 2r + 1 = 0 = (2r - 1)(r - 1)(r + 1)$. The characteristic roots are $1/2$, $1$, and $-1$, so the solution is $a_n = c_1(1)^n + c_2(-1)^n + c_3(1/2)^n = c_1 + c_2(-1)^n + c_3(1/2)^n$. [The solutions $1$, $(-1)^n$, and $(1/2)^n$ are called linearly independent because it is impossible to express any one of them as a linear combination of the other two.*] From $0 = a_0$, $1 = a_1$, and $2 = a_2$, we derive $c_1 = 5/2$, $c_2 = 1/6$, $c_3 = -8/3$. Consequently, $a_n = (5/2) + (1/6)(-1)^n + (-8/3)(1/2)^n$, $n \ge 0$.
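As a check on the constants found in Example 10.18, the closed form can be compared with terms generated directly from the recurrence. This is a small sketch using exact rational arithmetic; the helper names `closed_form` and `by_recurrence` are assumptions made for illustration.

```python
from fractions import Fraction

def closed_form(n):
    """a_n = 5/2 + (1/6)(-1)^n - (8/3)(1/2)^n, evaluated exactly."""
    return Fraction(5, 2) + Fraction(1, 6) * (-1) ** n - Fraction(8, 3) * Fraction(1, 2) ** n

def by_recurrence(count):
    """First `count` terms of 2a_{n+3} = a_{n+2} + 2a_{n+1} - a_n, a_0 = 0, a_1 = 1, a_2 = 2."""
    a = [Fraction(0), Fraction(1), Fraction(2)]
    while len(a) < count:
        a.append((a[-1] + 2 * a[-2] - a[-3]) / 2)
    return a[:count]

terms = by_recurrence(12)
assert all(closed_form(n) == terms[n] for n in range(12))
```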
EXAMPLE 10.19
For $n \ge 1$ we want to tile a $2 \times n$ chessboard using the two types of tiles shown in part (a) of Fig. 10.8. Letting $a_n$ count the number of such tilings, we find that $a_1 = 1$, since we can tile a $2 \times 1$ chessboard (of one column) in only one way: using two $1 \times 1$ square tiles. Part (b) of the figure shows us that $a_2 = 5$. Finally, for the $2 \times 3$ chessboard there are 11 possible tilings: (i) one that uses six $1 \times 1$ square tiles; (ii) eight that use three $1 \times 1$ square tiles and one of the larger tiles; and (iii) two that use two of the larger tiles. When $n \ge 4$ we consider the $n$th column of the $2 \times n$ chessboard. There are three cases to examine:
1) the $n$th column is covered by two $1 \times 1$ square tiles; this case provides $a_{n-1}$ tilings;
2) the $(n - 1)$st and $n$th columns are tiled with one $1 \times 1$ square tile and one larger tile; this case accounts for $4a_{n-2}$ tilings; and
3) the $(n - 2)$nd, $(n - 1)$st, and $n$th columns are tiled with two of the larger tiles; this results in $2a_{n-3}$ tilings.
[Figure 10.8: (a) the two types of tiles; (b) the five tilings of the $2 \times 2$ chessboard]
These three cases cover all possibilities and no two of the cases have anything in common, so
$$a_n = a_{n-1} + 4a_{n-2} + 2a_{n-3}, \quad n \ge 4, \quad a_1 = 1, \quad a_2 = 5, \quad a_3 = 11.$$
The characteristic equation $x^3 - x^2 - 4x - 2 = 0$ can be written as $(x + 1)(x^2 - 2x - 2) = 0$, so the characteristic roots are $-1$, $1 + \sqrt{3}$, and $1 - \sqrt{3}$. Consequently, $a_n = c_1(-1)^n + c_2(1 + \sqrt{3})^n + c_3(1 - \sqrt{3})^n$, $n \ge 1$. From
$$1 = a_1 = -c_1 + c_2(1 + \sqrt{3}) + c_3(1 - \sqrt{3}),$$
$$5 = a_2 = c_1 + c_2(1 + \sqrt{3})^2 + c_3(1 - \sqrt{3})^2, \quad \text{and}$$
$$11 = a_3 = -c_1 + c_2(1 + \sqrt{3})^3 + c_3(1 - \sqrt{3})^3,$$
we have $c_1 = 1$, $c_2 = 1/\sqrt{3}$, and $c_3 = -1/\sqrt{3}$. So
$$a_n = (-1)^n + (1/\sqrt{3})(1 + \sqrt{3})^n + (-1/\sqrt{3})(1 - \sqrt{3})^n, \quad n \ge 1.$$
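The closed form in Example 10.19 involves $\sqrt{3}$ yet always produces an integer. A short numerical sketch can confirm that it matches the recurrence; it uses floating-point arithmetic and rounding, which is an assumption that holds comfortably for the small values of $n$ tried here. The function names are illustrative.

```python
from math import sqrt

def tilings_by_recurrence(n):
    """a_n = a_{n-1} + 4a_{n-2} + 2a_{n-3}, with a_1 = 1, a_2 = 5, a_3 = 11."""
    a = {1: 1, 2: 5, 3: 11}
    for m in range(4, n + 1):
        a[m] = a[m - 1] + 4 * a[m - 2] + 2 * a[m - 3]
    return a[n]

def tilings_closed_form(n):
    """(-1)^n + [(1 + sqrt 3)^n - (1 - sqrt 3)^n] / sqrt 3, rounded to the nearest integer."""
    s = sqrt(3)
    return round((-1) ** n + ((1 + s) ** n - (1 - s) ** n) / s)

for n in range(1, 21):
    assert tilings_by_recurrence(n) == tilings_closed_form(n)
```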
Case (B): (Complex Roots)
Before getting into the case of complex roots, we recall DeMoivre’s Theorem:
$$(\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta, \quad n \ge 0.$$
[This is part (b) of Exercise 12 of Section 4.1.]
* Alternatively, the solutions $1$, $(-1)^n$, and $(1/2)^n$ are linearly independent, because if $k_1$, $k_2$, $k_3$ are real numbers, and $k_1(1) + k_2(-1)^n + k_3(1/2)^n = 0$ for all $n \in \mathbf{N}$, then $k_1 = k_2 = k_3 = 0$.
If $z = x + iy \in \mathbf{C}$, $z \ne 0$, we can write $z = r(\cos\theta + i\sin\theta)$, where $r = \sqrt{x^2 + y^2}$ and $(y/x) = \tan\theta$, for $x \ne 0$. If $x = 0$, then for $y > 0$,
$$z = yi = yi\sin(\pi/2) = y(\cos(\pi/2) + i\sin(\pi/2)),$$
and for $y < 0$,
$$z = yi = |y|\,i\sin(3\pi/2) = |y|(\cos(3\pi/2) + i\sin(3\pi/2)).$$
In all cases, $z^n = r^n(\cos n\theta + i\sin n\theta)$, for $n \ge 0$, by DeMoivre's Theorem.
EXAMPLE 10.20
Determine $(1 + \sqrt{3}\,i)^{10}$.
Figure 10.9 shows a geometric way to represent the complex number $1 + \sqrt{3}\,i$ as the point $(1, \sqrt{3})$ in the $xy$-plane. Here $r = \sqrt{1^2 + (\sqrt{3})^2} = 2$, and $\theta = \pi/3$.
[Figure 10.9: the point $(1, \sqrt{3})$ in the $xy$-plane]
So $1 + \sqrt{3}\,i = 2(\cos(\pi/3) + i\sin(\pi/3))$, and
$$(1 + \sqrt{3}\,i)^{10} = 2^{10}(\cos(10\pi/3) + i\sin(10\pi/3)) = 2^{10}(\cos(4\pi/3) + i\sin(4\pi/3))$$
$$= 2^{10}((-1/2) - (\sqrt{3}/2)i) = (-2^9)(1 + \sqrt{3}\,i).$$
We'll use such results in the following examples.
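The computation in Example 10.20 can be reproduced numerically. The sketch below converts to polar form, applies DeMoivre's Theorem, and compares with direct exponentiation; `de_moivre_power` is a hypothetical helper name, and the comparison uses a floating-point tolerance.

```python
import cmath
import math

def de_moivre_power(z, n):
    """Compute z^n as r^n (cos n*theta + i sin n*theta) from the polar form of z."""
    r, theta = cmath.polar(z)
    return cmath.rect(r ** n, n * theta)

z = complex(1, math.sqrt(3))
result = de_moivre_power(z, 10)

# Agrees with direct exponentiation and with (-2^9)(1 + sqrt(3) i).
assert abs(result - z ** 10) < 1e-6
assert abs(result - (-(2 ** 9)) * z) < 1e-6
```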
EXAMPLE 10.21
Solve the recurrence relation $a_n = 2(a_{n-1} - a_{n-2})$, where $n \ge 2$ and $a_0 = 1$, $a_1 = 2$.
Letting $a_n = cr^n$, for $c, r \ne 0$, we obtain the characteristic equation $r^2 - 2r + 2 = 0$, whose roots are $1 \pm i$. Consequently, the general solution has the form $c_1(1 + i)^n + c_2(1 - i)^n$, where $c_1$ and $c_2$ presently denote arbitrary complex constants. [As in case (A), there are two independent solutions: $(1 + i)^n$ and $(1 - i)^n$.]
$$1 + i = \sqrt{2}(\cos(\pi/4) + i\sin(\pi/4))$$
and
$$1 - i = \sqrt{2}(\cos(-\pi/4) + i\sin(-\pi/4)) = \sqrt{2}(\cos(\pi/4) - i\sin(\pi/4)).$$
This yields
$$a_n = c_1(1 + i)^n + c_2(1 - i)^n$$
$$= c_1[\sqrt{2}(\cos(\pi/4) + i\sin(\pi/4))]^n + c_2[\sqrt{2}(\cos(-\pi/4) + i\sin(-\pi/4))]^n$$
$$= c_1(\sqrt{2})^n(\cos(n\pi/4) + i\sin(n\pi/4)) + c_2(\sqrt{2})^n(\cos(-n\pi/4) + i\sin(-n\pi/4))$$
$$= c_1(\sqrt{2})^n(\cos(n\pi/4) + i\sin(n\pi/4)) + c_2(\sqrt{2})^n(\cos(n\pi/4) - i\sin(n\pi/4))$$
$$= (\sqrt{2})^n[k_1\cos(n\pi/4) + k_2\sin(n\pi/4)],$$
where $k_1 = c_1 + c_2$ and $k_2 = (c_1 - c_2)i$.
$$1 = a_0 = [k_1\cos 0 + k_2\sin 0] = k_1$$
$$2 = a_1 = \sqrt{2}[1 \cdot \cos(\pi/4) + k_2\sin(\pi/4)], \text{ or } 2 = 1 + k_2, \text{ and } k_2 = 1.$$
The solution for the given initial conditions is then given by
$$a_n = (\sqrt{2})^n[\cos(n\pi/4) + \sin(n\pi/4)], \quad n \ge 0.$$
[Note: This solution contains no complex numbers. A small point may bother the reader here. How did we start with $c_1$, $c_2$ complex and end up with $k_1 = c_1 + c_2$ and $k_2 = (c_1 - c_2)i$ real? This happens if $c_1$, $c_2$ are complex conjugates.]
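Although the derivation in Example 10.21 passes through complex numbers, the final formula yields integers. The following sketch compares it with the recurrence; rounding the floating-point value to the nearest integer is an assumption that is safe for the small range of $n$ used, and the function names are illustrative.

```python
from math import cos, sin, pi, sqrt

def a_closed(n):
    """(sqrt 2)^n [cos(n pi/4) + sin(n pi/4)], rounded to the nearest integer."""
    return round(sqrt(2) ** n * (cos(n * pi / 4) + sin(n * pi / 4)))

def a_by_recurrence(count):
    """First `count` terms of a_n = 2(a_{n-1} - a_{n-2}), a_0 = 1, a_1 = 2."""
    a = [1, 2]
    while len(a) < count:
        a.append(2 * (a[-1] - a[-2]))
    return a[:count]

terms = a_by_recurrence(20)
assert all(a_closed(n) == terms[n] for n in range(20))
print(terms[:9])  # [1, 2, 2, 0, -4, -8, -8, 0, 16]
```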
Let us now examine an application from linear algebra.
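EXAMPLE 10.22
For $b \in \mathbf{R}^+$, consider the $n \times n$ determinant† $D_n$ given by
$$D_n = \begin{vmatrix}
b & b & 0 & 0 & \cdots & 0 & 0 & 0 \\
b & b & b & 0 & \cdots & 0 & 0 & 0 \\
0 & b & b & b & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & b & b & 0 \\
0 & 0 & 0 & 0 & \cdots & b & b & b \\
0 & 0 & 0 & 0 & \cdots & 0 & b & b
\end{vmatrix}.$$
Find the value of $D_n$ as a function of $n$.
Let $a_n$, $n \ge 1$, denote the value of the $n \times n$ determinant $D_n$. Then
$$a_1 = |b| = b \quad \text{and} \quad a_2 = \begin{vmatrix} b & b \\ b & b \end{vmatrix} = 0 \quad \left(\text{and } a_3 = \begin{vmatrix} b & b & 0 \\ b & b & b \\ 0 & b & b \end{vmatrix} = -b^3\right).$$
† The expansion of determinants is discussed in Appendix 2.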
Expanding $D_n$ by its first row, we have
$$D_n = b\begin{vmatrix}
b & b & 0 & \cdots & 0 & 0 \\
b & b & b & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & b & b \\
0 & 0 & 0 & \cdots & b & b
\end{vmatrix}
- b\begin{vmatrix}
b & b & 0 & 0 & \cdots & 0 & 0 \\
0 & b & b & 0 & \cdots & 0 & 0 \\
0 & b & b & b & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & b & b \\
0 & 0 & 0 & 0 & \cdots & b & b
\end{vmatrix}.$$
(The first determinant is $D_{n-1}$.)
When we expand the second determinant by its first column, we find that $D_n = bD_{n-1} - (b)(b)D_{n-2} = bD_{n-1} - b^2D_{n-2}$. This translates into the relation $a_n = ba_{n-1} - b^2a_{n-2}$, for $n \ge 3$, $a_1 = b$, $a_2 = 0$.
If we let $a_n = cr^n$ for $c, r \ne 0$ and $n \ge 1$, the characteristic equation produces the roots $b[(1/2) \pm i\sqrt{3}/2]$.
Hence
$$a_n = c_1[b((1/2) + i\sqrt{3}/2)]^n + c_2[b((1/2) - i\sqrt{3}/2)]^n$$
$$= b^n[c_1(\cos(\pi/3) + i\sin(\pi/3))^n + c_2(\cos(\pi/3) - i\sin(\pi/3))^n]$$
$$= b^n[k_1\cos(n\pi/3) + k_2\sin(n\pi/3)].$$
$b = a_1 = b[k_1\cos(\pi/3) + k_2\sin(\pi/3)]$, so $1 = k_1(1/2) + k_2(\sqrt{3}/2)$, or $k_1 + \sqrt{3}k_2 = 2$.
$0 = a_2 = b^2[k_1\cos(2\pi/3) + k_2\sin(2\pi/3)]$, so $0 = (k_1)(-1/2) + k_2(\sqrt{3}/2)$, or $k_1 = \sqrt{3}k_2$.
Hence $k_1 = 1$, $k_2 = 1/\sqrt{3}$ and the value of $D_n$ is
$$b^n[\cos(n\pi/3) + (1/\sqrt{3})\sin(n\pi/3)].$$
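The formula for $D_n$ in Example 10.22 can be tested by evaluating the determinant directly for small $n$. The sketch below builds the matrix with $b$ on the main diagonal and on the two adjacent diagonals, computes the determinant by exact Gaussian elimination with row swaps, and compares with the closed form. The names `band_matrix`, `determinant`, and `closed_form`, and the choice $b = 2$, are assumptions made for illustration.

```python
from fractions import Fraction
from math import cos, sin, pi, sqrt

def band_matrix(n, b):
    """n x n matrix with b on the main, super-, and subdiagonals, 0 elsewhere."""
    return [[Fraction(b) if abs(i - j) <= 1 else Fraction(0) for j in range(n)]
            for i in range(n)]

def determinant(matrix):
    """Exact determinant by Gaussian elimination; a row swap flips the sign."""
    m = [row[:] for row in matrix]
    n, det = len(m), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def closed_form(n, b):
    """b^n [cos(n pi/3) + (1/sqrt 3) sin(n pi/3)]."""
    return b ** n * (cos(n * pi / 3) + sin(n * pi / 3) / sqrt(3))

for n in range(1, 10):
    assert abs(determinant(band_matrix(n, 2)) - closed_form(n, 2)) < 1e-6
```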
Case (C): (Repeated Real Roots)
EXAMPLE 10.23
Solve the recurrence relation $a_{n+2} = 4a_{n+1} - 4a_n$, where $n \ge 0$ and $a_0 = 1$, $a_1 = 3$.
As in the other two cases, we let $a_n = cr^n$, where $c, r \ne 0$ and $n \ge 0$. Then the characteristic equation is $r^2 - 4r + 4 = 0$ and the characteristic roots are both $r = 2$. (So $r = 2$ is called "a root of multiplicity 2.") Unfortunately, we now lack two independent solutions: $2^n$ and $2^n$ are definitely multiples of each other. We need one more independent solution. Let us try $g(n)2^n$ where $g(n)$ is not a constant. Substituting this into the given relation yields
$$g(n + 2)2^{n+2} = 4g(n + 1)2^{n+1} - 4g(n)2^n$$
or
$$g(n + 2) = 2g(n + 1) - g(n). \qquad (1)$$
One finds that $g(n) = n$ satisfies Eq. (1).* So $n2^n$ is a second independent solution. (It is independent because it is impossible to have $n2^n = k2^n$ for all $n \ge 0$ if $k$ is a constant.)
* Actually, the general solution is $g(n) = an + b$, for arbitrary constants $a$, $b$, with $a \ne 0$. Here we chose $a = 1$ and $b = 0$ to make $g(n)$ as simple as possible.
The general solution is of the form $a_n = c_1(2^n) + c_2n(2^n)$. With $a_0 = 1$, $a_1 = 3$ we find $a_n = 2^n + (1/2)n(2^n) = 2^n + n(2^{n-1})$, $n \ge 0$.
In general, if $C_0a_n + C_1a_{n-1} + C_2a_{n-2} + \cdots + C_ka_{n-k} = 0$, with $C_0\ (\ne 0)$, $C_1$, $C_2$, ..., $C_k\ (\ne 0)$ real constants, and $r$ a characteristic root of multiplicity $m$, where $2 \le m \le k$, then the part of the general solution that involves the root $r$ has the form
$$A_0r^n + A_1nr^n + A_2n^2r^n + \cdots + A_{m-1}n^{m-1}r^n = (A_0 + A_1n + A_2n^2 + \cdots + A_{m-1}n^{m-1})r^n,$$
where $A_0, A_1, A_2, \ldots, A_{m-1}$ are arbitrary constants.
Our last example involves a little probability.
EXAMPLE 10.24
If a first case of measles is recorded in a certain school system, let $p_n$ denote the probability that at least one case is reported during the $n$th week after the first recorded case. School records provide evidence that $p_n = p_{n-1} - (0.25)p_{n-2}$, where $n \ge 2$. Since $p_0 = 0$ and $p_1 = 1$, if the first case (of a new outbreak) is recorded on Monday, March 3, 2003, when did the probability for the occurrence of a new case decrease to less than 0.01 for the first time?
With $p_n = cr^n$ for $c, r \ne 0$, the characteristic equation for the recurrence relation is $r^2 - r + (1/4) = 0 = (r - (1/2))^2$. The general solution has the form $p_n = (c_1 + c_2n)(1/2)^n$, $n \ge 0$. For $p_0 = 0$, $p_1 = 1$, we get $c_1 = 0$, $c_2 = 2$, so $p_n = n2^{-n+1}$, $n \ge 0$.
The first integer $n$ for which $p_n < 0.01$ is 12. Hence, it was not until the week of May 19, 2003, that the probability of another new case occurring was less than 0.01.
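The threshold week in Example 10.24 can be found by iterating the recurrence exactly. In this sketch the function name `first_week_below` is illustrative, and the calendar step assumes that week $n$ begins $n - 1$ weeks after Monday, March 3, 2003, which is the convention that reproduces the date quoted in the example.

```python
from datetime import date, timedelta
from fractions import Fraction

def first_week_below(threshold=Fraction(1, 100)):
    """Iterate p_n = p_{n-1} - p_{n-2}/4 from p_0 = 0, p_1 = 1 until p_n < threshold."""
    prev, curr, n = Fraction(0), Fraction(1), 1
    while curr >= threshold:
        prev, curr = curr, curr - prev / 4
        n += 1
    return n, curr

n, p_n = first_week_below()
assert p_n == Fraction(n, 2 ** (n - 1))      # agrees with p_n = n * 2^(-n+1)

# Assumed convention: week n starts n - 1 weeks after the first recorded case.
week_start = date(2003, 3, 3) + timedelta(weeks=n - 1)
print(n, float(p_n), week_start)
```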
EXERCISES 10.2
1. Solve the following recurrence relations. (No final answer should involve complex numbers.)
   a) $a_n = 5a_{n-1} + 6a_{n-2}$, $n \ge 2$, $a_0 = 1$, $a_1 = 3$
   b) $2a_{n+2} - 11a_{n+1} + 5a_n = 0$, $n \ge 0$, $a_0 = 2$, $a_1 = -8$
   c) $a_{n+2} + a_n = 0$, $n \ge 0$, $a_0 = 0$, $a_1 = 3$
   d) $a_n - 6a_{n-1} + 9a_{n-2} = 0$, $n \ge 2$, $a_0 = 5$, $a_1 = 12$
   e) $a_n + 2a_{n-1} + 2a_{n-2} = 0$, $n \ge 2$, $a_0 = 1$, $a_1 = 3$
2. a) Verify the final solutions in Examples 10.14 and 10.23.
   b) Solve the recurrence relation in Example 10.16.
3. If $a_0 = 0$, $a_1 = 1$, $a_2 = 4$, and $a_3 = 37$ satisfy the recurrence relation $a_{n+2} + ba_{n+1} + ca_n = 0$, where $n \ge 0$ and $b$, $c$ are constants, determine $b$, $c$ and solve for $a_n$.
4. Find and solve a recurrence relation for the number of ways to park motorcycles and compact cars in a row of $n$ spaces if each cycle requires one space and each compact needs two. (All cycles are identical in appearance, as are the cars, and we want to use up all the $n$ spaces.)
5. Answer the question posed in Exercise 4 if (a) the motorcycles come in two distinct models; (b) the compact cars come in three different colors; and (c) the motorcycles come in two distinct models and the compact cars come in three different colors.
6. Answer the questions posed in Exercise 5 if empty spaces are allowed.
7. In Exercise 12 of Section 4.2 we learned that $F_0 + F_1 + F_2 + \cdots + F_n = \sum_{i=0}^{n} F_i = F_{n+2} - 1$. This is one of many such properties of the Fibonacci numbers that were discovered by the French mathematician François Lucas (1842-1891). Although we established the result by the Principle of Mathematical Induction, we see that it is easy to develop this formula by adding the system of $n + 1$ equations
   $$F_0 = F_2 - F_1$$
   $$F_1 = F_3 - F_2$$
   $$\vdots$$
   $$F_{n-1} = F_{n+1} - F_n$$
   $$F_n = F_{n+2} - F_{n+1}.$$
   Develop formulas for each of the following sums, and then check the general result by the Principle of Mathematical Induction.
   a) $F_1 + F_3 + F_5 + \cdots + F_{2n-1}$, where $n \in \mathbf{Z}^+$
   b) $F_0 + F_2 + F_4 + \cdots + F_{2n}$, where $n \in \mathbf{Z}^+$
8. a) Prove that
   $$\lim_{n \to \infty} \frac{F_{n+1}}{F_n} = \frac{1 + \sqrt{5}}{2}.$$
   (This limit has come to be known as the golden ratio and is often designated by $\alpha$, as we mentioned in Example 10.10.)
   b) Consider a regular pentagon $ABCDE$ inscribed in a circle, as shown in Fig. 10.10.
      i) Use the law of sines and the double angle formula for the sine to show that $AC/AX = 2\cos 36°$.
      ii) As $\cos 18° = \sin 72° = 4\sin 18°\cos 18°(1 - 2\sin^2 18°)$ (Why?), show that $\sin 18°$ is a root of the polynomial equation $8x^3 - 4x + 1 = 0$, and deduce that $\sin 18° = (\sqrt{5} - 1)/4$.
   c) Verify that $AC/AX = (1 + \sqrt{5})/2$.
   [Figure 10.10: regular pentagon $ABCDE$ inscribed in a circle]
9. For $n \ge 0$, let $a_n$ count the number of ways a sequence of 1's and 2's will sum to $n$. For example, $a_3 = 3$ because (1) 1, 1, 1; (2) 1, 2; and (3) 2, 1 sum to 3. Find and solve a recurrence relation for $a_n$.
10. For $\Sigma = \{0, 1\}$, let $A \subseteq \Sigma^*$, where $A = \{00, 1\}$. For $n \ge 1$, let $a_n$ count the number of strings in $A^*$ of length $n$. Find and solve a recurrence relation for $a_n$. (The reader may wish to refer to Exercise 25 for Section 6.1.)
11. a) For $n \ge 1$, let $a_n$ count the number of binary strings of length $n$, where there are no consecutive 1's. Find and solve a recurrence relation for $a_n$.
    b) For $n \ge 1$, let $b_n$ count the number of binary strings of length $n$, where there are no consecutive 1's and the first and last bit of the string are not both 1. Find and solve a recurrence relation for $b_n$.
12. Suppose that poker chips come in four colors: red, white, green, and blue. Find and solve a recurrence relation for the number of ways to stack $n$ of these poker chips so that there are no consecutive blue chips.
13. An alphabet $\Sigma$ consists of the four numeric characters 1, 2, 3, 4, and the seven alphabetic characters a, b, c, d, e, f, g. Find and solve a recurrence relation for the number of words of length $n$ (in $\Sigma^*$), where there are no consecutive (identical or distinct) alphabetic characters.
14. An alphabet $\Sigma$ consists of seven numeric characters and $k$ alphabetic characters. For $n \ge 0$, $a_n$ counts the number of strings (in $\Sigma^*$) of length $n$ that contain no consecutive (identical or distinct) alphabetic characters. If $a_{n+2} = 7a_{n+1} + 63a_n$, $n \ge 0$, what is the value of $k$?
15. Solve the recurrence relation $a_{n+2} = a_{n+1}a_n$, $n \ge 0$, $a_0 = 1$, $a_1 = 2$.
16. For $n \ge 1$, let $a_n$ be the number of ways to write $n$ as an ordered sum of positive integers, where each summand is at least 2. (For example, $a_5 = 3$ because here we may represent 5 by 5, by 2 + 3, and by 3 + 2.) Find and solve a recurrence relation for $a_n$.
17. a) For a fixed nonnegative integer $n$, how many compositions of $n + 3$ have no 1 as a summand?
    b) For the compositions in part (a), how many start with (i) 2; (ii) 3; (iii) $k$, where $2 \le k \le n + 1$?
    c) How many of the compositions in part (a) start with $n + 2$ or $n + 3$?
    d) How are the results in parts (a)-(c) related to the formula derived at the start of Exercise 7?
18. Determine the points of intersection of the parabola $y = x^2 - 1$ and the line $y = x$.
19. Find the points of intersection of the hyperbola $y = 1 + \frac{1}{x}$ and the line $y = x$.
20. a) For $\alpha = (1 + \sqrt{5})/2$, show that $\alpha^2 = \alpha + 1$.
    b) If $n \in \mathbf{Z}^+$, prove that $\alpha^n = \alpha F_n + F_{n-1}$.
21. Let $F_n$ denote the $n$th Fibonacci number, for $n \ge 0$, and let $\alpha = (1 + \sqrt{5})/2$. For $n \ge 3$, prove that (a) $F_n > \alpha^{n-2}$ and (b) $F_n < \alpha^{n-1}$.
22. a) For $n \in \mathbf{Z}^+$, let $a_n$ count the number of palindromes of $2n$. Then $a_{n+1} = 2a_n$, $n \ge 1$, $a_1 = 2$. Solve this first-order recurrence relation for $a_n$.
    b) For $n \in \mathbf{Z}^+$, let $b_n$ count the number of palindromes of $2n - 1$. Set up and solve a first-order recurrence relation for $b_n$.
    (You may want to compare your solutions here with those given in Examples 9.13 and 10.15.)
23. Consider ternary strings, that is, strings where 0, 1, 2 are the only symbols used. For $n \ge 1$, let $a_n$ count the number of ternary strings of length $n$ where there are no consecutive 1's and no consecutive 2's. Find and solve a recurrence relation for $a_n$.
24. For $n \ge 1$, let $a_n$ count the number of ways to tile a $2 \times n$ chessboard using horizontal ($1 \times 2$) dominoes [which can also be used as vertical ($2 \times 1$) dominoes] and square ($2 \times 2$) tiles. Find and solve a recurrence relation for $a_n$.
25. In how many ways can one tile a $2 \times 10$ chessboard using dominoes and square tiles (as in Exercise 24) if the dominoes come in four colors and the square tiles come in five colors?
26. Let $\Sigma = \{0, 1\}$ and $A = \{0, 01, 11\} \subseteq \Sigma^*$. For $n \ge 1$, let $a_n$ count the number of strings in $A^*$ of length $n$. Find and solve a recurrence relation for $a_n$.
27. Let $\Sigma = \{0, 1\}$ and $A = \{0, 01, 011, 111\} \subseteq \Sigma^*$. For $n \ge 1$, let $a_n$ count the number of strings in $A^*$ of length $n$. Find and solve a recurrence relation for $a_n$.
28. Let $\Sigma = \{0, 1\}$ and $A = \{0, 01, 011, 0111, 1111\} \subseteq \Sigma^*$. For $n \ge 1$, let $a_n$ count the number of strings in $A^*$ of length $n$. Find and solve a recurrence relation for $a_n$.
29. A particle moves horizontally to the right. For $n \in \mathbf{Z}^+$, the distance the particle travels in the $(n + 1)$st second is equal to twice the distance it travels during the $n$th second. If $x_n$, $n \ge 0$, denotes the position of the particle at the start of the $(n + 1)$st second, find and solve a recurrence relation for $x_n$, where $x_0 = 1$ and $x_1 = 5$.
30. For $n \ge 1$, let $D_n$ be the following $n \times n$ determinant.
    $$D_n = \begin{vmatrix}
    2 & 1 & 0 & 0 & 0 & \cdots & 0 & 0 & 0 & 0 \\
    1 & 2 & 1 & 0 & 0 & \cdots & 0 & 0 & 0 & 0 \\
    0 & 1 & 2 & 1 & 0 & \cdots & 0 & 0 & 0 & 0 \\
    \vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \vdots \\
    0 & 0 & 0 & 0 & 0 & \cdots & 1 & 2 & 1 & 0 \\
    0 & 0 & 0 & 0 & 0 & \cdots & 0 & 1 & 2 & 1 \\
    0 & 0 & 0 & 0 & 0 & \cdots & 0 & 0 & 1 & 2
    \end{vmatrix}$$
    Find and solve a recurrence relation for the value of $D_n$.
31. Solve the recurrence relation $a_{n+2}^2 - 5a_{n+1}^2 + 4a_n^2 = 0$, where $n \ge 0$ and $a_0 = 4$, $a_1 = 13$.
32. Determine the constants $b$ and $c$ if $a_n = c_1 + c_2(7^n)$, $n \ge 0$, is the general solution of the relation $a_{n+2} + ba_{n+1} + ca_n = 0$, $n \ge 0$.
33. Prove that any two consecutive Fibonacci numbers are relatively prime.
34. Write a computer program (or develop an algorithm) to determine whether a given nonnegative integer is a Fibonacci number.
10.3 The Nonhomogeneous Recurrence Relation
We now turn to the recurrence relations
$$a_n + C_1a_{n-1} = f(n), \quad n \ge 1, \qquad (1)$$
$$a_n + C_1a_{n-1} + C_2a_{n-2} = f(n), \quad n \ge 2, \qquad (2)$$
where $C_1$ and $C_2$ are constants, $C_1 \ne 0$ in Eq. (1), $C_2 \ne 0$, and $f(n)$ is not identically 0. Although there is no general method for solving all nonhomogeneous relations, for certain functions $f(n)$ we shall find a successful technique.
We start with the special case for Eq. (1), when $C_1 = -1$. For the nonhomogeneous relation $a_n - a_{n-1} = f(n)$, we have
$$a_1 = a_0 + f(1)$$
$$a_2 = a_1 + f(2) = a_0 + f(1) + f(2)$$
$$a_3 = a_2 + f(3) = a_0 + f(1) + f(2) + f(3)$$
$$\vdots$$
$$a_n = a_{n-1} + f(n) = a_0 + f(1) + \cdots + f(n) = a_0 + \sum_{i=1}^{n} f(i).$$
We can solve this type of relation in terms of $n$, if we can find a suitable summation formula for $\sum_{i=1}^{n} f(i)$.
EXAMPLE 10.25
Solve the recurrence relation $a_n - a_{n-1} = 3n^2$, where $n \ge 1$ and $a_0 = 7$.
Here $f(n) = 3n^2$, so the unique solution is
$$a_n = a_0 + \sum_{i=1}^{n} f(i) = 7 + 3\sum_{i=1}^{n} i^2 = 7 + \frac{1}{2}(n)(n + 1)(2n + 1).$$
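The telescoping argument behind Example 10.25 is easy to confirm by direct summation. A minimal sketch follows; the function names `a_by_summation` and `a_closed` are illustrative.

```python
def a_by_summation(n):
    """a_n = a_0 + sum of f(i) for i = 1..n, with a_0 = 7 and f(i) = 3 i^2."""
    a = 7
    for i in range(1, n + 1):
        a += 3 * i * i
    return a

def a_closed(n):
    """7 + (1/2) n (n + 1)(2n + 1); the product n(n + 1)(2n + 1) is always even."""
    return 7 + n * (n + 1) * (2 * n + 1) // 2

assert all(a_by_summation(n) == a_closed(n) for n in range(50))
```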
When a formula for the summation is not known, the following procedure will handle Eq. (1) for certain functions $f(n)$, regardless of the value of $C_1$ $(\ne 0)$. It also works for the second-order nonhomogeneous relation in Eq. (2), again for certain functions $f(n)$. Known as the method of undetermined coefficients, it relies on the associated homogeneous relation obtained when $f(n)$ is replaced by 0.
For either of Eq. (1) or Eq. (2), we let $a_n^{(h)}$ denote the general solution of the associated homogeneous relation, and we let $a_n^{(p)}$ be a solution of the given nonhomogeneous relation. The term $a_n^{(p)}$ is called a particular solution. Then $a_n = a_n^{(h)} + a_n^{(p)}$ is the general solution of the given relation. To determine $a_n^{(p)}$ we use the form of $f(n)$ to suggest a form for $a_n^{(p)}$.
EXAMPLE 10.26
Solve the recurrence relation $a_n - 3a_{n-1} = 5(7^n)$, where $n \ge 1$ and $a_0 = 2$.
The solution of the associated homogeneous relation is $a_n^{(h)} = c(3^n)$. Since $f(n) = 5(7^n)$, we seek a particular solution $a_n^{(p)}$ of the form $A(7^n)$. As $a_n^{(p)}$ is to be a solution of the given nonhomogeneous relation, we place $a_n^{(p)} = A(7^n)$ into the given relation and find that $A(7^n) - 3A(7^{n-1}) = 5(7^n)$, $n \ge 1$. Dividing by $7^{n-1}$, we find that $7A - 3A = 5(7)$, so $A = 35/4$, and $a_n^{(p)} = (35/4)7^n = (5/4)7^{n+1}$, $n \ge 0$. The general solution is $a_n = c(3^n) + (5/4)7^{n+1}$. With $2 = a_0 = c + (5/4)(7)$, it follows that $c = -27/4$ and $a_n = (5/4)(7^{n+1}) - (1/4)(3^{n+3})$, $n \ge 0$.
EXAMPLE 10.27
Solve the recurrence relation $a_n - 3a_{n-1} = 5(3^n)$, where $n \ge 1$ and $a_0 = 2$.
As in Example 10.26, $a_n^{(h)} = c(3^n)$, but here $a_n^{(h)}$ and $f(n)$ are not linearly independent. As a result we consider a particular solution $a_n^{(p)}$ of the form $Bn(3^n)$. (What happens if we substitute $a_n^{(p)} = B(3^n)$ into the given relation?)
Substituting $a_n^{(p)} = Bn3^n$ into the given relation yields
$$Bn(3^n) - 3B(n - 1)(3^{n-1}) = 5(3^n), \quad \text{or} \quad Bn - B(n - 1) = 5, \quad \text{so} \quad B = 5.$$
Hence $a_n = a_n^{(h)} + a_n^{(p)} = (c + 5n)3^n$, $n \ge 0$. With $a_0 = 2$, the unique solution is $a_n = (2 + 5n)(3^n)$, $n \ge 0$.
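The solutions obtained in Examples 10.26 and 10.27 can be checked by iterating each relation from $a_0 = 2$. This is a brief sketch; `iterate` is a hypothetical helper, and both relations are rewritten in the form $a_n = 3a_{n-1} + f(n)$.

```python
def iterate(a0, step, count):
    """Return [a_0, ..., a_{count-1}] where a_n = step(a_{n-1}, n)."""
    terms = [a0]
    for n in range(1, count):
        terms.append(step(terms[-1], n))
    return terms

# Example 10.26: a_n = 3a_{n-1} + 5(7^n);  Example 10.27: a_n = 3a_{n-1} + 5(3^n).
ex_26 = iterate(2, lambda prev, n: 3 * prev + 5 * 7 ** n, 10)
ex_27 = iterate(2, lambda prev, n: 3 * prev + 5 * 3 ** n, 10)

# (5/4)7^(n+1) - (1/4)3^(n+3) is an integer, so integer division is exact here.
assert ex_26 == [(5 * 7 ** (n + 1) - 3 ** (n + 3)) // 4 for n in range(10)]
assert ex_27 == [(2 + 5 * n) * 3 ** n for n in range(10)]
```

The only difference between the two cases is whether the forcing term $r^n$ already solves the homogeneous relation, which is what changes the trial form from $Ar^n$ to $Bnr^n$.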
From the two preceding examples we generalize as follows.
Consider the nonhomogeneous first-order relation
$$a_n + C_1a_{n-1} = kr^n,$$
where $k$ is a constant and $n \in \mathbf{Z}^+$. If $r^n$ is not a solution of the associated homogeneous relation
$$a_n + C_1a_{n-1} = 0,$$
then $a_n^{(p)} = Ar^n$, where $A$ is a constant. When $r^n$ is a solution of the associated homogeneous relation, then $a_n^{(p)} = Bnr^n$, for $B$ a constant.
Now consider the case of the nonhomogeneous second-order relation
$$a_n + C_1a_{n-1} + C_2a_{n-2} = kr^n,$$
where $k$ is a constant. Here we find that
a) $a_n^{(p)} = Ar^n$, for $A$ a constant, if $r^n$ is not a solution of the associated homogeneous relation;
b) $a_n^{(p)} = Bnr^n$, where $B$ is a constant, if $a_n^{(h)} = c_1r^n + c_2r_1^n$, where $r_1 \ne r$; and
c) $a_n^{(p)} = Cn^2r^n$, for $C$ a constant, when $a_n^{(h)} = (c_1 + c_2n)r^n$.
EXAMPLE 10.28
The Towers of Hanoi. Consider $n$ circular disks (having different diameters) with holes in their centers. These disks can be stacked on any of the pegs shown in Fig. 10.11. In the figure, $n = 5$ and the disks are stacked on peg 1 with no disk resting upon a smaller one. The objective is to transfer the disks one at a time so that we end up with the original stack on peg 3. Each of pegs 1, 2, and 3 may be used as a temporary location for any disk(s), but at no time are we allowed to have a larger disk on top of a smaller one on any peg. What is the minimum number of moves needed to do this for $n$ disks?
Figure 10.11
For $n \ge 0$, let $a_n$ count the minimum number of moves it takes to transfer $n$ disks from peg 1 to peg 3 in the manner described. Then, for $n + 1$ disks we can do the following:
a) Transfer the top $n$ disks from peg 1 to peg 2 according to the directions that are given. This takes at least $a_n$ moves.
b) Transfer the largest disk from peg 1 to peg 3. This takes one move.
c) Finally, transfer the $n$ disks on peg 2 onto the largest disk, now on peg 3, once again following the specified directions. This also requires at least $a_n$ moves.
Consequently, at this point we know that $a_{n+1}$ is no more than $2a_n + 1$; that is, $a_{n+1} \le 2a_n + 1$. But could there be a method where we actually have $a_{n+1} < 2a_n + 1$? Alas, no! For at some point the largest disk (the one at the bottom of the original stack on peg 1) must be moved to peg 3. This move requires that peg 3 has no disks on it. So this largest disk may only be moved to peg 3 after the $n$ smaller disks have moved to peg 2 [where they are stacked in increasing size from the smallest (on the top) to the largest (on the bottom)]. Getting these $n$ smaller disks moved, accordingly, requires at least $a_n$ moves. The largest
disk must be moved at least once to get it to peg 3. Then, to get the $n$ smaller disks on top of the largest disk (all on peg 3), according to the requirements, requires at least $a_n$ more steps. So $a_{n+1} \ge a_n + 1 + a_n = 2a_n + 1$.
With $2a_n + 1 \le a_{n+1} \le 2a_n + 1$, we now obtain the relation $a_{n+1} = 2a_n + 1$, where $n \ge 0$ and $a_0 = 0$.
For $a_{n+1} - 2a_n = 1$, we know that $a_n^{(h)} = c(2^n)$. Since $f(n) = 1 = (1)^n$ is not a solution of $a_{n+1} - 2a_n = 0$, we set $a_n^{(p)} = A(1)^n = A$ and find from the given relation that $A = 2A + 1$, so $A = -1$ and $a_n = c(2^n) - 1$. From $a_0 = 0 = c - 1$ it then follows that $c = 1$, so $a_n = 2^n - 1$, $n \ge 0$.
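Both halves of the Towers of Hanoi argument can be illustrated in code. In the sketch below, `hanoi_moves` follows steps (a) through (c) and so realizes the upper bound, while `min_moves_bfs` searches every legal configuration breadth-first to find the true minimum for small $n$. Both function names and the state encoding (a tuple of three pegs, each listing its disks from bottom to top) are assumptions made for this illustration.

```python
from collections import deque

def hanoi_moves(n, source=1, target=3, spare=2):
    """Moves produced by steps (a)-(c): n-1 disks aside, largest disk, n-1 disks back."""
    if n == 0:
        return []
    return (hanoi_moves(n - 1, source, spare, target)
            + [(source, target)]
            + hanoi_moves(n - 1, spare, target, source))

def min_moves_bfs(n):
    """Fewest legal moves from 'all disks on peg 1' to 'all disks on peg 3'."""
    start = (tuple(range(n, 0, -1)), (), ())
    goal = ((), (), tuple(range(n, 0, -1)))
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        state, dist = queue.popleft()
        if state == goal:
            return dist
        for i in range(3):
            if not state[i]:
                continue
            disk = state[i][-1]                      # top disk of peg i
            for j in range(3):
                if i != j and (not state[j] or state[j][-1] > disk):
                    pegs = list(state)
                    pegs[i] = state[i][:-1]
                    pegs[j] = state[j] + (disk,)
                    nxt = tuple(pegs)
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append((nxt, dist + 1))

for n in range(7):
    assert len(hanoi_moves(n)) == 2 ** n - 1 == min_moves_bfs(n)
```

The search visits at most $3^n$ configurations, so it is practical only for small $n$; it serves as a check on the formula, not as a replacement for the proof.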
The next example arises from the mathematics of finance.
EXAMPLE 10.29
Pauline takes out a loan of $S$ dollars that is to be paid back in $T$ periods of time. If $r$ is the interest rate per period for the loan, what (constant) payment $P$ must she make at the end of each period?
We let $a_n$ denote the amount still owed on the loan at the end of the $n$th period (following the $n$th payment). Then at the end of the $(n + 1)$st period, the amount Pauline still owes on her loan is $a_n$ (the amount she owed at the end of the $n$th period) $+\ ra_n$ (the interest that accrued during the $(n + 1)$st period) $-\ P$ (the payment she made at the end of the $(n + 1)$st period). This gives us the recurrence relation
$$a_{n+1} = a_n + ra_n - P, \quad 0 \le n \le T - 1, \quad a_0 = S, \quad a_T = 0.$$
For this relation $a_n^{(h)} = c(1 + r)^n$, while $a_n^{(p)} = A$ since no constant is a solution of the associated homogeneous relation. With $a_n^{(p)} = A$ we find $A - (1 + r)A = -P$, so $A = P/r$. From $a_0 = S$, we obtain $a_n = (S - (P/r))(1 + r)^n + (P/r)$, $0 \le n \le T$.
Since $0 = a_T = (S - (P/r))(1 + r)^T + (P/r)$, it follows that
$$(P/r) = ((P/r) - S)(1 + r)^T \quad \text{and} \quad P = (Sr)[1 - (1 + r)^{-T}]^{-1}.$$
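The payment formula from Example 10.29 can be exercised by simulating the loan period by period. This is a sketch under the assumption $r > 0$; the function names and the sample figures (a loan of 10,000 at 1% per period over 36 periods) are hypothetical and chosen only for illustration.

```python
def payment(S, r, T):
    """P = S r / (1 - (1 + r)^(-T)); assumes an interest rate r > 0 per period."""
    return S * r / (1 - (1 + r) ** -T)

def balance_after(S, r, T):
    """Apply a_{n+1} = a_n + r a_n - P for T periods, starting from a_0 = S."""
    P = payment(S, r, T)
    balance = S
    for _ in range(T):
        balance = balance + r * balance - P
    return balance

# Hypothetical loan: with the computed payment, the balance reaches 0 at n = T.
assert abs(balance_after(10_000, 0.01, 36)) < 1e-6
print(round(payment(10_000, 0.01, 36), 2))
```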
We now consider a problem in the analysis of algorithms.
EXAMPLE 10.30
For $n \ge 1$, let $S$ be a set containing $2^n$ real numbers.
The following procedure is used to determine the maximum and minimum elements of $S$. We wish to determine the number of comparisons made between pairs of elements in $S$ during the execution of this procedure.
If $a_n$ denotes the number of needed comparisons, then $a_1 = 1$. When $n = 2$, $|S| = 2^2 = 4$, so $S = \{x_1, x_2, y_1, y_2\} = S_1 \cup S_2$ where $S_1 = \{x_1, x_2\}$, $S_2 = \{y_1, y_2\}$. Since $a_1 = 1$, it takes one comparison to determine the maximum and minimum elements in each of $S_1$, $S_2$. Comparing the minimum elements of $S_1$ and $S_2$ and then their maximum elements, we learn the maximum and minimum elements in $S$ and find that $a_2 = 4 = 2a_1 + 2$. In general, if $|S| = 2^{n+1}$, we write $S = S_1 \cup S_2$ where $|S_1| = |S_2| = 2^n$. To determine the maximum and minimum elements in each of $S_1$ and $S_2$ requires $a_n$ comparisons. Comparing the maximum (minimum) elements of $S_1$ and $S_2$ requires one more comparison; consequently, $a_{n+1} = 2a_n + 2$, $n \ge 1$.
Here $a_n^{(h)} = c(2^n)$ and $a_n^{(p)} = A$, a constant. Substituting $a_n^{(p)}$ into the relation, we find that $A = 2A + 2$, or $A = -2$. So $a_n = c2^n - 2$, and with $a_1 = 1 = 2c - 2$, we obtain $c = 3/2$. Therefore $a_n = (3/2)(2^n) - 2$.
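The procedure analyzed in Example 10.30 can be written as a short recursive function that also tallies its comparisons. This is a sketch of that divide-and-conquer scheme, assuming the input list has length $2^n$ with $n \ge 1$; the name `min_max` and the returned triple are illustrative choices.

```python
import random

def min_max(values):
    """Return (minimum, maximum, comparisons) for a list of length 2^n, n >= 1."""
    if len(values) == 2:
        x, y = values
        return (x, y, 1) if x < y else (y, x, 1)      # one comparison
    half = len(values) // 2
    lo1, hi1, c1 = min_max(values[:half])
    lo2, hi2, c2 = min_max(values[half:])
    lo = lo1 if lo1 < lo2 else lo2                    # compare the two minima
    hi = hi1 if hi1 > hi2 else hi2                    # compare the two maxima
    return lo, hi, c1 + c2 + 2

for n in range(1, 11):
    data = random.sample(range(10 ** 6), 2 ** n)
    lo, hi, comparisons = min_max(data)
    assert (lo, hi) == (min(data), max(data))
    assert comparisons == 3 * 2 ** n // 2 - 2         # (3/2)(2^n) - 2
```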
A note of caution! The existence of this procedure, which requires $(3/2)(2^n) - 2$ comparisons, does not exclude the possibility that we could achieve the same results via another remarkably clever method that requires fewer comparisons.
An example on counting certain strings of length 10, for the quaternary alphabet $\Sigma = \{0, 1, 2, 3\}$, provides a slight twist to what we've been doing so far.
EXAMPLE 10.31
For the alphabet $\Sigma = \{0, 1, 2, 3\}$, there are $4^{10} = 1{,}048{,}576$ strings of length 10 (in $\Sigma^{10}$, or $\Sigma^*$). Now we want to know how many of these more than 1 million strings contain an even number of 1's.
Instead of being so specific about the length of the strings, we will start by letting $a_n$ count those strings among the $4^n$ strings in $\Sigma^n$ where there are an even number of 1's. To determine how the strings counted by $a_n$, for $n \ge 2$, are related to those counted by $a_{n-1}$, consider the $n$th symbol of one of these strings of length $n$ (where there is an even number of 1's). Two cases arise:
1) The $n$th symbol is 0, 2, or 3: Here the preceding $n - 1$ symbols provide one of the strings counted by $a_{n-1}$. So this case provides $3a_{n-1}$ of the strings counted by $a_n$.
2) The $n$th symbol is 1: In this case, there must be an odd number of 1's among the first $n - 1$ symbols. There are $4^{n-1}$ strings of length $n - 1$ and we want to avoid those that have an even number of 1's; there are $4^{n-1} - a_{n-1}$ such strings. Consequently, this second case gives us $4^{n-1} - a_{n-1}$ of the strings counted by $a_n$.
These two cases are exhaustive and mutually disjoint, so we may write
$$a_n = 3a_{n-1} + (4^{n-1} - a_{n-1}) = 2a_{n-1} + 4^{n-1}, \quad n \ge 2.$$
Here $a_1 = 3$ (for the strings 0, 2, and 3). We find that $a_n^{(h)} = c(2^n)$ and $a_n^{(p)} = A(4^{n-1})$. Upon substituting $a_n^{(p)}$ into the above relation we have $A(4^{n-1}) = 2A(4^{n-2}) + 4^{n-1}$, so $4A = 2A + 4$ and $A = 2$. Hence, $a_n = c(2^n) + 2(4^{n-1})$, $n \ge 2$. From $3 = a_1 = 2c + 2$ it follows that $c = 1/2$, so $a_n = 2^{n-1} + 2(4^{n-1})$, $n \ge 1$.
When $n = 10$, we learn that of the $4^{10} = 1{,}048{,}576$ strings in $\Sigma^{10}$, there are $2^9 + 2(4^9) = 524{,}800$ that contain an even number of 1's.
Before continuing we realize that the answer here for $a_n$ can be checked by using the exponential generating function $f(x) = \sum_{n=0}^{\infty} a_n \frac{x^n}{n!}$ (where $a_0 = 1$). From the techniques developed in Section 9.4 we have
$$f(x) = \left(1 + x + \frac{x^2}{2!} + \cdots\right)\left(1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \cdots\right)\left(1 + x + \frac{x^2}{2!} + \cdots\right)^2$$
$$= e^x \cdot \left(\frac{e^x + e^{-x}}{2}\right) \cdot e^x \cdot e^x = \left(\frac{1}{2}\right)e^{4x} + \left(\frac{1}{2}\right)e^{2x}$$
$$= \left(\frac{1}{2}\right)\sum_{n=0}^{\infty} \frac{(4x)^n}{n!} + \left(\frac{1}{2}\right)\sum_{n=0}^{\infty} \frac{(2x)^n}{n!}.$$
Here $a_n$ = the coefficient of $\frac{x^n}{n!}$ in $f(x)$ = $\left(\frac{1}{2}\right)4^n + \left(\frac{1}{2}\right)2^n = 2^{n-1} + 2(4^{n-1})$, as above.
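For short strings the count in Example 10.31 can be confirmed by listing every string over $\{0, 1, 2, 3\}$ and counting the 1's. A brief sketch follows; the function names are illustrative.

```python
from itertools import product

def count_even_ones(n):
    """Brute force: strings of length n over {0, 1, 2, 3} with an even number of 1's."""
    return sum(1 for s in product("0123", repeat=n) if s.count("1") % 2 == 0)

def closed_form(n):
    """a_n = 2^(n-1) + 2(4^(n-1)) for n >= 1."""
    return 2 ** (n - 1) + 2 * 4 ** (n - 1)

for n in range(1, 8):
    assert count_even_ones(n) == closed_form(n)

print(closed_form(10))  # 524800
```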
EXAMPLE 10.32
In 1904, the Swedish mathematician Helge von Koch (1870-1924) created the intriguing curve now known as the Koch "snowflake" curve. The construction of this curve starts with an equilateral triangle, as shown in part (a) of Fig. 10.12, where the triangle has side 1, perimeter 3, and area $\sqrt{3}/4$. (Recall that an equilateral triangle of side $s$ has perimeter $3s$ and area $s^2\sqrt{3}/4$.) The triangle is then transformed into the Star of David in Fig. 10.12(b) by removing the middle one-third of each side (of the original equilateral triangle) and attaching a new equilateral triangle whose side has length 1/3. So as we go from part (a) to part (b) in the figure, each side of length 1 is transformed into 4 sides of length 1/3, and we get a 12-sided polygon of area $(\sqrt{3}/4) + (3)(\sqrt{3}/4)(1/3)^2 = \sqrt{3}/3$. Continuing the process, we transform the figure of part (b) into that of part (c) by removing the middle one-third of each of the 12 sides in the Star of David and attaching an equilateral triangle of side $1/9$ $(= (1/3)^2)$. Now we have [in Fig. 10.12(c)] a $4^2(3)$-sided polygon whose area is
$$(\sqrt{3}/3) + (4)(3)(\sqrt{3}/4)[(1/3)^2]^2 = 10\sqrt{3}/27.$$
[Figure 10.12: parts (a), (b), and (c)]
For $n \ge 0$, let $a_n$ denote the area of the polygon $P_n$ obtained from the original equilateral triangle after we apply $n$ transformations of the type described above [the first from $P_0$ in Fig. 10.12(a) to $P_1$ in Fig. 10.12(b) and the second from $P_1$ in Fig. 10.12(b) to $P_2$ in Fig. 10.12(c)]. As we go from $P_n$ (with $4^n(3)$ sides) to $P_{n+1}$ (with $4^{n+1}(3)$ sides), we find that
$$a_{n+1} = a_n + (4^n(3))(\sqrt{3}/4)(1/3^{n+1})^2 = a_n + (1/(4\sqrt{3}))(4/9)^n$$
because in transforming $P_n$ into $P_{n+1}$ we remove the middle one-third of each of the $4^n(3)$ sides of $P_n$ and attach an equilateral triangle of side $(1/3^{n+1})$.
The homogeneous part of the solution for this first-order nonhomogeneous recurrence relation is $a_n^{(h)} = A(1)^n = A$. Since $(4/9)^n$ is not a solution of the associated homogeneous relation, the particular solution is given by $a_n^{(p)} = B(4/9)^n$, where $B$ is a constant. Substituting this into the recurrence relation $a_{n+1} = a_n + (1/(4\sqrt{3}))(4/9)^n$, we find that $B = (-9/5)(1/(4\sqrt{3}))$. Consequently,
$$a_n = A + (-9/5)(1/(4\sqrt{3}))(4/9)^n = A - (1/(5\sqrt{3}))(4/9)^{n-1}, \quad n \ge 0.$$
Since $\sqrt{3}/4 = a_0 = A - (1/(5\sqrt{3}))(4/9)^{-1}$, it follows that $A = 6/(5\sqrt{3})$ and
$$a_n = (6/(5\sqrt{3})) - (1/(5\sqrt{3}))(4/9)^{n-1} = (1/(5\sqrt{3}))[6 - (4/9)^{n-1}], \quad n \ge 0.$$
[As n grows larger, we find that $(4/9)^{n-1}$ tends to 0 and $a_n$ approaches $6/(5\sqrt{3})$. We can also
obtain this value by continuing the calculations we had before we introduced our recurrence
relation, thus noting that this limiting area is also given by
  $(\sqrt{3}/4) + (\sqrt{3}/4)(3)(1/3)^2 + (\sqrt{3}/4)(4)(3)(1/3^2)^2 + (\sqrt{3}/4)(4^2)(3)(1/3^3)^2 + \cdots$
  $= (\sqrt{3}/4) + (\sqrt{3}/4)(3)\sum_{n=0}^{\infty} 4^n (1/3^{n+1})^2 = (\sqrt{3}/4) + (1/(4\sqrt{3}))\sum_{n=0}^{\infty} (4/9)^n$
  $= (\sqrt{3}/4) + (1/(4\sqrt{3}))[1/(1 - (4/9))] = (\sqrt{3}/4) + (1/(4\sqrt{3}))(9/5) = 6/(5\sqrt{3})$
by using the result for the sum of a geometric series from part (b) of Example 9.5.]
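The closed form for the area can be checked numerically. The following is a minimal sketch in Python (the function names `koch_areas` and `koch_closed_form` are illustrative, not from the text): it iterates the recurrence $a_{n+1} = a_n + (1/(4\sqrt{3}))(4/9)^n$ from $a_0 = \sqrt{3}/4$ and compares each term with the closed form and with the limiting value $6/(5\sqrt{3})$.

```python
from math import sqrt, isclose


def koch_areas(steps):
    """Iterate a_{n+1} = a_n + (1/(4*sqrt(3))) * (4/9)**n, starting at a_0 = sqrt(3)/4."""
    areas = [sqrt(3) / 4]
    for n in range(steps):
        areas.append(areas[-1] + (1 / (4 * sqrt(3))) * (4 / 9) ** n)
    return areas


def koch_closed_form(n):
    """Closed form a_n = (1/(5*sqrt(3))) * (6 - (4/9)**(n - 1))."""
    return (6 - (4 / 9) ** (n - 1)) / (5 * sqrt(3))


areas = koch_areas(30)
limit = 6 / (5 * sqrt(3))  # limiting area as n grows

for n, area in enumerate(areas[:5]):
    print(n, area, koch_closed_form(n))
```

Floating-point arithmetic is used here, so the comparison is approximate; exact agreement would need symbolic square roots.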
EXAMPLE 10.33
For $n \ge 1$, let $X_n = \{1, 2, 3, \ldots, n\}$; $\mathcal{P}(X_n)$ denotes the power set of $X_n$. We want to
determine $a_n$, the number of edges in the Hasse diagram for the partial order $(\mathcal{P}(X_n), \subseteq)$.
Here $a_1 = 1$ and $a_2 = 4$, and from Fig. 10.13 it follows that
  $a_3 = 2a_2 + 2^2.$
Figure 10.13
This is because the Hasse diagram for $(\mathcal{P}(X_3), \subseteq)$ contains the $a_2$ edges in the Hasse
diagram for $(\mathcal{P}(X_2), \subseteq)$ as well as the $a_2$ edges in the Hasse diagram for the partial order
$(\{\{3\}, \{1, 3\}, \{2, 3\}, \{1, 2, 3\}\}, \subseteq)$. [Note the identical structure shared by the partial
orders $(\mathcal{P}(\{1, 2\}), \subseteq)$ and $(\{\{3\}, \{1, 3\}, \{2, 3\}, \{1, 2, 3\}\}, \subseteq)$.] In addition, there are $2^2$ other
(dashed) edges, one for each subset of {1, 2}. Now for $n \ge 1$, consider the Hasse diagrams
for the partial orders $(\mathcal{P}(X_n), \subseteq)$ and $(\{T \cup \{n+1\} \mid T \in \mathcal{P}(X_n)\}, \subseteq)$. For each $S \in \mathcal{P}(X_n)$,
draw an edge from S in $(\mathcal{P}(X_n), \subseteq)$ to $S \cup \{n+1\}$ in $(\{T \cup \{n+1\} \mid T \in \mathcal{P}(X_n)\}, \subseteq)$. The
result is the Hasse diagram for $(\mathcal{P}(X_{n+1}), \subseteq)$. From the construction we see that
  $a_{n+1} = 2a_n + 2^n, \quad n \ge 1, \quad a_1 = 1.$
The solution to this recurrence relation, with the given condition $a_1 = 1$, is $a_n = n2^{n-1}$,
$n \ge 1$.
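As a sanity check on this count, here is a small Python sketch (the helper name `hasse_edge_count` is an assumption made for illustration). It represents each subset of $\{1, \ldots, n\}$ as a bitmask and counts the covering pairs $(S, S \cup \{x\})$ with $x \notin S$, which are exactly the edges of the Hasse diagram for $(\mathcal{P}(X_n), \subseteq)$.

```python
def hasse_edge_count(n):
    """Count pairs (S, S ∪ {x}) with x not in S, over all subsets S of {1, ..., n}."""
    edges = 0
    for mask in range(1 << n):          # each mask encodes one subset S
        for bit in range(n):
            if not mask & (1 << bit):   # element (bit + 1) is not in S
                edges += 1              # so S is covered by S ∪ {bit + 1}
    return edges


counts = [hasse_edge_count(n) for n in range(1, 9)]
print(counts)
```

The brute-force counts can then be compared with both the recurrence $a_{n+1} = 2a_n + 2^n$ and the closed form $n2^{n-1}$.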
Each of our next two examples deals with a second-order relation.
EXAMPLE 10.34
Solve the recurrence relation
  $a_{n+2} - 4a_{n+1} + 3a_n = -200, \quad n \ge 0, \quad a_0 = 3000, \quad a_1 = 3300.$
Here $a_n^{(h)} = c_1(3^n) + c_2(1^n) = c_1(3^n) + c_2$. Since $f(n) = -200 = -200(1^n)$ is a solution
of the associated homogeneous relation, here $a_n^{(p)} = An$ for some constant A. This leads
us to
  $A(n + 2) - 4A(n + 1) + 3An = -200$, so $-2A = -200$, $A = 100$.
Hence $a_n = c_1(3^n) + c_2 + 100n$. With $a_0 = 3000$ and $a_1 = 3300$, we have $a_n =
100(3^n) + 2900 + 100n$, $n \ge 0$.
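A quick way to confirm a solution like this one is to iterate the recurrence directly and compare with the closed form. The sketch below does so in Python; the names `iterate_example` and `closed_form_example` are hypothetical helpers introduced only for this check.

```python
def iterate_example(n_max):
    """Generate a_0..a_{n_max} from a_{n+2} = 4a_{n+1} - 3a_n - 200, a_0 = 3000, a_1 = 3300."""
    a = [3000, 3300]
    for n in range(n_max - 1):
        a.append(4 * a[n + 1] - 3 * a[n] - 200)
    return a


def closed_form_example(n):
    """Proposed solution a_n = 100 * 3^n + 2900 + 100n."""
    return 100 * 3 ** n + 2900 + 100 * n


terms = iterate_example(15)
print(terms[:5])
```

Because the arithmetic is exact integer arithmetic, agreement on the first several terms (together with the two initial conditions) confirms the constants $c_1 = 100$, $c_2 = 2900$, and $A = 100$.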
Before proceeding any further, a point needs to be made about the role of technology in
solving recurrence relations. When a computer algebra system is available, we are spared
much of the drudgery of computation. Consequently, all our effort can be directed to analyz-
ing the situation at hand and setting up the recurrence relation with its initial condition(s).
Once this is done our job is just about finished. A line or two of code will often do the trick!
For example, the Maple code in Fig. 10.14 shows how one can readily solve the recurrence
relations of Examples 10.33 and 10.34.
> rsolve({a(n+1)=2*a(n)+2^n,a(1)=1},a(n));
        [unsimplified expression in 2^n and n]
> simplify(%);
        2^(n-1) n
> rsolve({a(n+2)-4*a(n+1)+3*a(n)=-200,a(0)=3000,a(1)=3300},a(n));
        100 3^n + 2900 + 100 n
Figure 10.14
EXAMPLE 10.35
In part (a) of Fig. 10.15 we have an iterative algorithm (written as a pseudocode procedure)
for computing the nth Fibonacci number, for $n \ge 0$. Here the input is a nonnegative integer
n and the output is the Fibonacci number $F_n$. The variables i, fib, last, next_to_last, and
temp are integer variables. In this algorithm we calculate $F_n$ (in this case for $n \ge 0$) by first
assigning or computing all of the previous values $F_0, F_1, F_2, \ldots, F_{n-1}$. Here the number
of additions needed to determine $F_n$ is 0 for n = 0, 1 and n − 1 (within the for loop) for
$n \ge 2$.
Part (b) of Fig. 10.15 provides a pseudocode procedure to implement a recursive algorithm
for calculating $F_n$ for $n \in \mathbf{N}$. Here the variable fib is likewise an integer variable. For
this procedure we wish to determine $a_n$, the number of additions performed in computing
$F_n$, $n \ge 0$. We find that $a_0 = 0$, $a_1 = 0$, and from the shaded line in the procedure, namely,
  fib := FibNum2(n − 1) + FibNum2(n − 2)    (*)
we obtain the nonhomogeneous recurrence relation
  $a_n = a_{n-1} + a_{n-2} + 1, \quad n \ge 2,$
where the summand of 1 is due to the addition in Eq. (*).
procedure FibNum1(n: nonnegative integer)
begin
  if n = 0 then
    fib := 0
  else if n = 1 then
    fib := 1
  else
    begin
      last := 1
      next_to_last := 0
      for i := 2 to n do
        begin
          temp := last
          last := last + next_to_last
          next_to_last := temp
        end
      fib := last
    end
end                                              (a)

procedure FibNum2(n: nonnegative integer)
begin
  if n = 0 then
    fib := 0
  else if n = 1 then
    fib := 1
  else
    fib := FibNum2(n − 1) + FibNum2(n − 2)
end                                              (b)
Figure 10.15
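Here we find that $a_n^{(h)} = c_1\left(\frac{1+\sqrt{5}}{2}\right)^n + c_2\left(\frac{1-\sqrt{5}}{2}\right)^n$ and that $a_n^{(p)} = A$, a constant. Upon
substituting $a_n^{(p)}$ into the nonhomogeneous recurrence relation we find that
  $A = A + A + 1,$
so $A = -1$ and $a_n = c_1\left(\frac{1+\sqrt{5}}{2}\right)^n + c_2\left(\frac{1-\sqrt{5}}{2}\right)^n - 1$.
Since $a_0 = 0$ and $a_1 = 0$ it follows that
  $c_1 + c_2 = 1$  and  $c_1\left(\frac{1+\sqrt{5}}{2}\right) + c_2\left(\frac{1-\sqrt{5}}{2}\right) = 1.$
From these equations we learn that $c_1 = (1 + \sqrt{5})/(2\sqrt{5})$, $c_2 = (\sqrt{5} - 1)/(2\sqrt{5})$. Therefore,
  $a_n = \left(\frac{1+\sqrt{5}}{2\sqrt{5}}\right)\left(\frac{1+\sqrt{5}}{2}\right)^n - \left(\frac{1-\sqrt{5}}{2\sqrt{5}}\right)\left(\frac{1-\sqrt{5}}{2}\right)^n - 1$
  $\phantom{a_n} = \frac{1}{\sqrt{5}}\left(\frac{1+\sqrt{5}}{2}\right)^{n+1} - \frac{1}{\sqrt{5}}\left(\frac{1-\sqrt{5}}{2}\right)^{n+1} - 1.$
As n gets larger $[(1 - \sqrt{5})/2]^{n+1}$ approaches 0 since $|(1 - \sqrt{5})/2| < 1$, and $a_n \approx
(1/\sqrt{5})[(1 + \sqrt{5})/2]^{n+1} = ((1 + \sqrt{5})/(2\sqrt{5}))((1 + \sqrt{5})/2)^n$.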
Consequently, we can see that, as the value of n increases, the first procedure requires
far less computation than the second one does.
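The difference between the two procedures can be observed by counting additions directly. The following Python sketch mirrors the structure of the two procedures in Fig. 10.15; the function names and the returned `(value, additions)` pairs are choices made here for illustration. Since the closed form above equals $F_{n+1} - 1$, the recursive count should match that quantity.

```python
def fib_iterative(n):
    """Return (F_n, additions used), following the loop structure of FibNum1."""
    if n == 0:
        return 0, 0
    if n == 1:
        return 1, 0
    last, next_to_last, additions = 1, 0, 0
    for _ in range(2, n + 1):
        last, next_to_last = last + next_to_last, last
        additions += 1
    return last, additions


def fib_recursive(n):
    """Return (F_n, additions used), following the recursive calls of FibNum2."""
    if n == 0:
        return 0, 0
    if n == 1:
        return 1, 0
    value1, count1 = fib_recursive(n - 1)
    value2, count2 = fib_recursive(n - 2)
    return value1 + value2, count1 + count2 + 1  # the +1 is the addition in Eq. (*)


for n in (5, 10, 20):
    print(n, fib_iterative(n), fib_recursive(n))
```

For n = 20, for example, the iterative version performs 19 additions while the recursive version performs 10,945.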
We now summarize and extend the solution techniques already discussed in Examples
10.26 through 10.35.
Given a linear nonhomogeneous recurrence relation (with constant coefficients) of the
form $C_0a_n + C_1a_{n-1} + C_2a_{n-2} + \cdots + C_ka_{n-k} = f(n)$, where $C_0 \ne 0$ and $C_k \ne 0$, let $a_n^{(h)}$
denote the homogeneous part of the solution $a_n$.
1) If f(n) is a constant multiple of one of the forms in the first column of Table 10.2
and is not a solution of the associated homogeneous relation, then $a_n^{(p)}$ has the form
shown in the second column of Table 10.2. (Here $A, B, A_0, A_1, A_2, \ldots, A_{t-1}, A_t$
are constants determined by substituting $a_n^{(p)}$ into the given relation; t, r, and $\theta$ are
also constants.)
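Table 10.2
f(n)                      $a_n^{(p)}$
c, a constant             A, a constant
$n$                       $A_1n + A_0$
$n^2$                     $A_2n^2 + A_1n + A_0$
$n^t$, $t \in \mathbf{Z}^+$          $A_tn^t + A_{t-1}n^{t-1} + \cdots + A_1n + A_0$
$r^n$, $r \in \mathbf{R}$            $Ar^n$
$\sin \theta n$           $A \sin \theta n + B \cos \theta n$
$\cos \theta n$           $A \sin \theta n + B \cos \theta n$
$n^tr^n$                  $r^n(A_tn^t + A_{t-1}n^{t-1} + \cdots + A_1n + A_0)$
$r^n \sin \theta n$       $Ar^n \sin \theta n + Br^n \cos \theta n$
$r^n \cos \theta n$       $Ar^n \sin \theta n + Br^n \cos \theta n$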
2) When f(n) comprises a sum of constant multiples of terms such as those in the
first column of the table for item (1), and none of these terms is a solution of the
associated homogeneous relation, then $a_n^{(p)}$ is made up of the sum of the corresponding
terms in the column headed by $a_n^{(p)}$. For example, if $f(n) = n^2 + 3 \sin 2n$ and no
summand of f(n) is a solution of the associated homogeneous relation, then $a_n^{(p)} =
(A_2n^2 + A_1n + A_0) + (A \sin 2n + B \cos 2n)$.
3) Things get trickier if a summand $f_1(n)$ of f(n) is a solution of the associated
homogeneous relation. This happens, for example, when f(n) contains summands such as
$cr^n$ or $(c_1 + c_2n)r^n$ and r is a characteristic root. If $f_1(n)$ causes this problem, we
multiply the trial solution $(a_n^{(p)})_1$ corresponding to $f_1(n)$ by the smallest power of n,
say $n^s$, for which no summand of $n^sf_1(n)$ is a solution of the associated homogeneous
relation. Then $n^s(a_n^{(p)})_1$ is the corresponding part of $a_n^{(p)}$.
In order to check some of our preceding remarks on particular solutions for nonhomo-
geneous recurrence relations, the next application provides us with a situation that can be
solved in more than one way.
EXAMPLE 10.36
For $n \ge 2$, suppose that there are n people at a party and that each of these people shakes
hands (exactly one time) with all of the other people there (and no one shakes hands with
himself or herself). If $a_n$ counts the total number of handshakes, then
  $a_{n+1} = a_n + n, \quad n \ge 2, \quad a_2 = 1, \quad (3)$
because when the (n + 1)st person arrives, he or she will shake hands with the n other
people who have already arrived.
According to the results in Table 10.2, we might think that the trial (particular) solution
for Eq. (3) is $A_1n + A_0$, for constants $A_0$ and $A_1$. But here the associated homogeneous
relation is $a_{n+1} = a_n$, or $a_{n+1} - a_n = 0$, for which $a_n^{(h)} = c(1^n) = c$, where c denotes an
arbitrary constant. Therefore, the summand $A_0$ (in $A_1n + A_0$) is a solution of the associated
homogeneous relation. Consequently, the third remark (given with Table 10.2) tells us that
we must multiply $A_1n + A_0$ by the smallest power of n for which we no longer have any
constant summand. This is accomplished by multiplying $A_1n + A_0$ by $n^1$, and so we find
here that
  $a_n^{(p)} = A_1n^2 + A_0n.$
When we substitute this result into Eq. (3) we have
  $A_1(n + 1)^2 + A_0(n + 1) = A_1n^2 + A_0n + n,$
or $A_1n^2 + (2A_1 + A_0)n + (A_1 + A_0) = A_1n^2 + (A_0 + 1)n.$
By comparing the coefficients on like powers of n we find that
  $(n^2)$: $A_1 = A_1$;
  $(n)$: $2A_1 + A_0 = A_0 + 1$; and
  $(n^0)$: $A_1 + A_0 = 0$.
Consequently, $A_1 = 1/2$ and $A_0 = -1/2$, so $a_n^{(p)} = (1/2)n^2 + (-1/2)n$ and $a_n = a_n^{(h)} +
a_n^{(p)} = c + (1/2)(n)(n - 1)$. Since $a_2 = 1$, it follows from $1 = a_2 = c + (1/2)(2)(1)$ that
c = 0, and $a_n = (1/2)(n)(n - 1)$, for $n \ge 2$.
We can also obtain this result by considering the n people in the room and realizing that
each possible handshake corresponds with a selection of size 2 from this set of size n, and
there are $\binom{n}{2} = (n!)/(2!(n - 2)!) = (1/2)(n)(n - 1)$ such selections. [Or we can consider
the n people as vertices of an undirected graph (with no loops) where an edge corresponds
with a handshake. Our answer is then the number of edges in the complete graph $K_n$, and
there are $\binom{n}{2} = (1/2)(n)(n - 1)$ such edges.]
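Both routes to the handshake count can be compared mechanically. This is a small illustrative Python sketch (the function name `handshakes_by_recurrence` is an assumption): it builds $a_n$ from $a_{n+1} = a_n + n$ with $a_2 = 1$ and checks the values against $(1/2)(n)(n-1)$ and against the binomial coefficient computed by `math.comb`.

```python
from math import comb


def handshakes_by_recurrence(n_max):
    """Return {n: a_n} for 2 <= n <= n_max using a_{n+1} = a_n + n, a_2 = 1."""
    a = {2: 1}
    for n in range(2, n_max):
        a[n + 1] = a[n] + n  # the (n+1)st arrival shakes n hands
    return a


handshakes = handshakes_by_recurrence(30)
print(handshakes[10], 10 * 9 // 2, comb(10, 2))
```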
Our last example further demonstrates how we may use the results in Table 10.2.
EXAMPLE 10.37
a) Consider the nonhomogeneous recurrence relation
  $a_{n+2} - 10a_{n+1} + 21a_n = f(n), \quad n \ge 0.$
Here the homogeneous part of the solution is
  $a_n^{(h)} = c_1(3^n) + c_2(7^n),$
for arbitrary constants $c_1$, $c_2$.
In Table 10.3 we list the form for the particular solution for certain choices of f(n).
Here the values of the 11 constants $A_i$, for $0 \le i \le 10$, are determined by substituting
$a_n^{(p)}$ into the given nonhomogeneous recurrence relation.
Table 10.3
f(n)                       $a_n^{(p)}$
5                          $A_0$
$3n^2 - 2$                 $A_3n^2 + A_2n + A_1$
$7(11^n)$                  $A_4(11^n)$
$31(r^n)$, $r \ne 3, 7$    $A_5(r^n)$
$6(3^n)$                   $A_6n3^n$
$2(3^n) - 8(9^n)$          $A_7n3^n + A_8(9^n)$
$4(3^n) + 3(7^n)$          $A_9n3^n + A_{10}n7^n$
b) The homogeneous component of the solution for
  $a_n + 4a_{n-1} + 4a_{n-2} = f(n), \quad n \ge 2$
is
  $a_n^{(h)} = c_1(-2)^n + c_2n(-2)^n,$
where $c_1$, $c_2$ denote arbitrary constants. Consequently,
1) if $f(n) = 5(-2)^n$, then $a_n^{(p)} = An^2(-2)^n$;
2) if $f(n) = 7n(-2)^n$, then $a_n^{(p)} = n^2(-2)^n(A_1n + A_0)$; and
3) if $f(n) = -11n^2(-2)^n$, then $a_n^{(p)} = n^2(-2)^n(B_2n^2 + B_1n + B_0)$.
(Here, the constants $A, A_0, A_1, B_0, B_1$, and $B_2$ are determined by substituting $a_n^{(p)}$
into the given nonhomogeneous recurrence relation.)
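To see why the factor $n^2$ is needed in case (1) of part (b), one can substitute candidate trial solutions into the left-hand side $a_n + 4a_{n-1} + 4a_{n-2}$ and watch what survives. The Python sketch below is illustrative only (the helpers `lhs` and `trial` are assumed names); it uses exact rational arithmetic and reports the left-hand side divided by $(-2)^n$ for the trials $(-2)^n$, $n(-2)^n$, and $n^2(-2)^n$.

```python
from fractions import Fraction


def lhs(seq, n):
    """Left side a_n + 4a_{n-1} + 4a_{n-2} for a sequence supplied as a function."""
    return seq(n) + 4 * seq(n - 1) + 4 * seq(n - 2)


def trial(power):
    """Trial sequence n**power * (-2)**n with leading coefficient 1."""
    return lambda n: Fraction(n) ** power * Fraction(-2) ** n


# For each power of n, the ratio lhs / (-2)^n at n = 2, ..., 9.
ratios = {
    p: [lhs(trial(p), n) / Fraction(-2) ** n for n in range(2, 10)]
    for p in (0, 1, 2)
}

# Only the n^2 trial gives a nonzero multiple of (-2)^n, so solve A * ratio = 5.
A = Fraction(5) / ratios[2][0]
print(ratios[0][:3], ratios[1][:3], ratios[2][:3], A)
```

The first two trials are annihilated because they are solutions of the homogeneous relation; the third leaves $2(-2)^n$, which gives $A = 5/2$ for $f(n) = 5(-2)^n$.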
EXERCISES 10.3
1. Solve each of the following recurrence relations.
  a) $a_{n+1} - a_n = 2n + 3$, $n \ge 0$, $a_0 = 1$
  b) $a_{n+1} - a_n = 3n^2 - n$, $n \ge 0$, $a_0 = 3$
  c) $a_{n+1} - 2a_n = 5$, $n \ge 0$, $a_0 = 1$
  d) $a_{n+1} - 2a_n = 2^n$, $n \ge 0$, $a_0 = 1$
2. Use a recurrence relation to derive the formula for $\sum_{i=0}^{n} i^2$.
3. a) Let n lines be drawn in the plane such that each line intersects every other line but
no three lines are ever coincident. For $n \ge 0$, let $a_n$ count the number of regions into
which the plane is separated by the n lines. Find and solve a recurrence relation for $a_n$.
  b) For the situation in part (a), let $b_n$ count the number of infinite regions that result.
Find and solve a recurrence relation for $b_n$.
4. On the first day of a new year, Joseph deposits $1000 in an account that pays 6% interest
compounded monthly. At the beginning of each month he adds $200 to his account. If he
continues to do this for the next four years (so that he makes 47 additional deposits of
$200), how much will his account be worth exactly four years after he opened it?
5. Solve the following recurrence relations.
  a) $a_{n+2} + 3a_{n+1} + 2a_n = 3^n$, $n \ge 0$, $a_0 = 0$, $a_1 = 1$
  b) $a_{n+2} + 4a_{n+1} + 4a_n = 7$, $n \ge 0$, $a_0 = 1$, $a_1 = 2$
6. Solve the recurrence relation $a_{n+2} - 6a_{n+1} + 9a_n = 3(2^n) + 7(3^n)$, where $n \ge 0$ and
$a_0 = 1$, $a_1 = 4$.
7. Find the general solution for the recurrence relation
$a_{n+3} - 3a_{n+2} + 3a_{n+1} - a_n = 3 + 5n$, $n \ge 0$.
8. Determine the number of n-digit quaternary (0, 1, 2, 3) sequences in which there is never
a 3 anywhere to the right of a 0.
9. Meredith borrows $2500, at 12% compounded monthly, to buy a computer. If the loan is to
be paid back over two years, what is his monthly payment?
10. The general solution of the recurrence relation $a_{n+2} + b_1a_{n+1} + b_2a_n = b_3n + b_4$, $n \ge 0$,
with $b_i$ constant for $1 \le i \le 4$, is $c_12^n + c_23^n + n - 7$. Find $b_i$ for each $1 \le i \le 4$.
11. Solve the following recurrence relations.
  a) $a_{n+2}^2 - 5a_{n+1}^2 + 6a_n^2 = 7n$, $n \ge 0$, $a_0 = a_1 = 1$
  b) $a_n^2 - 2a_{n-1} = 0$, $n \ge 1$, $a_0 = 2$ (Let $b_n = \log_2 a_n$, $n \ge 0$.)
12. Let $\Sigma = \{0, 1, 2, 3\}$. For $n \ge 1$, let $a_n$ count the number of strings in $\Sigma^n$ containing
an odd number of 1's. Find and solve a recurrence relation for $a_n$.
13. a) For the binary string 001110, there are three runs: 00, 111, and 0. Meanwhile, the
string 000111 has only two runs: 000 and 111; while the string 010101 determines the six
runs: 0, 1, 0, 1, 0, 1. For n = 1, we consider two binary strings, namely, 0 and 1; these
two strings (of length 1) determine a total of two runs. There are four binary strings of
length n = 2 and these strings determine 1 (for 00) + 2 (for 01) + 2 (for 10) + 1 (for 11)
= 6 runs. Find and solve a recurrence relation for $t_n$, the total number of runs determined
by the $2^n$ binary strings of length n, where $n \ge 1$.
  b) Answer the question posed in part (a) for quaternary strings of length n. (Here the
alphabet comprises 0, 1, 2, 3.)
  c) Generalize the results of parts (a) and (b).
14. a) For $n \ge 1$, the nth triangular number $t_n$ is defined by $t_n = 1 + 2 + \cdots + n =
n(n + 1)/2$. Find and solve a recurrence relation for $s_n$, $n \ge 1$, where $s_n = t_1 + t_2 + \cdots +
t_n$, the sum of the first n triangular numbers. [The reader may wish to compare the result
obtained here with the formula given in Example 4.5 or with the result requested in
part (b) of Exercise 8 of Section 9.5.]
  b) In an organic laboratory, Kelsey synthesizes a crystalline structure that is made up of
10,000,000 triangular layers of atoms. The first layer of the structure has one atom, the
second layer has three atoms, and, in general, the nth layer has $1 + 2 + \cdots + n = t_n$ atoms.
(Consider each layer, other than the last, as if it were placed upon the spaces that result
among the neighboring atoms of the succeeding layer. See Fig. 10.16.) (i) How many atoms
are there in one of these crystalline structures? (ii) How many atoms are packed (strictly)
between the 10,000th and 100,000th layer?
Figure 10.16 [layers for n = 1 and n = 2]
15. Write a computer program (or develop an algorithm) to solve the problem of the Towers
of Hanoi. For $n \in \mathbf{Z}^+$, the program should provide the necessary steps for transferring the n
disks from peg 1 to peg 3 under the restrictions specified in Example 10.28.
10.4 The Method of Generating Functions
With all the different cases we had to consider for the nonhomogeneous linear recurrence
relation, we now get some assistance from the generating function. This technique will find
both the homogeneous and the particular solutions for a,, and it will incorporate the given
initial conditions as well. Furthermore, we’ll be able to do even more with this method.
We demonstrate the method in the following examples.
EXAMPLE 10.38
Solve the relation $a_n - 3a_{n-1} = n$, $n \ge 1$, $a_0 = 1$.
This relation represents an infinite set of equations:
  (n = 1)  $a_1 - 3a_0 = 1$
  (n = 2)  $a_2 - 3a_1 = 2$
  ...
Multiplying the first of these equations by x, the second by $x^2$, and so on, we obtain
  (n = 1)  $a_1x^1 - 3a_0x^1 = 1x^1$
  (n = 2)  $a_2x^2 - 3a_1x^2 = 2x^2$
  ...
Adding this second set of equations, we find that
  $\sum_{n=1}^{\infty} a_nx^n - 3\sum_{n=1}^{\infty} a_{n-1}x^n = \sum_{n=1}^{\infty} nx^n. \quad (1)$
We want to solve for $a_n$ in terms of n. To accomplish this, let $f(x) = \sum_{n=0}^{\infty} a_nx^n$ be
the (ordinary) generating function for the sequence $a_0, a_1, a_2, \ldots$. Then Eq. (1) can be
rewritten as
  $(f(x) - a_0) - 3x\sum_{n=1}^{\infty} a_{n-1}x^{n-1} = \sum_{n=1}^{\infty} nx^n \left(= \sum_{n=0}^{\infty} nx^n\right). \quad (2)$
Since $\sum_{n=1}^{\infty} a_{n-1}x^{n-1} = \sum_{n=0}^{\infty} a_nx^n = f(x)$ and $a_0 = 1$, the left-hand side of Eq. (2)
becomes $(f(x) - 1) - 3xf(x)$.
Before we can proceed, we need the generating function for the sequence 0, 1, 2, 3, ....
Recall from part (c) of Example 9.5 that
  $\frac{x}{(1-x)^2} = x + 2x^2 + 3x^3 + \cdots$, so
  $(f(x) - 1) - 3xf(x) = \frac{x}{(1-x)^2}$  and  $f(x) = \frac{1}{(1-3x)} + \frac{x}{(1-x)^2(1-3x)}.$
Using a partial fraction decomposition, we find that
  $\frac{x}{(1-x)^2(1-3x)} = \frac{A}{(1-x)} + \frac{B}{(1-x)^2} + \frac{C}{(1-3x)},$
or
  $x = A(1 - x)(1 - 3x) + B(1 - 3x) + C(1 - x)^2.$
From the following assignments for x, we get
  (x = 1):    $1 = B(-2)$,  $B = -\frac{1}{2}$.
  (x = 1/3):  $\frac{1}{3} = C\left(\frac{2}{3}\right)^2$,  $C = \frac{3}{4}$.
  (x = 0):    $0 = A + B + C$,  $A = -(B + C) = -\frac{1}{4}$.
Therefore,
  $f(x) = \frac{1}{(1-3x)} + \frac{(-1/4)}{(1-x)} + \frac{(-1/2)}{(1-x)^2} + \frac{(3/4)}{(1-3x)}$
  $\phantom{f(x)} = \frac{(7/4)}{(1-3x)} + \frac{(-1/4)}{(1-x)} + \frac{(-1/2)}{(1-x)^2}.$
We find $a_n$ by determining the coefficient of $x^n$ in each of the three summands.
a) $(7/4)/(1 - 3x) = (7/4)[1/(1 - 3x)]
   = (7/4)[1 + (3x) + (3x)^2 + (3x)^3 + \cdots]$, and the coefficient of
   $x^n$ is $(7/4)3^n$.
b) $(-1/4)/(1 - x) = (-1/4)[1 + x + x^2 + \cdots]$, and the coefficient of $x^n$ here is
   $(-1/4)$.
c) $(-1/2)/(1 - x)^2 = (-1/2)(1 - x)^{-2}
   = (-1/2)\left[\binom{-2}{0} + \binom{-2}{1}(-x) + \binom{-2}{2}(-x)^2 + \binom{-2}{3}(-x)^3 + \cdots\right]$
   and the coefficient of $x^n$ is given by $(-1/2)\binom{-2}{n}(-1)^n = (-1/2)(-1)^n\binom{2+n-1}{n}(-1)^n
   = (-1/2)(n + 1)$.
Therefore $a_n = (7/4)3^n - (1/2)n - (3/4)$, $n \ge 0$. (Note that there is no special concern
here with $a_n^{(p)}$. Also, the same answer is obtained by using the techniques of Section 10.3.)
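Two steps of this example lend themselves to a mechanical check: the partial fraction constants and the final formula for $a_n$. The Python sketch below is a hedged illustration (the names `identity_holds`, `closed_form`, and `by_recurrence` are assumptions). Because $x = A(1-x)(1-3x) + B(1-3x) + C(1-x)^2$ is an identity between polynomials of degree at most 2, agreement at three or more points is enough to confirm it.

```python
from fractions import Fraction

# Partial fraction constants found in the example.
A, B, C = Fraction(-1, 4), Fraction(-1, 2), Fraction(3, 4)


def identity_holds(x):
    """Check x = A(1 - x)(1 - 3x) + B(1 - 3x) + C(1 - x)^2 at a rational point x."""
    return x == A * (1 - x) * (1 - 3 * x) + B * (1 - 3 * x) + C * (1 - x) ** 2


def closed_form(n):
    """Proposed solution a_n = (7/4)3^n - (1/2)n - 3/4."""
    return Fraction(7, 4) * 3 ** n - Fraction(1, 2) * n - Fraction(3, 4)


def by_recurrence(n_max):
    """Generate a_0..a_{n_max} from a_n = 3a_{n-1} + n with a_0 = 1."""
    a = [1]
    for n in range(1, n_max + 1):
        a.append(3 * a[n - 1] + n)
    return a


print(by_recurrence(5), [closed_form(n) for n in range(6)])
```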
In our next example we extend what we learned in Example 10.38 to a second-order
relation. This time we present the solution within a list of instructions one can follow in
order to apply the generating-function method.
EXAMPLE 10.39
Consider the recurrence relation
  $a_{n+2} - 5a_{n+1} + 6a_n = 2, \quad n \ge 0, \quad a_0 = 3, \quad a_1 = 7.$
1) We first multiply this given relation by $x^{n+2}$ because n + 2 is the largest subscript
that appears. This gives us
  $a_{n+2}x^{n+2} - 5a_{n+1}x^{n+2} + 6a_nx^{n+2} = 2x^{n+2}.$
2) Then we sum all of the equations represented by the result in step (1) and obtain
  $\sum_{n=0}^{\infty} a_{n+2}x^{n+2} - 5\sum_{n=0}^{\infty} a_{n+1}x^{n+2} + 6\sum_{n=0}^{\infty} a_nx^{n+2} = 2\sum_{n=0}^{\infty} x^{n+2}.$
3) In order to have each of the subscripts on a match the corresponding exponent on x,
we rewrite the equation in step (2) as
  $\sum_{n=0}^{\infty} a_{n+2}x^{n+2} - 5x\sum_{n=0}^{\infty} a_{n+1}x^{n+1} + 6x^2\sum_{n=0}^{\infty} a_nx^n = 2x^2\sum_{n=0}^{\infty} x^n.$
Here we also rewrite the power series on the right-hand side of the equation in a form
that will permit us to use what we learned in Section 2 of Chapter 9.
4) Let $f(x) = \sum_{n=0}^{\infty} a_nx^n$ be the generating function for the solution. The equation in
step (3) now takes the form
  $(f(x) - a_0 - a_1x) - 5x(f(x) - a_0) + 6x^2f(x) = \frac{2x^2}{1-x},$
or
  $(f(x) - 3 - 7x) - 5x(f(x) - 3) + 6x^2f(x) = \frac{2x^2}{1-x}.$
5) Solving for f(x) we have
  $(1 - 5x + 6x^2)f(x) = 3 - 8x + \frac{2x^2}{1-x} = \frac{3 - 11x + 10x^2}{1-x},$
from which it follows that
  $f(x) = \frac{3 - 11x + 10x^2}{(1 - 5x + 6x^2)(1 - x)} = \frac{(3 - 5x)(1 - 2x)}{(1 - 3x)(1 - 2x)(1 - x)} = \frac{3 - 5x}{(1 - 3x)(1 - x)}.$
A partial-fraction decomposition (by hand, or via a computer algebra system)
gives us
  $f(x) = \frac{2}{1 - 3x} + \frac{1}{1 - x} = 2\sum_{n=0}^{\infty} (3x)^n + \sum_{n=0}^{\infty} x^n.$
Consequently, $a_n = 2(3^n) + 1$, $n \ge 0$.
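The generating function found in step (5) can also be expanded numerically, without partial fractions, by dividing the numerator polynomial by the denominator polynomial term by term. The following Python sketch is one way to do this; `series_coefficients` is a hypothetical helper, and it assumes the denominator has constant term 1 so that integer arithmetic stays exact.

```python
def series_coefficients(numerator, denominator, terms):
    """Power-series coefficients of numerator/denominator.

    Both polynomials are lists of coefficients, lowest degree first.
    Assumes denominator[0] == 1.
    """
    coeffs = []
    for n in range(terms):
        value = numerator[n] if n < len(numerator) else 0
        # Subtract the contribution of earlier coefficients (long division).
        for k in range(1, min(n, len(denominator) - 1) + 1):
            value -= denominator[k] * coeffs[n - k]
        coeffs.append(value)
    return coeffs


# f(x) = (3 - 5x) / ((1 - 3x)(1 - x)) = (3 - 5x) / (1 - 4x + 3x^2)
coeffs = series_coefficients([3, -5], [1, -4, 3], 10)
print(coeffs)
```

The resulting coefficients can be compared with $2(3^n) + 1$ and with the original relation $a_{n+2} - 5a_{n+1} + 6a_n = 2$.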
We consider a third example, which has a familiar result.
EXAMPLE 10.40
Let $n \in \mathbf{N}$. For $r \ge 0$, let a(n, r) = the number of ways we can select, with repetitions
allowed, r objects from a set of n distinct objects.
For $n \ge 1$, let $\{b_1, b_2, \ldots, b_n\}$ be the set of these objects, and consider object $b_1$. Exactly
one of two things can happen.
a) The object $b_1$ is never selected. Hence the r objects are selected from $\{b_2, \ldots, b_n\}$.
This we can do in a(n − 1, r) ways.
b) The object $b_1$ is selected at least once. Then we must select r − 1 objects from
$\{b_1, b_2, \ldots, b_n\}$, so we can continue to select $b_1$ in addition to the one selection
of it we've already made. There are a(n, r − 1) ways to accomplish this.
Then a(n, r) = a(n − 1, r) + a(n, r − 1) because these two cases cover all possibilities
and are mutually disjoint.
Let $f_n = \sum_{r=0}^{\infty} a(n, r)x^r$ be the generating function for the sequence a(n, 0), a(n, 1),
a(n, 2), .... [Here $f_n$ is an abbreviation for $f_n(x)$.] From a(n, r) = a(n − 1, r) +
a(n, r − 1), where $n \ge 1$ and $r \ge 1$, it follows that
  $a(n, r)x^r = a(n-1, r)x^r + a(n, r-1)x^r$  and
  $\sum_{r=1}^{\infty} a(n, r)x^r = \sum_{r=1}^{\infty} a(n-1, r)x^r + \sum_{r=1}^{\infty} a(n, r-1)x^r.$
Realizing that a(n, 0) = 1 for $n \ge 0$ and a(0, r) = 0 for r > 0, we write
  $f_n - a(n, 0) = f_{n-1} - a(n-1, 0) + x\sum_{r=1}^{\infty} a(n, r-1)x^{r-1},$
so $f_n - 1 = f_{n-1} - 1 + xf_n$. Therefore, $f_n - xf_n = f_{n-1}$, or $f_n = f_{n-1}/(1 - x)$.
If n = 5, for example, then
  $f_5 = \frac{f_4}{(1-x)} = \frac{f_3}{(1-x)^2} = \frac{f_2}{(1-x)^3} = \frac{f_1}{(1-x)^4} = \frac{f_0}{(1-x)^5} = \frac{1}{(1-x)^5},$
since $f_0 = a(0, 0) + a(0, 1)x + a(0, 2)x^2 + \cdots = 1 + 0 + 0 + \cdots$.
In general, $f_n = 1/(1 - x)^n = (1 - x)^{-n}$, so a(n, r) is the coefficient of $x^r$ in $(1 - x)^{-n}$,
which is $\binom{-n}{r}(-1)^r = \binom{n+r-1}{r}$.
[Here we dealt with a recurrence relation for a(n, r), a discrete function of the two
(integer) variables $n, r \ge 0$.]
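The two-variable recurrence can be evaluated directly and compared with the binomial coefficient obtained from the generating function. This is a minimal Python sketch under the boundary conditions stated in the example, a(n, 0) = 1 and a(0, r) = 0 for r > 0; the memoised function name `a` simply mirrors the notation of the text.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def a(n, r):
    """Selections of r objects, with repetition, from n distinct objects."""
    if r == 0:
        return 1          # a(n, 0) = 1 for n >= 0
    if n == 0:
        return 0          # a(0, r) = 0 for r > 0
    return a(n - 1, r) + a(n, r - 1)


for n in range(1, 5):
    print([a(n, r) for r in range(6)])
```

For $n \ge 1$ each row should agree with $\binom{n+r-1}{r}$, the coefficient of $x^r$ in $(1-x)^{-n}$.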
Our last example shows how generating functions may be used to solve a system of
recurrence relations.
EXAMPLE 10.41
This example provides an approximate model for the propagation of high- and low-energy
neutrons as they strike the nuclei of fissionable material (such as uranium) and are absorbed.
Here we deal with a fast reactor where there is no moderator (such as water). (In reality,
all the neutrons have fairly high energy and there are not just two energy levels. There is a
continuous spectrum of energy levels, and these neutrons at the upper end of the spectrum
are called the high-energy neutrons. The higher-energy neutrons tend to produce more new
neutrons than the lower-energy ones.)
Consider the reactor at time 0 and suppose one high-energy neutron is injected into the
system. During each time interval thereafter (about 1 microsecond, or $10^{-6}$ second) the
following events occur.
a) When a high-energy neutron interacts with a nucleus (of fissionable material), upon
absorption this results (one microsecond later) in two new high-energy neutrons and
one low-energy one.
b) For interactions involving a low-energy neutron, only one neutron of each energy level
is produced.
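Assuming that all free neutrons interact with nuclei one microsecond after their creation,
find functions of n such that
  $a_n$ = the number of high-energy neutrons,
  $b_n$ = the number of low-energy neutrons,
in the reactor after n microseconds, $n \ge 0$.
Here we have $a_0 = 1$, $b_0 = 0$ and the system of recurrence relations
  $a_{n+1} = 2a_n + b_n \quad (3)$
  $b_{n+1} = a_n + b_n. \quad (4)$
Let $f(x) = \sum_{n=0}^{\infty} a_nx^n$, $g(x) = \sum_{n=0}^{\infty} b_nx^n$ be the generating functions for the
sequences $\{a_n \mid n \ge 0\}$, $\{b_n \mid n \ge 0\}$, respectively. From Eqs. (3) and (4), when $n \ge 0$
  $a_{n+1}x^{n+1} = 2a_nx^{n+1} + b_nx^{n+1} \quad (3)'$
  $b_{n+1}x^{n+1} = a_nx^{n+1} + b_nx^{n+1}. \quad (4)'$
Summing Eq. (3)' over all $n \ge 0$, we have
  $\sum_{n=0}^{\infty} a_{n+1}x^{n+1} = 2x\sum_{n=0}^{\infty} a_nx^n + x\sum_{n=0}^{\infty} b_nx^n. \quad (3)''$
In similar fashion, Eq. (4)' yields
  $\sum_{n=0}^{\infty} b_{n+1}x^{n+1} = x\sum_{n=0}^{\infty} a_nx^n + x\sum_{n=0}^{\infty} b_nx^n. \quad (4)''$
Introducing the generating functions at this point, we get
  $f(x) - a_0 = 2xf(x) + xg(x) \quad (3)'''$
  $g(x) - b_0 = xf(x) + xg(x), \quad (4)'''$
a system of equations relating the generating functions. Solving this system, we find that
  $f(x) = \frac{1 - x}{x^2 - 3x + 1} = \left(\frac{5 + \sqrt{5}}{10}\right)\left(\frac{1}{\gamma - x}\right) + \left(\frac{5 - \sqrt{5}}{10}\right)\left(\frac{1}{\delta - x}\right)$  and
  $g(x) = \frac{x}{x^2 - 3x + 1} = \left(\frac{-5 - 3\sqrt{5}}{10}\right)\left(\frac{1}{\gamma - x}\right) + \left(\frac{-5 + 3\sqrt{5}}{10}\right)\left(\frac{1}{\delta - x}\right),$
where
  $\gamma = \frac{3 + \sqrt{5}}{2}, \quad \delta = \frac{3 - \sqrt{5}}{2}.$
Consequently,
  $a_n = \left(\frac{5 + \sqrt{5}}{10}\right)\left(\frac{3 - \sqrt{5}}{2}\right)^{n+1} + \left(\frac{5 - \sqrt{5}}{10}\right)\left(\frac{3 + \sqrt{5}}{2}\right)^{n+1}$  and
  $b_n = \left(\frac{-5 - 3\sqrt{5}}{10}\right)\left(\frac{3 - \sqrt{5}}{2}\right)^{n+1} + \left(\frac{-5 + 3\sqrt{5}}{10}\right)\left(\frac{3 + \sqrt{5}}{2}\right)^{n+1}, \quad n \ge 0.$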
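The system of Eqs. (3) and (4) is easy to simulate, which gives an independent check on closed-form solutions of this kind. The Python sketch below is illustrative (the names `neutron_counts` and `closed_form` are assumptions); it iterates the system from $a_0 = 1$, $b_0 = 0$ and evaluates, in floating point, the expressions in $(3 \pm \sqrt{5})/2$ that follow from the partial fractions with $\gamma\delta = 1$.

```python
from math import sqrt, isclose


def neutron_counts(steps):
    """Iterate a_{n+1} = 2a_n + b_n, b_{n+1} = a_n + b_n from (a_0, b_0) = (1, 0)."""
    a, b = 1, 0
    history = [(a, b)]
    for _ in range(steps):
        a, b = 2 * a + b, a + b
        history.append((a, b))
    return history


def closed_form(n):
    """Closed-form (a_n, b_n) in terms of gamma = (3 + sqrt 5)/2, delta = (3 - sqrt 5)/2."""
    s = sqrt(5)
    gamma, delta = (3 + s) / 2, (3 - s) / 2
    a_n = (5 + s) / 10 * delta ** (n + 1) + (5 - s) / 10 * gamma ** (n + 1)
    b_n = (-5 - 3 * s) / 10 * delta ** (n + 1) + (-5 + 3 * s) / 10 * gamma ** (n + 1)
    return a_n, b_n


history = neutron_counts(20)
print(history[:6])
print([tuple(round(v) for v in closed_form(n)) for n in range(6)])
```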
EXERCISES 10.4
1. Solve the following recurrence relations by the method of generating functions.
  a) $a_{n+1} - a_n = 3^n$, $n \ge 0$, $a_0 = 1$
  b) $a_{n+1} - a_n = n^2$, $n \ge 0$, $a_0 = 1$
  c) $a_{n+2} - 3a_{n+1} + 2a_n = 0$, $n \ge 0$, $a_0 = 1$, $a_1 = 6$
  d) $a_{n+2} - 2a_{n+1} + a_n = 2^n$, $n \ge 0$, $a_0 = 1$, $a_1 = 2$
2. For n distinct objects, let a(n, r) denote the number of ways we can select, without
repetition, r of the n objects when $0 \le r \le n$. Here a(n, r) = 0 when r > n. Use the recurrence
relation a(n, r) = a(n − 1, r − 1) + a(n − 1, r), where $n \ge 1$ and $r \ge 1$, to show that
$f(x) = (1 + x)^n$ generates a(n, r), $r \ge 0$.
3. Solve the following systems of recurrence relations.
  a) $a_{n+1} = -2a_n - 4b_n$
     $b_{n+1} = 4a_n + 6b_n$
     $n \ge 0$, $a_0 = 1$, $b_0 = 0$
  b) $a_{n+1} = 2a_n - b_n + 2$
     $b_{n+1} = -a_n + 2b_n - 1$
     $n \ge 0$, $a_0 = 0$, $b_0 = 1$
10.5 A Special Kind of Nonlinear Recurrence Relation (Optional)
Thus far our study of recurrence relations has dealt with linear relations with constant
coefficients. The study of nonlinear recurrence relations and of relations with variable
coefficients is not a topic we shall pursue except for one special nonlinear relation that
lends itself to the method of generating functions.
We shall develop the method in a counting problem on data structures. Before doing
so, however, we first observe that if $f(x) = \sum_{i=0}^{\infty} a_ix^i$ is the generating function
for $a_0, a_1, a_2, \ldots$, then $[f(x)]^2$ generates $a_0a_0$, $a_0a_1 + a_1a_0$, $a_0a_2 + a_1a_1 + a_2a_0$, ...,
$a_0a_n + a_1a_{n-1} + a_2a_{n-2} + \cdots + a_{n-1}a_1 + a_na_0$, ..., the convolution of the sequence
$a_0, a_1, a_2, \ldots$, with itself.
EXAMPLE 10.42
In Sections 3.4 and 5.1, we encountered the idea of a tree diagram. In general, a tree is
an undirected graph that is connected and has no loops or cycles. Here we examine rooted
binary trees.
In Fig. 10.17 we see two such trees, where the circled vertex denotes the root. These trees
are called binary because from each vertex there are at most two edges (called branches)
descending (since a rooted tree is a directed graph) from that vertex.
In particular, these rooted binary trees are ordered in the sense that a left branch descending
from a vertex is considered different from a right branch descending from that vertex.
For the case of three vertices, the five possible ordered rooted binary trees are shown in
Fig. 10.18. (If no attention were paid to order, then the last four rooted trees would be the
same structure.)
Figure 10.17        Figure 10.18 [trees (1) through (5)]
Our objective is to count, for $n \ge 0$, the number $b_n$ of rooted ordered binary trees on n
vertices. Assuming that we know the values of $b_i$ for $0 \le i \le n$, in order to obtain $b_{n+1}$ we
select one vertex as the root and note, as in Fig. 10.19, that the substructures descending on
the left and right of the root are smaller (rooted ordered binary) trees whose total number of
vertices is n. These smaller trees are called subtrees of the given tree. Among these possible
subtrees is the empty subtree, of which there is only 1 $(= b_0)$.
Figure 10.19 [root with left subtree and right subtree]
Now consider how the n vertices in these two subtrees can be divided up.
(1) 0 vertices on the left, n vertices on the right. This results in $b_0b_n$ overall
substructures to be counted in $b_{n+1}$.
(2) 1 vertex on the left, n − 1 vertices on the right, yielding $b_1b_{n-1}$ rooted ordered
binary trees on n + 1 vertices.
...
(i + 1) i vertices on the left, n − i on the right, for a count of $b_ib_{n-i}$ toward $b_{n+1}$.
...
(n + 1) n vertices on the left and none on the right, contributing $b_nb_0$ of the trees.
Hence, for all $n \ge 0$,
  $b_{n+1} = b_0b_n + b_1b_{n-1} + b_2b_{n-2} + \cdots + b_{n-1}b_1 + b_nb_0,$
and
  $\sum_{n=0}^{\infty} b_{n+1}x^{n+1} = \sum_{n=0}^{\infty} (b_0b_n + b_1b_{n-1} + \cdots + b_{n-1}b_1 + b_nb_0)x^{n+1}. \quad (1)$
Now let $f(x) = \sum_{n=0}^{\infty} b_nx^n$ be the generating function for $b_0, b_1, b_2, \ldots$. We rewrite
Eq. (1) as
  $(f(x) - b_0) = x\sum_{n=0}^{\infty} (b_0b_n + b_1b_{n-1} + \cdots + b_nb_0)x^n = x[f(x)]^2.$
This brings us to the quadratic [in f(x)]
  $x[f(x)]^2 - f(x) + 1 = 0$, so $f(x) = [1 \pm \sqrt{1 - 4x}]/(2x)$.
But $\sqrt{1 - 4x} = (1 - 4x)^{1/2} = \binom{1/2}{0} + \binom{1/2}{1}(-4x) + \binom{1/2}{2}(-4x)^2 + \cdots$, where the
coefficient of $x^n$, $n \ge 1$, is
  $\binom{1/2}{n}(-4)^n = \frac{(1/2)((1/2) - 1)((1/2) - 2) \cdots ((1/2) - n + 1)}{n!}(-4)^n$
  $= (-1)^{n-1}\frac{(1/2)(1/2)(3/2) \cdots ((2n - 3)/2)}{n!}(-4)^n$
  $= \frac{(-1)2^n(1)(3) \cdots (2n - 3)}{n!}$
  $= \frac{(-1)2^n(n!)(1)(3) \cdots (2n - 3)(2n - 1)}{(n!)(n!)(2n - 1)}$
  $= \frac{(-1)(2)(4) \cdots (2n)(1)(3) \cdots (2n - 1)}{(2n - 1)(n!)(n!)} = \frac{(-1)}{(2n - 1)}\binom{2n}{n}.$
In f(x) we select the negative radical; otherwise, we would have negative values for the
$b_n$'s. Then
  $f(x) = \frac{1}{2x}\left[1 - \left[1 - \sum_{n=1}^{\infty} \frac{1}{(2n - 1)}\binom{2n}{n}x^n\right]\right],$
and $b_n$, the coefficient of $x^n$ in f(x), is half the coefficient of $x^{n+1}$ in
  $\sum_{n=1}^{\infty} \frac{1}{(2n - 1)}\binom{2n}{n}x^n.$
So
  $b_n = \frac{1}{2}\left[\frac{1}{2(n + 1) - 1}\right]\binom{2(n + 1)}{n + 1} = \frac{(2n)!}{(n + 1)!(n!)} = \frac{1}{(n + 1)}\binom{2n}{n}.$
The numbers $b_n$ are called the Catalan numbers, the same sequence of numbers we
encountered in Section 1.5. As we mentioned earlier (following Example 1.42), these numbers
are named after the Belgian mathematician Eugène Charles Catalan (1814-1894), who used
them in determining the number of ways to parenthesize the expression $x_1x_2x_3 \cdots x_n$. The
first nine Catalan numbers are $b_0 = 1$, $b_1 = 1$, $b_2 = 2$, $b_3 = 5$, $b_4 = 14$, $b_5 = 42$, $b_6 = 132$,
$b_7 = 429$, and $b_8 = 1430$.
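The convolution recurrence $b_{n+1} = b_0b_n + b_1b_{n-1} + \cdots + b_nb_0$ with $b_0 = 1$ can be run directly and compared with the formula $\frac{1}{n+1}\binom{2n}{n}$. The following is a short Python sketch; the function name `catalan_by_convolution` is an assumption made for illustration.

```python
from math import comb


def catalan_by_convolution(count):
    """Return the first `count` values b_0, b_1, ... from the convolution recurrence."""
    b = [1]  # b_0 = 1: the empty tree
    for n in range(count - 1):
        # b_{n+1} = sum over i of b_i * b_{n-i} (i vertices left, n - i right)
        b.append(sum(b[i] * b[n - i] for i in range(n + 1)))
    return b


first_nine = catalan_by_convolution(9)
print(first_nine)
```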
We continue now with a second application of the Catalan numbers. This is based on an
example given by Shimon Even. (See page 86 of reference [6].)
EXAMPLE 10.43
An important data structure that arises in computer science is the stack. This structure allows
the storage of data items according to the following restrictions.
1) All insertions take place at one end of the structure. This is called the top of the stack,
and the insertion process is referred to as the push procedure.
2) All deletions from the (nonempty) stack also take place from the top. We call the
deletion process the pop procedure.
Since the last item inserted in this structure is the first item that can then be popped out
of it, the stack is often referred to as a "last-in-first-out" (LIFO) structure.
Intuitive models for this data structure include a pile of poker chips on a table, a stack
of trays in a cafeteria, and the discard pile used in playing certain card games. In all three
of these cases, we can only (1) insert a new entry at the top of the pile or stack or (2) take
(delete) the entry at the top of the (nonempty) pile or stack.
Here we shall use this data structure, with its push and pop procedures, to help us permute
the (ordered) list 1, 2, 3, ..., n, for $n \in \mathbf{Z}^+$. The diagram in Fig. 10.20 shows how each
integer of the input 1, 2, 3, ..., n must be pushed onto the top of the stack in the order
given. However, we may pop an entry from the top of the (nonempty) stack at any time.
But once an entry is popped from the stack, it may not be returned to either the top of the
stack or the input left to be pushed onto the stack. The process continues until no entry is
left in the stack. Thus the ordered sequence of elements popped from the stack determines
a permutation of 1, 2, 3, ..., n.
Figure 10.20 [input 1, 2, 3, ..., n; stack; output]
If n = 1, our input list consists of only the integer 1. We insert 1 at the top of the (empty)
stack and then pop it out. This results in the permutation 1.
For n = 2, there are two permutations possible for 1, 2, and we can get both of them
using the stack.
1) To get 1, 2 we place 1 at the top of the (empty) stack and then pop it. Then 2 is placed
at the top of the (empty) stack and it is popped.
2) The permutation 2, 1 is obtained when 1 is placed at the top of the (empty) stack and
2 is then pushed onto the top of this (nonempty) stack. Upon popping first 2 from the
top of the stack, and then 1, we obtain 2, 1.
Turning to the case where n = 3, we find that we can obtain only five of the 3! = 6
possible permutations of 1, 2, 3 in this situation. For example, the permutation 2, 3, 1
results when we take the following steps.
• Place 1 at the top of the (empty) stack.
• Push 2 onto the top of the stack (on top of 1).
• Pop 2 from the stack.
• Push 3 onto the top of the stack (on top of 1).
• Pop 3 from the stack.
• Pop 1 from the stack, leaving it empty.
The reason we fail to obtain all six permutations of 1, 2, 3 is that we cannot generate
the permutation 3, 1, 2 using the stack. For in order to have 3 in the first position of the
permutation, we must build the stack by first pushing 1 onto the (empty) stack, then pushing
2 onto the top of the stack (on top of 1), and finally pushing 3 onto the stack (on top of 2).
After 3 is popped from the top of the stack, we get 3 as the first number in our permutation.
But with 2 now at the top of the stack, we cannot pop 1 until after 2 has been popped, so
the permutation 3, 1, 2 cannot be generated.
When n = 4, there are 14 permutations of the (ordered) list 1, 2, 3, 4 that can be generated
by the stack method. We list them in the four columns of Table 10.4 according to the
location of 1 in the permutation.
Table 10.4
1, 2, 3, 4    2, 1, 3, 4    2, 3, 1, 4    2, 3, 4, 1
1, 2, 4, 3    2, 1, 4, 3    3, 2, 1, 4    2, 4, 3, 1
1, 3, 2, 4                                3, 2, 4, 1
1, 3, 4, 2                                3, 4, 2, 1
1, 4, 3, 2                                4, 3, 2, 1
1) There are five permutations with 1 in the first position, because after 1 is pushed onto
and popped from the stack, there are five ways to permute 2, 3, 4 using the stack.
2) When 1 is in the second position, 2 must be in the first position. This is because we
pushed 1 onto the (empty) stack, then pushed 2 on top of it and then popped 2 and
then 1. There are two permutations in column 2, because 3, 4 can be permuted in two
ways on the stack.
3) For column 3 we have 1 in position three. We note that the only numbers that can
precede it are 2 and 3, which can be permuted on the stack (with 1 on the bottom) in
two ways. Then 1 is popped, and we push 4 onto the (empty) stack and then pop it.
4) In the last column we obtain five permutations: After we push 1 onto the top of the
(empty) stack, there are five ways to permute 2, 3, 4 using the stack (with 1 on the
bottom). Then 1 is popped from the stack to complete the permutation.
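The stack process described so far is easy to simulate. The following Python sketch (the names `stack_generable` and `stack_permutations` are illustrative and do not come from the text) tests each permutation of 1, 2, ..., n by pushing the inputs in increasing order until the required value sits on top of the stack. Under that assumption it should reproduce the counts 1, 2, 5, 14 found for n = 1, 2, 3, 4.

```python
from itertools import permutations


def stack_generable(perm):
    """Return True if perm can be produced from the input 1, 2, ..., n with one stack."""
    n = len(perm)
    stack = []
    next_input = 1  # next integer waiting to be pushed
    for target in perm:
        # Push inputs until the wanted value is on top (or the input runs out).
        while (not stack or stack[-1] != target) and next_input <= n:
            stack.append(next_input)
            next_input += 1
        if not stack or stack[-1] != target:
            return False  # target is buried under a larger value
        stack.pop()
    return True


def stack_permutations(n):
    """List all permutations of 1..n that the stack can generate."""
    return [p for p in permutations(range(1, n + 1)) if stack_generable(p)]


counts = [len(stack_permutations(n)) for n in range(1, 5)]
print(counts)
```

For n = 3 the only permutation rejected is 3, 1, 2, in agreement with the argument given for that case.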
On the basis of these observations, for $1 \le i \le 4$, let $a_i$ count the number of ways to
permute the integers 1, 2, 3, ..., i (or any list of i consecutive integers) using the stack.
Also, we define $a_0 = 1$ since there is only one way to permute nothing, using the stack.
Then
\[ a_4 = a_0 a_3 + a_1 a_2 + a_2 a_1 + a_3 a_0, \]
where
a) Each summand $a_j a_k$ satisfies $j + k = 3$.
b) The subscript j tells us that there are j integers to the left of 1 in the permutation;
in particular, for $j \ge 1$, these are the integers from 2 to j + 1, inclusive.
c) The subscript k indicates that there are k integers to the right of 1 in the permutation;
for $k \ge 1$, these are the integers from 4 − (k − 1) to 4.
This permutation problem can now be generalized to any $n \in \mathbf{N}$, so that
\[ a_{n+1} = a_0 a_n + a_1 a_{n-1} + a_2 a_{n-2} + \cdots + a_{n-1} a_1 + a_n a_0, \]
with $a_0 = 1$. From the result in Example 10.42 we know that
\[ a_n = \frac{1}{n+1}\binom{2n}{n}. \]
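As a quick numerical check, the nonlinear recurrence and the closed form can be compared term by term. This is a minimal Python sketch; the function names are illustrative, and it only confirms agreement for the first few values rather than proving the formula.

```python
from math import comb


def catalan_by_recurrence(limit):
    """Return [a_0, ..., a_limit] using a_{n+1} = sum of a_i * a_{n-i}, with a_0 = 1."""
    a = [1]
    for n in range(limit):
        a.append(sum(a[i] * a[n - i] for i in range(n + 1)))
    return a


def catalan_closed_form(n):
    """Return (1/(n+1)) * C(2n, n); the division is always exact."""
    return comb(2 * n, n) // (n + 1)


print(catalan_by_recurrence(6))
print([catalan_closed_form(n) for n in range(7)])
```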
Now let us make one final observation about the permutations in Table 10.4. Consider,
for example, the permutation 3, 2, 4, 1. How did this permutation come about? First 1 is
pushed onto the empty stack. This is then followed by pushing 2 on top of 1 and then
pushing 3 on top of 2. Now 3 is popped from the top of the stack, leaving 2 and 1; then 2
is popped from the top of the stack, leaving just 1. At this point 4 is pushed on top of 1 and
then popped, leaving 1 on the stack. Finally, 1 is popped from the (top of the) stack, leaving
the stack empty. So the permutation 3, 2, 4, 1 comes about from the following sequence of
four pushes and four pops:
push, push, push, pop, pop, push, pop, pop.
Now replace each “push” with a “1” and each “pop” with a “0”. The result is the sequence
1, 1, 1, 0, 0, 1, 0, 0.
Similarly, the permutation 1, 3, 4, 2 is determined by the sequence
push, pop, push, push, pop, push, pop, pop
and this corresponds with the sequence
1, 0, 1, 1, 0, 1, 0, 0.
In fact, each permutation in Table 10.4 gives rise to a sequence of four 1’s and four 0’s.
But there are 8!/(4! 4!) = 70 ways to list four 1’s and four 0’s. Do these 14 sequences have
some special property? Yes! As we go from left to right in each of these sequences, the
number of 1’s (pushes) is never exceeded by the number of 0’s (pops) [just like in part (b)
of Example 1.43, another situation counted by the Catalan numbers].
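The prefix property can be checked by brute force. The sketch below (Python; `valid_push_pop_sequences` is a hypothetical helper name) generates all 70 arrangements of four 1's and four 0's and keeps those in which the pops never outnumber the pushes in any prefix.

```python
from itertools import combinations


def valid_push_pop_sequences(n):
    """Return all 0/1 lists with n ones and n zeros whose every prefix has #1 >= #0."""
    valid = []
    for one_positions in combinations(range(2 * n), n):
        seq = [1 if i in one_positions else 0 for i in range(2 * n)]
        pushes = pops = 0
        ok = True
        for bit in seq:
            pushes += bit
            pops += 1 - bit
            if pops > pushes:  # a pop would be attempted on an empty stack
                ok = False
                break
        if ok:
            valid.append(seq)
    return valid


sequences = valid_push_pop_sequences(4)
print(len(sequences))
```

Both sequences worked out above, 1, 1, 1, 0, 0, 1, 0, 0 and 1, 0, 1, 1, 0, 1, 0, 0, appear in the resulting list.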
Our last example for this section is comparable to Example 10.17. Once again we see
that we must guard against trying to obtain a general result without a general argument — no
matter what a few special cases might suggest.
EXAMPLE 10.44
Here we start with n distinct objects and, for $n \ge 1$, we distribute them among at most n
identical containers, but we do not allow more than three objects in any container, and we
are not concerned about how the objects are arranged within any one container. We let $a_n$
count the number of these distributions, and from Fig. 10.21 we see that
\[ a_0 = 1, \quad a_1 = 1, \quad a_2 = 2, \quad a_3 = 5, \quad \text{and} \quad a_4 = 14. \]
It appears that we might have the first five terms in the sequence of Catalan numbers.
Unfortunately, the pattern breaks down and we find, for example, that
$a_5 = 46 \ne 42$ (the sixth Catalan number) and
$a_6 = 166 \ne 132$ (the seventh Catalan number).
(The distributions in this example were studied by F. L. Miksa, L. Moser, and M. Wyman
in reference [22].)
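The values quoted in this example can be reproduced with a short computation. The recurrence used in the Python sketch below is not stated in the text; it is an assumption obtained by asking whether object n sits alone in its container, shares it with one of the other n − 1 objects, or shares it with two of them. The function name is illustrative.

```python
from math import comb


def limited_distributions(limit):
    """Return [a_0, ..., a_limit]: distributions of n distinct objects into
    identical containers holding at most three objects each."""
    a = [1]  # a_0 = 1
    for n in range(1, limit + 1):
        total = a[n - 1]                        # object n is alone
        if n >= 2:
            total += (n - 1) * a[n - 2]         # object n has one companion
        if n >= 3:
            total += comb(n - 1, 2) * a[n - 3]  # object n has two companions
        a.append(total)
    return a


print(limited_distributions(6))
```

The first five terms agree with the Catalan numbers, and the terms for n = 5 and n = 6 are where the two sequences part ways.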
Figure 10.21 (the distributions for n = 0, 1, 2, 3, and 4)
Other examples that involve the Catalan numbers can be found in the chapter references.
EXERCISES 10.5
1. For the rooted ordered binary trees of Example 10.42, calculate $b_4$ and draw all of these four-vertex structures.
2. Verify that for all $n \ge 0$,
\[ \frac{1}{2}\left(\frac{1}{2n+1}\right)\binom{2n+2}{n+1} = \left(\frac{1}{n+1}\right)\binom{2n}{n}. \]
3. Show that for all $n \ge 2$,
\[ \binom{2n-1}{n} - \binom{2n-1}{n-2} = \left(\frac{1}{n+1}\right)\binom{2n}{n}. \]
4. Which of the following permutations of 1, 2, 3, 4, 5, 6, 7, 8 can be obtained using the stack (of Example 10.43)?
a) 4, 2, 3, 1, 5, 6, 7, 8    b) 5, 4, 3, 6, 2, 1, 8, 7
c) 4, 5, 3, 2, 1, 8, 6, 7    d) 3, 4, 2, 1, 7, 6, 8, 5
5. Suppose that the integers 1, 2, 3, 4, 5, 6, 7, 8 are permuted using the stack (of Example 10.43). (a) How many permutations are possible? (b) How many permutations have 1 in position 4 and 5 in position 8? (c) How many permutations have 1 in position 6? (d) How many permutations start with 321?
6. This exercise deals with a problem that was first proposed by Leonhard Euler. The problem examines a given convex polygon of n ($\ge 3$) sides, that is, a polygon of n sides that satisfies the property: For all points $P_1$, $P_2$ within the interior of the polygon, the line segment joining $P_1$ and $P_2$ also lies within the interior of the polygon. Given a convex polygon of n sides, Euler wanted to count the number of ways the interior of the polygon could be triangulated (subdivided into triangles) by drawing diagonals that do not intersect.
For a convex polygon of $n \ge 3$ sides, let $t_n$ count the number of ways the interior of the polygon can be triangulated by drawing nonintersecting diagonals.
a) Define $t_2 = 1$ and verify that
\[ t_{n+1} = t_2 t_n + t_3 t_{n-1} + \cdots + t_{n-1} t_3 + t_n t_2. \]
b) Express $t_n$ as a function of n.
7. In Fig. 10.22 we have two of the five ways in which we can triangulate the interior of a convex pentagon with no intersecting diagonals. Here we have labeled four of the sides, with the letters a, b, c, d, as well as the five vertices. In part (i) we use the labels on sides a and b to give us the label ab on the diagonal connecting vertices 2 and 4. This is because this diagonal (labeled ab), together with the sides a and b, provides us with one of the interior triangles for this triangulation of the convex pentagon. Then the diagonal ab and the side c give rise to the label (ab)c on the diagonal determined by vertices 2 and 5, and the sides labeled ab, c and (ab)c provide a second interior triangle for this triangulation. Continuing in this way, we label the base edge connecting vertices 1 and 2 with ((ab)c)d, one of the five ways we can introduce parentheses in order to obtain the three products (of two numbers at a time) needed to compute abcd. The triangulation in part (ii) of the figure corresponds with the parenthesized product (ab)(cd).
a) Determine the parenthesized product involving a, b, c, d for the other three triangulations of the convex pentagon.
b) Find the parenthesized product for each of the triangulated convex hexagons in parts (iii) and (iv) of Fig. 10.22.
[From part (a) we learn that there are five ways to parenthesize the expression abcd (and five ways to triangulate a convex pentagon). Part (b) shows us two of the 14 ways one can introduce parentheses for the expression abcde (and triangulate a convex hexagon). In general, there are $\frac{1}{n+1}\binom{2n}{n}$ ways to parenthesize the expression $x_1 x_2 x_3 \cdots x_{n-1} x_n x_{n+1}$. It was in solving this problem that Eugène Charles Catalan discovered the sequence that now bears his name.]
8. For $n \ge 0$,
\[ b_n = \left(\frac{1}{n+1}\right)\binom{2n}{n} \]
is the nth Catalan number.
a) Show that for all $n \ge 0$,
\[ b_{n+1} = \frac{2(2n+1)}{(n+2)}\, b_n. \]
b) Use the result of part (a) to write a computer program (or develop an algorithm) that calculates the first 15 Catalan numbers.
9. For $n \ge 0$, evenly distribute 2n points on the circumference
of a circle, and label these points cyclically with the integers
1, 2, 3, ..., 2n. Let $a_n$ be the number of ways in which these
2n points can be paired off as n chords where no two chords
intersect. (The case for n = 3 is shown in Fig. 10.23.) Find and
solve a recurrence relation for $a_n$, $n \ge 0$.
Figure 10.22 (panels (i) and (ii): triangulated convex pentagons, with base edges labeled ((ab)c)d and (ab)(cd); panels (iii) and (iv): triangulated convex hexagons)
Figure 10.23 (noncrossing chord pairings of six points on a circle, n = 3)
10. For $n \in \mathbf{N}$, consider all paths from (0, 0) to (2n, 0) using the moves N: $(x, y) \to (x + 1, y + 1)$ and S: $(x, y) \to (x + 1, y - 1)$, where any such path can never fall below the x-axis. The five paths (generally called mountain ranges) for n = 3 are shown in Fig. 10.24. How many mountain ranges are there for each $n \in \mathbf{N}$? (Verify your claim!)
Figure 10.24 (the five mountain ranges for n = 3)
11. For $n \in \mathbf{Z}^+$, let $f: \{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}$, where f is monotone increasing [that is, $1 \le i < j \le n \Rightarrow f(i) \le f(j)$] and $f(i) \ge i$ for all $1 \le i \le n$. (a) Determine the five monotone increasing functions $f: \{1, 2, 3\} \to \{1, 2, 3\}$, where $f(i) \ge i$ for all $1 \le i \le 3$. (b) Use the graphs of the functions from part (a) to set up a one-to-one correspondence with the paths from (0, 0) to (3, 3) using the moves R: $(x, y) \to (x + 1, y)$, U: $(x, y) \to (x, y + 1)$, where each such path never falls below the line y = x. (The reader may wish to check Exercise 3 for Section 1.5.) (c) If the paths in part (b) are rotated clockwise through 45°, what results do we find? (d) How many monotone increasing functions f have domain and codomain equal to {1, 2, 3, ..., n}, for $n \in \mathbf{Z}^+$, and satisfy $f(i) \ge i$ for all $1 \le i \le n$?
12. For $n \in \mathbf{Z}^+$, let $g: \{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}$, where $g(i) \le i$ for all $1 \le i \le n$. (a) Determine the five functions $g: \{1, 2, 3\} \to \{1, 2, 3\}$ where $g(i) \le i$ for all $1 \le i \le 3$. (b) Set up a one-to-one correspondence between the functions in part (a) here and those in part (a) of the previous exercise. [You want a one-to-one correspondence that will generalize when you examine the functions $f, g: \{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}$, $n \in \mathbf{Z}^+$, where $f(i) \ge i$ and $g(i) \le i$ for all $1 \le i \le n$.] (c) How many functions g have domain and codomain equal to {1, 2, 3, ..., n}, for $n \in \mathbf{Z}^+$, and satisfy $g(i) \le i$ for all $1 \le i \le n$?
13. For $n \in \mathbf{N}$, consider the arrangements of pennies built on a contiguous row of n pennies. Each penny that is not in the bottom row (of n pennies) rests upon the two pennies below it, and there is no concern about whether heads or tails appears. The situation for n = 3 is shown in Fig. 10.25. How many such arrangements are there for a contiguous row of n pennies, $n \in \mathbf{N}$?
Figure 10.25 (the penny arrangements for n = 3)
14. For $n \in \mathbf{N}$, let $s_n$ count the number of ways one can travel from (0, 0) to (n, n) using the moves R: $(x, y) \to (x + 1, y)$, U: $(x, y) \to (x, y + 1)$, D: $(x, y) \to (x + 1, y + 1)$, where the path can never rise above the line y = x. (a) Determine $s_2$. (b) How is $s_2$ related to the Catalan numbers $b_0, b_1, b_2$? (c) How is $s_3$ related to $b_0, b_1, b_2, b_3$? What is $s_3$? (d) For $n \in \mathbf{N}$, how is $s_n$ related to $b_0, b_1, b_2, \ldots, b_n$? (The numbers $s_0, s_1, s_2, \ldots$ are known as the Schröder numbers.)
15. A one-to-one function $f: \{1, 2, 3, \ldots, n\} \to \{1, 2, 3, \ldots, n\}$ is often called a permutation. Such a permutation is termed a rise/fall permutation when f(1) < f(2), f(2) > f(3), f(3) < f(4), .... For example, if n = 4 the five permutations 1324 (where f(1) = 1, f(2) = 3, f(3) = 2, f(4) = 4), 1423, 2314, 2413, and 3412 are the rise/fall permutations (for 1, 2, 3, 4). This we denote by writing $E_4 = 5$, where, in general, $E_n$ counts the number of rise/fall permutations for 1, 2, 3, ..., n. The numbers $E_0, E_1, E_2, E_3, \ldots$ are called the Euler numbers (not to be confused with the Eulerian numbers in Example 4.21). We define $E_0 = 1$ and find that $E_1 = 1$, $E_2 = 1$.
a) Find the rise/fall permutations for 1, 2, 3. What is $E_3$?
b) Find the rise/fall permutations for 1, 2, 3, 4, 5. What is $E_5$?
c) Explain why in each rise/fall permutation of 1, 2, 3, ..., n, we find n at position 2i for some $1 \le i \le \lfloor n/2 \rfloor$, if n > 1.
d) For $n \ge 2$, show that
\[ E_n = \sum_{i=1}^{\lfloor n/2 \rfloor} \binom{n-1}{2i-1} E_{2i-1} E_{n-2i}, \quad E_0 = E_1 = 1. \]
e) Where do we find 1 in a rise/fall permutation of 1, 2, 3, ..., n?
f) For $n \ge 1$, show that
\[ E_n = \sum_{i=0}^{\lfloor (n-1)/2 \rfloor} \binom{n-1}{2i} E_{2i} E_{n-2i-1}, \quad E_0 = 1. \]
g) Prove that for $n \ge 2$,
\[ E_n = \frac{1}{2} \sum_{i=0}^{n-1} \binom{n-1}{i} E_i E_{n-i-1}, \quad E_0 = E_1 = 1. \]
h) Use the result in part (g) to find $E_6$ and $E_7$.
i) Find the Maclaurin series expansion for f(x) = sec x + tan x. Conjecture (no proof required) the sequence for which this is the exponential generating function.
10.6 Divide-and-Conquer Algorithms (Optional)*
One of the most important and widely applicable types of efficient algorithms is based on
a divide-and-conquer approach. Here the strategy, in general, is to solve a given problem
of size n (n € Z*) by
1) Solving the problem for a small value of n directly (this provides an initial condition
for the resulting recurrence relation).
2) Breaking the general problem of size n into a smaller problems of the same type
and (approximately) the same size, either $\lceil n/b \rceil$ or $\lfloor n/b \rfloor$,† where $a, b \in \mathbf{Z}^+$ with
$1 \le a < n$ and $1 < b < n$.
Then we solve the a smaller problems and use their solutions to construct a solution for the
original problem of size n. We shall be especially interested in cases where n is a power of
b, and b = 2.
We shall study those divide-and-conquer algorithms where
1) The time to solve the initial problem of size n = 1 is a constant c > 0, and
2) The time to break the given problem of size n into a smaller (similar) problems,
together with the time to combine the solutions of these smaller problems to get a
solution for the given problem, is h(n), a function of n.
Our concern here will actually be with the time-complexity function f(n) for these
algorithms. Consequently, we shall use the notation f(n) here, instead of the subscripted
notation a, that we used in the earlier sections of this chapter.
The conditions that have now been stated lead to the following recurrence relation.
\[ f(1) = c, \]
\[ f(n) = a f(n/b) + h(n), \quad \text{for } n = b^k, \ k \ge 1. \]
We note that the domain of f is $\{1, b, b^2, b^3, \ldots\} = \{b^i \mid i \in \mathbf{N}\} \subset \mathbf{Z}^+$.
*The material in this section may be skipped with no loss of continuity. It will be used in Section 12.3
to determine the time-complexity function for the merge sort algorithm. However, the result there will also be
obtained for a special case of the merge sort by another method that does not use the material developed in this
section.
†For each $x \in \mathbf{R}$, recall that $\lceil x \rceil$ denotes the ceiling of x and $\lfloor x \rfloor$ the floor of x, or greatest integer in x, where
a) $\lfloor x \rfloor = \lceil x \rceil = x$, for $x \in \mathbf{Z}$.
b) $\lfloor x \rfloor$ = the integer directly to the left of x, for $x \in \mathbf{R} - \mathbf{Z}$.
c) $\lceil x \rceil$ = the integer directly to the right of x, for $x \in \mathbf{R} - \mathbf{Z}$.
In our first result, the solution of this recurrence relation is derived for the case where
h(n) is the constant c.
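THEOREM 10.1 Let $a, b, c \in \mathbf{Z}^+$ with $b \ge 2$, and let $f: \mathbf{Z}^+ \to \mathbf{R}$. If
\[ f(1) = c, \quad \text{and} \]
\[ f(n) = a f(n/b) + c, \quad \text{for } n = b^k, \ k \ge 1, \]
then for all $n = 1, b, b^2, b^3, \ldots$,
1) $f(n) = c(\log_b n + 1)$, when a = 1, and
2) $f(n) = \dfrac{c(a n^{\log_b a} - 1)}{a - 1}$, when $a \ge 2$.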
Proof: For $k \ge 1$ and $n = b^k$, we write the following system of k equations. [Starting with
the second equation, we obtain each of these equations from its immediate predecessor by
(i) replacing each occurrence of n in the prior equation by n/b and (ii) multiplying the
resulting equation in (i) by a.]
\[
\begin{aligned}
f(n) &= a f(n/b) + c \\
a f(n/b) &= a^2 f(n/b^2) + ac \\
a^2 f(n/b^2) &= a^3 f(n/b^3) + a^2 c \\
&\ \ \vdots \\
a^{k-2} f(n/b^{k-2}) &= a^{k-1} f(n/b^{k-1}) + a^{k-2} c \\
a^{k-1} f(n/b^{k-1}) &= a^k f(n/b^k) + a^{k-1} c
\end{aligned}
\]
We see that each of the terms $a f(n/b), a^2 f(n/b^2), \ldots, a^{k-1} f(n/b^{k-1})$ occurs one time as
a summand on both the left-hand and right-hand sides of these equations. Therefore, upon
adding both sides of the k equations and canceling these common summands, we obtain
\[ f(n) = a^k f(n/b^k) + [c + ac + a^2 c + \cdots + a^{k-1} c]. \]
Since $n = b^k$ and f(1) = c, we have
\[ f(n) = a^k f(1) + c[1 + a + a^2 + \cdots + a^{k-1}] = c[1 + a + a^2 + \cdots + a^{k-1} + a^k]. \]
1) If a = 1, then f(n) = c(k + 1). But $n = b^k \iff \log_b n = k$, so $f(n) = c(\log_b n + 1)$,
for $n \in \{b^i \mid i \in \mathbf{N}\}$.
2) When $a \ge 2$, then $f(n) = \dfrac{c(1 - a^{k+1})}{1 - a} = \dfrac{c(a^{k+1} - 1)}{a - 1}$, from identity 4 of Table 9.2.
Now $n = b^k \iff \log_b n = k$, so
\[ a^k = a^{\log_b n} = (b^{\log_b a})^{\log_b n} = (b^{\log_b n})^{\log_b a} = n^{\log_b a} \]
and
\[ f(n) = \frac{c(a n^{\log_b a} - 1)}{a - 1}, \quad \text{for } n \in \{b^i \mid i \in \mathbf{N}\}. \]
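The closed forms of Theorem 10.1 can be checked numerically against the recurrence itself. In the Python sketch below (function names are illustrative), the argument is the exponent k with $n = b^k$; since $a \cdot n^{\log_b a} = a^{k+1}$, everything stays in exact integer arithmetic and the base b never has to appear explicitly.

```python
def f_by_recurrence(a, c, k):
    """Return f(b**k) for f(1) = c and f(n) = a*f(n/b) + c, by unrolling k steps."""
    value = c  # f(1) = f(b**0)
    for _ in range(k):
        value = a * value + c
    return value


def f_closed_form(a, c, k):
    """Closed form of Theorem 10.1 written in terms of k = log_b n."""
    if a == 1:
        return c * (k + 1)                    # c(log_b n + 1)
    return c * (a ** (k + 1) - 1) // (a - 1)  # c(a*n^(log_b a) - 1)/(a - 1)


for a in range(1, 6):
    for c in range(1, 5):
        for k in range(0, 9):
            assert f_by_recurrence(a, c, k) == f_closed_form(a, c, k)
print(f_closed_form(4, 7, 2))
```

The final line evaluates the case a = 4, c = 7, k = 2, that is, n = 9 when b = 3.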
EXAMPLE 10.45
a) Let $f: \mathbf{Z}^+ \to \mathbf{R}$, where
\[ f(1) = 3, \quad \text{and} \]
\[ f(n) = f(n/2) + 3, \quad \text{for } n = 2^k, \ k \in \mathbf{Z}^+. \]
So by part (1) of Theorem 10.1, with c = 3, b = 2, and a = 1, it follows that
$f(n) = 3(\log_2 n + 1)$ for $n \in \{1, 2, 4, 8, 16, \ldots\}$.
b) Suppose that $g: \mathbf{Z}^+ \to \mathbf{R}$ with
\[ g(1) = 7, \quad \text{and} \]
\[ g(n) = 4g(n/3) + 7, \quad \text{for } n = 3^k, \ k \in \mathbf{Z}^+. \]
Then with c = 7, b = 3, and a = 4, part (2) of Theorem 10.1 implies that $g(n) =
(7/3)(4n^{\log_3 4} - 1)$, when $n \in \{1, 3, 9, 27, 81, \ldots\}$.
c) Finally, consider $h: \mathbf{Z}^+ \to \mathbf{R}$, where
\[ h(1) = 5, \quad \text{and} \]
\[ h(n) = 7h(n/7) + 5, \quad \text{for } n = 7^k, \ k \in \mathbf{Z}^+. \]
Once again we use part (2) of Theorem 10.1, this time with a = b = 7 and c = 5.
Here we learn that $h(n) = (5/6)(7n^{\log_7 7} - 1) = (5/6)(7n - 1)$ for $n \in \{1, 7, 49,
343, \ldots\}$.
Considering Theorem 10.1, we must unfortunately realize that although we know about
f for $n \in \{1, b, b^2, \ldots\}$, we cannot say anything about the value of f for the integers in
$\mathbf{Z}^+ - \{1, b, b^2, \ldots\}$. So at this time we are unable to deal with the concept of f as a
time-complexity function. To overcome this, we now generalize Definition 5.23, wherein the
idea of function dominance was first introduced.
Definition 10.1 Let $f, g: \mathbf{Z}^+ \to \mathbf{R}$ with S an infinite subset of $\mathbf{Z}^+$. We say that g dominates f on S (or f is
dominated by g on S) if there exist constants $m \in \mathbf{R}^+$ and $k \in \mathbf{Z}^+$ such that $|f(n)| \le m|g(n)|$
for all $n \in S$, where $n \ge k$.
Under these conditions we also say that $f \in O(g)$ on S.
EXAMPLE 10.46 Let $f: \mathbf{Z}^+ \to \mathbf{R}$ be defined so that
\[ f(n) = n, \quad \text{for } n \in \{1, 3, 5, 7, \ldots\} = S_1, \]
\[ f(n) = n^2, \quad \text{for } n \in \{2, 4, 6, 8, \ldots\} = S_2. \]
Then $f \in O(n)$ on $S_1$ and $f \in O(n^2)$ on $S_2$. However, we cannot conclude that $f \in O(n)$.
EXAMPLE 10.47
From Example 10.45, it now follows from Definition 10.1 that
a) $f \in O(\log_2 n)$ on $\{2^k \mid k \in \mathbf{N}\}$
b) $g \in O(n^{\log_3 4})$ on $\{3^k \mid k \in \mathbf{N}\}$
c) $h \in O(n)$ on $\{7^k \mid k \in \mathbf{N}\}$.
Using Definition 10.1, we now consider the following corollaries for Theorem 10.1. The
first is a generalization of the first two results given in Example 10.47.
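COROLLARY 10.1 Let $a, b, c \in \mathbf{Z}^+$ with $b \ge 2$, and let $f: \mathbf{Z}^+ \to \mathbf{R}$. If
\[ f(1) = c, \quad \text{and} \]
\[ f(n) = a f(n/b) + c, \quad \text{for } n = b^k, \ k \ge 1, \]
then
1) $f \in O(\log_b n)$ on $\{b^k \mid k \in \mathbf{N}\}$, when a = 1, and
2) $f \in O(n^{\log_b a})$ on $\{b^k \mid k \in \mathbf{N}\}$, when $a \ge 2$.
Proof: This proof is left as an exercise for the reader.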
This second corollary changes the equal signs of Theorem 10.1 to inequalities. As a
result, the codomain of f must be restricted from $\mathbf{R}$ to $\mathbf{R}^+ \cup \{0\}$.
COROLLARY 10.2 For $a, b, c \in \mathbf{Z}^+$ with $b \ge 2$, let $f: \mathbf{Z}^+ \to \mathbf{R}^+ \cup \{0\}$. If
\[ f(1) \le c, \quad \text{and} \]
\[ f(n) \le a f(n/b) + c, \quad \text{for } n = b^k, \ k \ge 1, \]
then for all $n = 1, b, b^2, b^3, \ldots$,
1) $f \in O(\log_b n)$, when a = 1, and
2) $f \in O(n^{\log_b a})$, when $a \ge 2$.
Proof: Consider the function $g: \mathbf{Z}^+ \to \mathbf{R}^+ \cup \{0\}$, where
\[ g(1) = c, \quad \text{and} \]
\[ g(n) = a g(n/b) + c, \quad \text{for } n \in \{b, b^2, b^3, \ldots\}. \]
By Corollary 10.1,
$g \in O(\log_b n)$ on $\{b^k \mid k \in \mathbf{N}\}$, when a = 1, and
$g \in O(n^{\log_b a})$ on $\{b^k \mid k \in \mathbf{N}\}$, when $a \ge 2$.
We claim that $f(n) \le g(n)$ for all $n \in \{1, b, b^2, \ldots\}$. To prove our claim, we induct on k
where $n = b^k$. If k = 0, then $n = b^0 = 1$ and $f(1) \le c = g(1)$, so the result is true for this
first case. Assuming the result is true for some $t \in \mathbf{N}$, we have $f(n) = f(b^t) \le g(b^t) = g(n)$,
for $n = b^t$. Then for k = t + 1 and $n = b^k = b^{t+1}$, we find that
\[ f(n) = f(b^{t+1}) \le a f(b^{t+1}/b) + c = a f(b^t) + c \le a g(b^t) + c = g(b^{t+1}) = g(n). \]
Therefore, it follows by the Principle of Mathematical Induction that $f(n) \le g(n)$ for all
$n \in \{1, b, b^2, \ldots\}$. Consequently, $f \in O(g)$ on $\{b^k \mid k \in \mathbf{N}\}$, and the corollary follows
because of our earlier statement about g.
Up to this point, our study of divide-and-conquer algorithms has been predominantly
theoretical. It is high time we gave an example in which these ideas can be applied. The
following result will confirm one of our earlier examples.
EXAMPLE 10.48
For n = 1, 2, 4, 8, 16, ..., let f(n) count the number of comparisons needed to find the
maximum and minimum elements in a set $S \subset \mathbf{R}$, where |S| = n and the procedure in
Example 10.30 is used.
If n = 1, then the maximum and minimum elements are the same element. Therefore,
no comparisons are necessary and f(1) = 0.
If n > 1, then $n = 2^k$ for some $k \in \mathbf{Z}^+$, and we partition S as $S_1 \cup S_2$ where $|S_1| = |S_2| =
n/2 = 2^{k-1}$. It takes f(n/2) comparisons to find the maximum $M_i$ and the minimum $m_i$ for
each set $S_i$, i = 1, 2. For $n \ge 4$, knowing $m_1$, $M_1$, $m_2$, and $M_2$, we then compare $m_1$ with
$m_2$ and $M_1$ with $M_2$ to determine the minimum and maximum elements in S. Therefore,
\[ f(n) = 2f(n/2) + 1, \quad \text{when } n = 2, \text{ and} \]
\[ f(n) = 2f(n/2) + 2, \quad \text{when } n = 4, 8, 16, \ldots. \]
Unfortunately, these results do not provide the hypotheses of Theorem 10.1. However,
if we change our equations into the inequalities
\[ f(1) \le 2 \]
\[ f(n) \le 2f(n/2) + 2, \quad \text{for } n = 2^k, \ k \ge 1, \]
then by Corollary 10.2 the time-complexity function f(n), measured by the number of
comparisons made in this recursive procedure, satisfies $f \in O(n^{\log_2 2}) = O(n)$, for all n =
1, 2, 4, 8, ....
We can examine the relationship between this example and Example 10.30 even further.
From that earlier result, we know that if $|S| = n = 2^k$, $k \ge 1$, then the number of comparisons
f(n) we need (in the given procedure) to find the maximum and minimum elements in S is
$(3/2)(2^k) - 2$. (Note: Our statement here replaces the variable n of Example 10.30 by the
variable k.)
Since $n = 2^k$, we find that we can now write
\[ f(1) = 0 \]
\[ f(n) = f(2^k) = (3/2)(2^k) - 2 = (3/2)n - 2, \quad \text{for } n = 2, 4, 8, 16, \ldots. \]
Hence $f \in O(n)$ for $n \in \{2^k \mid k \in \mathbf{N}\}$, just as we obtained above using Corollary 10.2.
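The comparison count can be observed directly. The procedure of Example 10.30 is not reproduced in this section, so the Python sketch below is an assumption about its shape, based only on the description above: split the set in half, solve each half, then spend one comparison on the two minima and one on the two maxima. The function name `min_max` is illustrative.

```python
def min_max(values):
    """Return (minimum, maximum, comparisons) for a list whose length is a power of 2."""
    n = len(values)
    if n == 1:
        return values[0], values[0], 0           # f(1) = 0
    if n == 2:
        x, y = values
        return (x, y, 1) if x < y else (y, x, 1)  # f(2) = 2 f(1) + 1
    half = n // 2
    m1, M1, c1 = min_max(values[:half])
    m2, M2, c2 = min_max(values[half:])
    # One comparison for the minima, one for the maxima: f(n) = 2 f(n/2) + 2.
    return min(m1, m2), max(M1, M2), c1 + c2 + 2


for k in range(1, 7):
    n = 2 ** k
    smallest, largest, comparisons = min_max(list(range(n, 0, -1)))
    print(n, comparisons, 3 * n // 2 - 2)
```

Each printed row lists n, the number of comparisons counted, and the value (3/2)n − 2 for comparison.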
All of our results have required that $n = b^k$, for some $k \in \mathbf{N}$, so it is only natural to ask
whether we can do anything in the case where n is allowed to be an arbitrary positive integer.
To find out, we introduce the following idea.
Definition 10.2 A function $f: \mathbf{Z}^+ \to \mathbf{R}^+ \cup \{0\}$ is called monotone increasing if for all $m, n \in \mathbf{Z}^+$, $m < n \Rightarrow
f(m) \le f(n)$.
This permits us to consider results for all $n \in \mathbf{Z}^+$, under certain circumstances.
THEOREM 10.2 Let $f: \mathbf{Z}^+ \to \mathbf{R}^+ \cup \{0\}$ be monotone increasing, and let $g: \mathbf{Z}^+ \to \mathbf{R}$. For $b \in \mathbf{Z}^+$, $b \ge 2$,
suppose that $f \in O(g)$ for all $n \in S = \{b^k \mid k \in \mathbf{N}\}$. Under these conditions,
a) If $g \in O(\log n)$, then $f \in O(\log n)$.
b) If $g \in O(n \log n)$, then $f \in O(n \log n)$.
c) If $g \in O(n^r)$, then $f \in O(n^r)$, for $r \in \mathbf{R}^+ \cup \{0\}$.
Proof: We shall prove part (a) and leave parts (b) and (c) for the Section Exercises. Before
starting, we should note that the base for the logarithms in parts (a) and (b) is any positive
real number greater than 1.
Since $f \in O(g)$ on S, and $g \in O(\log n)$, we at least have $f \in O(\log n)$ on S. Therefore,
by Definition 10.1, there exist constants $m \in \mathbf{R}^+$ and $s \in \mathbf{Z}^+$ such that $f(n) = |f(n)| \le
m|\log n| = m \log n$ for all $n \in S$, $n \ge s$. We need to find a constant $M \in \mathbf{R}^+$ such that
$f(n) \le M \log n$ for all $n \ge s$, not just those $n \in S$.
First let us agree to choose s large enough so that $\log s \ge 1$. Now let $n \in \mathbf{Z}^+$, where
n > s but $n \notin S$. Then there exists $k \in \mathbf{Z}^+$ such that $s \le b^k < n < b^{k+1}$. Since f is monotone
increasing and positive,
\[
\begin{aligned}
f(n) \le f(b^{k+1}) &\le m \log(b^{k+1}) = m[\log(b^k) + \log b] \\
&= m \log(b^k) + m \log b \\
&\le m \log(b^k) + m \log b \log(b^k) \\
&= m(1 + \log b) \log(b^k) \\
&< m(1 + \log b) \log n.
\end{aligned}
\]
So with $M = m(1 + \log b)$ we find that for all $n \in \mathbf{Z}^+ - S$, if n > s then $f(n) < M \log n$.
Hence $f(n) \le M \log n$ for all $n \in \mathbf{Z}^+$, where $n \ge s$, and $f \in O(\log n)$.
We shall now use the result of Theorem 10.2 in determining the time-complexity function
f (n) for a searching algorithm known as binary search.
In Example 5.70 we analyzed an algorithm wherein an array $a_1, a_2, a_3, \ldots, a_n$ of
integers was searched for the presence of a particular integer called key. At that time the array
entries were not given in any particular order, so we simply compared the value of key with
those of the array elements $a_1, a_2, a_3, \ldots, a_n$. This would not be very efficient, however,
if we knew that $a_1 < a_2 < a_3 < \cdots < a_n$. (After all, one does not search a telephone book
for the telephone number of a particular person by starting at page 1 and examining every
name in succession. The alphabetical ordering of the last names is used to speed up the
searching process.) Let us look at a particular example.
EXAMPLE 10.49
Consider the array $a_1, a_2, a_3, \ldots, a_7$ of integers, where $a_1 = 2$, $a_2 = 4$, $a_3 = 5$, $a_4 = 7$,
$a_5 = 10$, $a_6 = 17$, and $a_7 = 20$, and let key = 9. We search this array as follows:
1) Compare key with the entry at the center of the array; here it is $a_4 = 7$. Since key >
$a_4$, we now concentrate on the remaining elements in the subarray $a_5, a_6, a_7$.
2) Now compare key with the center element $a_6$. Since key = 9 < 17 = $a_6$, we now turn
to the subarray (of $a_5, a_6, a_7$) that consists of those elements smaller than $a_6$. Here
this is only the element $a_5$.
3) Comparing key with $a_5$, we find that key $\ne a_5$, so key is not present in the given array
$a_1, a_2, a_3, \ldots, a_7$.
From the results of Example 10.49, we make the following observations for a general
(ordered) array of integers (or real numbers). Let $a_1, a_2, a_3, \ldots, a_n$ denote the given array,
and let key denote the integer (or real number) for which we are searching. Unlike our array
in Example 5.70, here
\[ a_1 < a_2 < a_3 < \cdots < a_n. \]
1) First we compare the value of key with the array entry at or near the center. This entry
is $a_{(n+1)/2}$ for n odd or $a_{n/2}$ for n even.
Whether n is even or odd, the array element subscripted by $c = \lfloor (n + 1)/2 \rfloor$ is the
center, or near center, element. Note that at this point 1 is the value of the smallest
subscript for the array subscripts, whereas n is the value of the largest subscript.
2) If key is $a_c$, we are finished. If not, then
a) If key exceeds $a_c$, we search (with this dividing process) the subarray $a_{c+1},
a_{c+2}, \ldots, a_n$.
b) If key is smaller than $a_c$, then the dividing process is applied in searching the
subarray $a_1, a_2, \ldots, a_{c-1}$.
The preceding observations have been used in developing the pseudocode procedure in
Fig. 10.26. Here the input is an ordered array $a_1, a_2, a_3, \ldots, a_n$ of integers, or real numbers,
in ascending order, the positive integer n (for the number of entries in the given array), and
the value of the integer variable key. If the array elements are integers (real numbers), then
key should be an integer (real number). The variables s and l are integer variables used for
storing the smallest and largest subscripts for the subscripts of the array or subarray being
searched. The integer variable c stores the index for the array (subarray) element at, or near,
the center of the array (subarray). In general, $c = \lfloor (s + l)/2 \rfloor$. The integer variable location
stores the subscript of the array entry where key is located; the value of location is 0 when
key is not present in the given array.
procedure BinarySearch(n: positive integer; key, a_1, a_2, a_3, ..., a_n: integers)
begin
  s := 1 {s is the smallest subscript of the subarray being searched}
  l := n {l is the largest subscript of the subarray being searched}
  location := 0
  while s ≤ l do
  begin
    c := ⌊(s + l)/2⌋
    if key = a_c then
    begin
      location := c
      s := l + 1
    end
    else if key < a_c then
      l := c − 1
    else s := c + 1
  end
end
Figure 10.26
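The procedure in Fig. 10.26 translates almost line for line into a runnable language. The following Python version is a sketch under one stated assumption: Python lists are indexed from 0, so `a[center - 1]` plays the role of the entry a_c, while the variables `smallest`, `largest`, and `center` (names chosen here for readability) keep the 1-based subscripts s, l, and c of the figure.

```python
def binary_search(a, key):
    """Return the 1-based subscript of key in the ascending list a, or 0 if absent."""
    smallest = 1          # s in Fig. 10.26
    largest = len(a)      # l in Fig. 10.26
    location = 0
    while smallest <= largest:
        center = (smallest + largest) // 2   # floor of (s + l)/2
        if key == a[center - 1]:
            location = center
            smallest = largest + 1           # forces the loop to stop
        elif key < a[center - 1]:
            largest = center - 1
        else:
            smallest = center + 1
    return location


array = [2, 4, 5, 7, 10, 17, 20]   # the array of Example 10.49
print(binary_search(array, 9))
print(binary_search(array, 17))
```

With the array of Example 10.49, searching for key = 9 visits the entries 7, 17, and 10 before reporting that the key is absent.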
We want to measure the (worst-case) time complexity for the algorithm implemented
in Fig. 10.26. Here f(n) will count the maximum number of comparisons (between key
and $a_c$) needed to determine whether the given number key appears in the ordered array
$a_1, a_2, a_3, \ldots, a_n$.
• For n = 1, key is compared to $a_1$ and f(1) = 1.
• When n = 2, in the worst case key is compared to $a_1$ and then to $a_2$, so f(2) = 2.
• In the case of n = 3, f(3) = 2 (in the worst case).
• When n = 4, the worst case occurs when key is first compared to $a_2$ and then a binary
search of $a_3, a_4$ follows. Searching $a_3, a_4$ requires (in the worst case) f(2) comparisons.
So f(4) = 1 + f(2) = 3.
At this point we see that $f(1) \le f(2) \le f(3) \le f(4)$, and we conjecture that f is a
monotone increasing function. To verify this, we shall use the Principle of Mathematical
Induction in its alternative form. Here we assume that for all $i, j \in \{1, 2, 3, \ldots, n\}$,
$i < j \Rightarrow f(i) \le f(j)$. Now consider the integer n + 1. We have two cases to examine.
1) n + 1 is odd: Here we write n = 2k and n + 1 = 2k + 1, for some $k \in \mathbf{Z}^+$. In the
worst case, f(n + 1) = f(2k + 1) = 1 + f(k), where 1 counts the comparison of
key with $a_{k+1}$, and f(k) counts the (maximum) number of comparisons needed in a
binary search of the subarray $a_1, a_2, \ldots, a_k$ or the subarray $a_{k+2}, a_{k+3}, \ldots, a_{2k+1}$.
Now $f(n) = f(2k) = 1 + \max\{f(k - 1), f(k)\}$. Since k − 1, k < n, by the
induction hypothesis we have $f(k - 1) \le f(k)$, so f(n) = 1 + f(k) = f(n + 1).
2) n + 1 is even: At this time we have n + 1 = 2r, for some $r \in \mathbf{Z}^+$, and in the worst
case, $f(n + 1) = 1 + \max\{f(r - 1), f(r)\} = 1 + f(r)$, by the induction hypothesis.
Therefore,
\[ f(n) = f(2r - 1) = 1 + f(r - 1) \le 1 + f(r) = f(n + 1). \]
Consequently, the function f is monotone increasing.
Now it is time to determine the worst-case time complexity for the binary search
algorithm, using the function f(n). Since
\[ f(1) = 1, \quad \text{and} \]
\[ f(n) = f(n/2) + 1, \quad \text{for } n = 2^k, \ k \ge 1, \]
it follows from Theorem 10.1 (with a = 1, b = 2, and c = 1) that
$f(n) = \log_2 n + 1$, and $f \in O(\log_2 n)$ for $n \in \{1, 2, 4, 8, \ldots\}$.
But with f monotone increasing, from Theorem 10.2 it now follows that $f \in O(\log_2 n)$ (for
all $n \in \mathbf{Z}^+$). Consequently, binary search is an $O(\log_2 n)$ algorithm, whereas the searching
algorithm of Example 5.70 is O(n). Therefore, as the value of n increases, binary search
is the more efficient algorithm — but then it requires the additional condition that the array
be ordered.
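The worst-case count f(n) can also be tabulated directly from the way the procedure of Fig. 10.26 splits an array. The Python sketch below rests on two assumptions taken from the discussion above: one comparison is charged per pass through the loop, and for a subarray of n entries the center subscript is ⌊(n + 1)/2⌋, leaving c − 1 entries on the left and n − c on the right. The function name `worst_case` is illustrative.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def worst_case(n):
    """Maximum number of key-versus-a_c comparisons for an ordered array of n entries."""
    if n == 0:
        return 0                      # an empty subarray needs no comparison
    c = (n + 1) // 2                  # center subscript when s = 1 and l = n
    return 1 + max(worst_case(c - 1), worst_case(n - c))


table = [worst_case(n) for n in range(1, 17)]
print(table)
```

The table starts 1, 2, 2, 3, matching f(1) through f(4), never decreases, and takes the value k + 1 at $n = 2^k$.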
This section has introduced some of the basic ideas in the study of divide-and-conquer
algorithms. It also extends the material first introduced on computational complexity and
the analysis of algorithms in Sections 5.7 and 5.8.
The Section Exercises include some extensions of the results developed in this section.
The reader who wants to pursue this topic further should find the chapter references both
helpful and interesting.
504 Chapter 10 Recurrence Relations
EXERCISES 10.6

1. In each of the following, $f: \mathbf{Z}^+ \to \mathbf{R}$. Solve for $f(n)$ relative to the given set $S$, and
determine the appropriate "big-Oh" form for $f$ on $S$.
a) $f(1) = 5$
$f(n) = 4f(n/3) + 5$, $n = 3, 9, 27, \ldots$
$S = \{3^i \mid i \in \mathbf{N}\}$
b) $f(1) = 7$
$f(n) = f(n/5) + 7$, $n = 5, 25, 125, \ldots$
$S = \{5^i \mid i \in \mathbf{N}\}$

2. Let $a, b, c \in \mathbf{Z}^+$ with $b \ge 2$, and let $d \in \mathbf{N}$. Prove that the solution for the recurrence
relation
$f(1) = d$
$f(n) = af(n/b) + c$, $n = b^k$, $k \ge 1$
satisfies
a) $f(n) = d + c \log_b n$, for $n = b^k$, $k \in \mathbf{N}$, when $a = 1$,
b) $f(n) = dn^{\log_b a} + (c/(a - 1))[n^{\log_b a} - 1]$, for $n = b^k$, $k \in \mathbf{N}$, when $a \ge 2$.

3. Determine the appropriate "big-Oh" forms for $f$ on $\{b^k \mid k \in \mathbf{N}\}$ in parts (a) and (b) of
Exercise 2.

4. In each of the following, $f: \mathbf{Z}^+ \to \mathbf{R}$. Solve for $f(n)$ relative to the given set $S$, and
determine the appropriate "big-Oh" form for $f$ on $S$.
a) $f(1) = 0$
$f(n) = 2f(n/5) + 3$, $n = 5, 25, 125, \ldots$
$S = \{5^i \mid i \in \mathbf{N}\}$
b) $f(1) = 1$
$f(n) = f(n/2) + 2$, $n = 2, 4, 8, \ldots$
$S = \{2^i \mid i \in \mathbf{N}\}$

5. Consider a tennis tournament for $n$ players, where $n = 2^k$, $k \in \mathbf{Z}^+$. In the first round $n/2$
matches are played, and the $n/2$ winners advance to round 2, where $n/4$ matches are played.
This halving process continues until a winner is determined.
a) For $n = 2^k$, $k \in \mathbf{Z}^+$, let $f(n)$ count the total number of matches played in the tournament.
Find and solve a recurrence relation for $f(n)$ of the form
$f(1) = d$
$f(n) = af(n/2) + c$, $n = 2, 4, 8, \ldots$,
where $a$, $c$, and $d$ are constants.
b) Show that your answer in part (a) also solves the recurrence relation
$f(1) = d$
$f(n) = f(n/2) + (n/2)$, $n = 2, 4, 8, \ldots$.

6. Complete the proofs for Corollary 10.1 and parts (b) and (c) of Theorem 10.2.

7. What is the best-case time-complexity function for binary search?

8. a) Modify the procedure in Example 10.48 as follows: For any $S \subset \mathbf{R}$, where $|S| = n$,
partition $S$ as $S_1 \cup S_2$, where $|S_1| = |S_2|$, for $n$ even, and $|S_1| = 1 + |S_2|$, for $n$ odd. Show
that if $f(n)$ counts the number of comparisons needed (in this procedure) to find the maximum
and minimum elements of $S$, then $f$ is a monotone increasing function.
b) What is the appropriate "big-Oh" form for the function $f$ of part (a)?

9. In Corollary 10.2 we were concerned with finding the appropriate "big-Oh" form for a
function $f: \mathbf{Z}^+ \to \mathbf{R}^+ \cup \{0\}$ where
$f(1) \le c$, for $c \in \mathbf{Z}^+$
$f(n) \le af(n/b) + c$,
for $a, b \in \mathbf{Z}^+$ with $b \ge 2$, and $n = b^k$, $k \in \mathbf{Z}^+$.
Here the constant $c$ in the second inequality is interpreted as the amount of time needed to
break down the given problem of size $n$ into $a$ smaller (similar) problems of size $n/b$ and to
combine the $a$ solutions of these smaller problems in order to get a solution for the original
problem of size $n$. Now we shall examine a situation wherein this amount of time is no longer
constant but depends on $n$.
a) Let $a, b, c \in \mathbf{Z}^+$, with $b \ge 2$. Let $f: \mathbf{Z}^+ \to \mathbf{R}^+ \cup \{0\}$ be a monotone increasing
function, where
$f(1) \le c$
$f(n) \le af(n/b) + cn$, for $n = b^k$, $k \in \mathbf{Z}^+$.
Use an argument similar to the one given (for equalities) in Theorem 10.1 to show that for all
$n = 1, b, b^2, b^3, \ldots$,
$f(n) \le cn \sum_{i=0}^{k} (a/b)^i$.
b) Use the result of part (a) to show that $f \in O(n \log n)$, where $a = b$. (The base for the log
function here is any real number greater than 1.)
c) When $a \ne b$, show that part (a) implies that
$f(n) \le \left(\frac{c}{a - b}\right)(a^{k+1} - b^{k+1})$.
d) From part (c), prove that (i) $f \in O(n)$, when $a < b$; and (ii) $f \in O(n^{\log_b a})$, when $a > b$.
[Note: The "big-Oh" form for $f$ here and in part (b) is for $f$ on $\mathbf{Z}^+$, not just $\{b^k \mid k \in \mathbf{N}\}$.]

10. In this exercise we briefly introduce the Master Theorem. (For more on this result,
including a proof, we refer the reader to pp. 73-84 of reference [5] by T. H. Cormen,
C. E. Leiserson, R. L. Rivest, and C. Stein.)
Consider the recurrence relation
$f(1) = 1$,
$f(n) = af(n/b) + h(n)$,
where $n \in \mathbf{Z}^+$, $n > 1$, $a \in \mathbf{Z}^+$, $a < n$, and $b \in \mathbf{R}^+$, $1 < b < n$. The function $h$ accounts for
the time (or cost) of dividing the given problem of size $n$ into $a$ smaller (similar) problems of
size approximately $n/b$ and then combining the results from
the $a$ smaller problems. Further, there exists $k \in \mathbf{Z}^+$ such that $h(n) \ge 0$ for all $n \ge k$.
(Since $n/b$ need not be an integer, the recurrence relation is not properly defined. To get
around this we need to replace $n/b$ by either $\lfloor n/b \rfloor$ or $\lceil n/b \rceil$. But as this does not affect the
outcome of the result, for large values of $n$, we shall not concern ourselves with such details.)
Under the above hypothesis we find the following [where $\Theta$ (big theta) and $\Omega$ (big omega)
are as given in Exercises 11-16 for Section 5.7]:
i) If $h \in O(n^{\log_b a - \epsilon})$, for some fixed $\epsilon > 0$, then $f \in \Theta(n^{\log_b a})$;
ii) If $h \in \Theta(n^{\log_b a})$, then $f \in \Theta(n^{\log_b a} \log_2 n)$; and
iii) If $h \in \Omega(n^{\log_b a + \epsilon})$ for some fixed $\epsilon > 0$, and if $a\,h(n/b) \le c\,h(n)$, for some fixed $c$,
where $0 < c < 1$, and for all sufficiently large $n$, then $f \in \Theta(h)$.
In all three cases, the function $h$ is compared with $n^{\log_b a}$ and, roughly speaking, the Master
Theorem then determines the complexity of the solution $f(n)$ as the larger of the two functions
in cases (i) and (iii), while in case (ii) we find the added factor $\log_2 n$. However, it is
important to realize that there are some recurrence relations of this type that do not fall under
any of these three cases.
For now we consider the following, where $f(1) = 1$ for all three examples.
1) $f(n) = 16f(n/4) + n$
Here $a = 16$, $b = 4$, $n^{\log_b a} = n^{\log_4 16} = n^2$, and $h(n) = n$. So $h \in O(n^{\log_4 16 - \epsilon})$ with $\epsilon = 1$.
Consequently, $h$ falls under the hypothesis for case (i) and it follows that $f \in \Theta(n^2)$.
2) $f(n) = f(3n/4) + 5$
Now we have $a = 1$, $b = 4/3$, $n^{\log_b a} = n^{\log_{4/3} 1} = n^0 = 1$, and $h(n) = 5$. Consequently,
$h \in \Theta(n^{\log_{4/3} 1})$ and from case (ii) we learn that $f \in \Theta(n^{\log_{4/3} 1} \log_2 n) = \Theta(\log_2 n)$.
3) $f(n) = 7f(n/8) + n \log_2 n$
For this recurrence relation we have $a = 7$, $b = 8$, $n^{\log_b a} = n^{\log_8 7} \approx n^{0.936}$, and
$h(n) = n \log_2 n$. So $h \in \Omega(n^{\log_8 7 + \epsilon})$ where $\epsilon = 0.064 > 0$. Further, for all sufficiently large $n$,
$a\,h(n/b) = 7(n/8) \log_2(n/8) = (7/8)n[\log_2 n - \log_2 8] \le (7/8)n \log_2 n = c\,h(n)$, for
$0 < c = 7/8 < 1$. Thus, $h$ satisfies the hypotheses for case (iii) and we have $f \in \Theta(n \log_2 n)$.
Use the Master Theorem to determine the complexity of $f$ in each of the following, where
$f(1) = 1$:
a) $f(n) = 9f(n/3) + n$
b) $f(n) = 2f(n/2) + 1$
c) $f(n) = f(n/3) + 1$
d) $f(n) = 2f(n/3) + n$
e) $f(n) = 4f(n/2) + n^2$
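The worked examples 1) and 3) of Exercise 10 can be sanity-checked numerically. The following is only a minimal sketch, assuming we restrict $n$ to exact powers $b^k$ so that $n/b$ is always an integer; the helper name `f_on_powers` is illustrative and not part of the text. A bounded ratio $f(n)/g(n)$ is consistent with $f \in \Theta(g)$ but, of course, proves nothing.

```python
import math

def f_on_powers(a, b, h, k_max):
    """Return [f(b**0), ..., f(b**k_max)] for f(1) = 1, f(n) = a*f(n/b) + h(n)."""
    values = [1]
    for k in range(1, k_max + 1):
        n = b ** k
        values.append(a * values[-1] + h(n))
    return values

# Example 1: f(n) = 16 f(n/4) + n, where case (i) predicts Theta(n^2).
ex1 = f_on_powers(16, 4, lambda n: n, 12)
ratios1 = [ex1[k] / (4 ** k) ** 2 for k in range(13)]

# Example 3: f(n) = 7 f(n/8) + n log2(n), where case (iii) predicts Theta(n log2 n).
ex3 = f_on_powers(7, 8, lambda n: n * math.log2(n), 12)
ratios3 = [ex3[k] / (8 ** k * math.log2(8 ** k)) for k in range(1, 13)]

print("f(n)/n^2 for example 1:", [round(r, 4) for r in ratios1])
print("f(n)/(n log2 n) for example 3:", [round(r, 4) for r in ratios3])
```

In example 1 the ratio $f(n)/n^2$ settles near $4/3$; in example 3 the ratio $f(n)/(n \log_2 n)$ stays bounded, although it changes slowly over this range.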
10.7
Summary and Historical Review
In this chapter the recurrence relation has emerged as another tool for solving combinatorial
problems. In these problems we analyze a given situation and then express the result $a_n$ in
terms of the results for certain smaller nonnegative integers. Once the recurrence relation
is determined, we can solve for any value of $a_n$ (within reason). When we have access
to a computer, such relations are particularly valuable, especially if they cannot be solved
explicitly.
The study of recurrence relations can be traced back to the Fibonacci relation $F_{n+2} =
F_{n+1} + F_n$, $n \ge 0$, $F_0 = 0$, $F_1 = 1$, which was given by Leonardo of Pisa (c. 1175-1250) in
1202. In his Liber Abaci, he deals with a problem concerning the number of pairs of rabbits
that result in one year if one starts with a single pair that produces another pair at the end
of each month. Each new pair starts to breed similarly one month after its birth, and we
assume that no rabbits die during the given year. Hence, at the end of the first month there
are two pairs of rabbits; three pairs after two months; five pairs after three months; and so
on. [As mentioned in the summary of Chapter 9, Abraham DeMoivre (1667-1754) obtained
this result by the method of generating functions in 1718.] This same sequence appears in
the work of the German mathematician Johannes Kepler (1571-1630), who used it in his
studies on how the leaves of a plant or flower are arranged about its stem. In 1844 the
French mathematician Gabriel Lamé (1795-1870) used the sequence in his analysis of the
efficiency of the Euclidean algorithm. Later, François Édouard Anatole Lucas (1842-1891),
who popularized the Towers of Hanoi puzzle, derived many properties of this sequence and
was the first to call these numbers the Fibonacci sequence.
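The rabbit problem described above can be simulated directly. This is a minimal sketch under the stated rules (each breeding pair produces one new pair at the end of every month, a new pair first breeds one month after its birth, and no rabbits die); the function name `rabbit_pairs` and the split into "breeding" and "newborn" pairs are illustrative choices, not notation from the text.

```python
def rabbit_pairs(months):
    """Total number of pairs at the end of months 0, 1, ..., months."""
    breeding, newborn = 1, 0          # start with a single breeding pair
    totals = [breeding + newborn]
    for _ in range(months):
        born = breeding               # every breeding pair produces one new pair
        breeding += newborn           # last month's newborns are now old enough
        newborn = born
        totals.append(breeding + newborn)
    return totals

print(rabbit_pairs(12))
```

The totals begin 1, 2, 3, 5, in agreement with the counts quoted in the paragraph, and each later total is the sum of the two before it, which is the Fibonacci relation shifted by two places.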
Leonardo Fibonacci (c. 1175-1250)
Reproduced courtesy of The Granger Collection, New York
For an elementary coverage of examples and properties for the Fibonacci numbers one
should examine the book by T. H. Garland [10]. Even more can be learned from the texts by
V. E. Hoggatt, Jr. [14] and S. Vajda [29]. The UMAP article by R.V. Jean [16] gives many
applications of this sequence. Chapter 8 of the mathematical exposition by R. Honsberger
[15] provides an interesting account of the Fibonacci numbers and of the related sequence
called the Lucas numbers. The text by R. L. Graham, D. E. Knuth, and O. Patashnik [12]
also includes many interesting examples and properties of both the Fibonacci numbers
and the Catalan numbers. More counterexamples for the Fibonacci and Catalan numbers,
like those found in Examples 10.17 and 10.44, respectively, can be found in the article by
R. K. Guy [13]. Additional material on the role of the golden ratio in such areas as geometry,
probability, and fractals is given in the book by H. Walser [30]. The book by T. Koshy [19]
provides a definitive history and extensive analysis of the Fibonacci and Lucas numbers,
together with a wide variety of applications, examples, and exercises.
Comparable coverage of the material presented in this chapter can be found in Chapter 3
of C. L. Liu [21]. For more on the theoretical development of linear recurrence relations
with constant coefficients, examine Chapter 9 of N. Finizio and G. Ladas [8].
Applications in probability theory dealing with recurrent events, random walks, and ruin
problems can be found in Chapters XIII and XIV of the classic text by W. Feller [7]. The
UMAP module by D. R. Sherbert [24] introduces difference equations and includes an
application in economics known as the Cobweb Theorem. The text by S. Goldberg [11] has
more on applications in the social sciences.
Recursive techniques in the generation of permutations and combinations are developed
in Chapter 4 of R. A. Brualdi [3]. The algorithm presented in Section 10.1 for the permu-
tations of {1, 2, 3, ..., n} first appeared in the work of H. D. Steinhaus [27] and is often
referred to as the adjacent mark ordering algorithm. This result was rediscovered later,
independently by H. F. Trotter [28] and S. M. Johnson [17]. Efficient sorting methods for
permutations and other combinatorial structures are analyzed in the text by D. E. Knuth
[18]. The work of E. M. Reingold, J. Nievergelt, and N. Deo [23] also deals with such
algorithms.
For those who enjoyed the rooted ordered binary trees in Section 10.5, Chapter 3 of
A. V. Aho, J. E. Hopcroft, and J. D. Ullman [1] should prove interesting. The basis for the
example on stacks is given on page 86 of the text by S. Even [6]. The article by M. Gardner [9]
provides other examples where the Catalan numbers arise. Computational considerations in
determining Catalan numbers are examined in the article by D. M. Campbell [4]. Much more
about the Catalan numbers can be found in the text by R. P. Stanley [26] —in particular, 66
situations, where these numbers arise, are provided on pp. 219-229.
Finally, the coverage on divide-and-conquer algorithms in Section 10.6 is modeled after
D. F. Stanat and D. F. McAllister's presentation in Section 5.3 of [25]. Chapter 10 of the
text by A. V. Aho, J. E. Hopcroft, and J. D. Ullman [1] provides some further information
on this topic. An application of this method in a matrix multiplication algorithm appears in
Chapter 10 of the text by C. L. Liu [20]. Additional coverage and a proof for the Master
Theorem are given in Chapter 4 of the text by T. H. Cormen, C. E. Leiserson, R. L. Rivest,
and C. Stein [5].
REFERENCES
1. Aho, Alfred V., Hopcroft, John E., and Ullman, Jeffrey D. Data Structures and Algorithms.
Reading, Mass.: Addison-Wesley, 1983.
2. Auluck, F. C. “On Some New Types of Partitions Associated with Generalized Ferrers Graphs.”
Proceedings of the Cambridge Philosophical Society 47 (1951): pp. 679-685.
3. Brualdi, Richard A. Introductory Combinatorics, 3rd ed. Upper Saddle River, N.J.: Prentice-
Hall, 1999.
4. Campbell, Douglas M. “The Computation of Catalan Numbers.” Mathematics Magazine 57,
no. 4 (September 1984): pp. 195-208.
5. Cormen, Thomas H., Leiserson, Charles E., Rivest, Ronald L., and Stein, Clifford. Introduction
to Algorithms, 2nd ed. Boston, Mass.: McGraw-Hill, 2001.
6. Even, Shimon. Graph Algorithms. Rockville, Md.: Computer Science Press, 1979.
7. Feller, William. An Introduction to Probability Theory and Its Applications, Vol. 1, 3rd ed.
New York: Wiley, 1968.
8. Finizio, N., and Ladas, G. An Introduction to Differential Equations. Belmont, Calif.:
Wadsworth Publishing Company, 1982.
9. Gardner, Martin. “Mathematical Games, Catalan Numbers: An Integer Sequence that Materi-
alizes in Unexpected Places.” Scientific American 234, no. 6 (June 1976): pp. 120-125.
10. Garland, Trudi Hammel. Fascinating Fibonaccis. Palo Alto, Calif.: Dale Seymour Publica-
tions, 1987.
11. Goldberg, Samuel. Introduction to Difference Equations. New York: Wiley, 1958.
12. Graham, Ronald Lewis, Knuth, Donald Ervin, and Patashnik, Oren. Concrete Mathematics,
2nd ed. Reading, Mass.: Addison-Wesley, 1994.
13. Guy, Richard K. “The Second Strong Law of Small Numbers.” Mathematics Magazine 63,
no. 1 (February 1990): pp. 3-20.
14. Hoggatt, Verner E., Jr. Fibonacci and Lucas Numbers. Boston, Mass.: Houghton Mifflin, 1969.
15. Honsberger, Ross. Mathematical Gems II (The Dolciani Mathematical Expositions, Number
Nine). Washington, D.C.: The Mathematical Association of America, 1985.
16. Jean, Roger V. “The Fibonacci Sequence.” The UMAP Journal 5, no. 1 (1984): pp. 23-47.
17. Johnson, Selmer M. “Generation of Permutations by Adjacent Transposition.” Mathematics
of Computation 17 (1963): pp. 282-285.
18. Knuth, Donald E. The Art of Computer Programming/Volume 3 Sorting and Searching. Read-
ing, Mass: Addison-Wesley, 1973.
19. Koshy, Thomas. Fibonacci and Lucas Numbers with Applications. New York: Wiley, 2001.
20. Liu, C. L. Elements of Discrete Mathematics, 2nd ed. New York: McGraw-Hill, 1985.
21. Liu, C. L. Introduction to Combinatorial Mathematics. New York: McGraw-Hill, 1968.
22. Miksa, F. L., Moser, L., and Wyman, M. “Restricted Partitions of Finite Sets.” Canadian
Mathematics Bulletin 1 (1958): pp. 87-96.
23. Reingold, E. M., Nievergelt, J., and Deo, N. Combinatorial Algorithms: Theory and Practice.
Englewood Cliffs, N.J.: Prentice-Hall, 1977.
24. Sherbert, Donald R. Difference Equations with Applications, UMAP Module 322. Cambridge,
Mass.: Birkhäuser Boston, 1980.
25. Stanat, Donald F., and McAllister, David F. Discrete Mathematics in Computer Science. En-
glewood Cliffs, N.J.: Prentice-Hall, 1977.
26. Stanley, Richard P. Enumerative Combinatorics, Vol. 2. New York: Cambridge University
Press, 1999.
27. Steinhaus, Hugo D. One Hundred Problems in Elementary Mathematics. New York: Basic
Books, 1964.
28. Trotter, H. F. “ACM Algorithm 115 — Permutations.” Communications of the ACM 5 (1962):
pp. 434-435.
29. Vajda, S. Fibonacci & Lucas Numbers, and the Golden Section. New York: Halsted Press (a
division of John Wiley & Sons), 1989.
30. Walser, Hans. The Golden Section. Washington, D.C.: The Mathematical Association of Amer-
ica, 2001.
SUPPLEMENTARY EXERCISES

1. For $n \in \mathbf{Z}^+$ and $n \ge k + 1 \ge 1$, verify algebraically the recursion formula
$\binom{n}{k+1} = \left(\frac{n-k}{k+1}\right)\binom{n}{k}$.

2. a) For $n \ge 0$, let $B_n$ denote the number of partitions of $\{1, 2, 3, \ldots, n\}$. Set $B_0 = 1$ for the
partitions of $\emptyset$. Verify that for all $n \ge 0$,
$B_{n+1} = \sum_{i=0}^{n} \binom{n}{n-i} B_i = \sum_{i=0}^{n} \binom{n}{i} B_i$.
[The numbers $B_i$, $i \ge 0$, are referred to as the Bell numbers after Eric Temple Bell
(1883-1960).]
b) How are the Bell numbers related to the Stirling numbers of the second kind?

3. Let $n, k \in \mathbf{Z}^+$, and define $p(n, k)$ to be the number of partitions of $n$ into exactly $k$
(positive-integer) summands. Prove that $p(n, k) = p(n - 1, k - 1) + p(n - k, k)$.

4. For $n \ge 1$, let $a_n$ count the number of ways to write $n$ as an ordered sum of odd positive
integers. (For example, $a_4 = 3$ since $4 = 3 + 1 = 1 + 3 = 1 + 1 + 1 + 1$.) Find and solve a
recurrence relation for $a_n$.

5. Let $A = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}$.
a) Compute $A^2$, $A^3$, and $A^4$.
b) Conjecture a general formula for $A^n$, $n \in \mathbf{Z}^+$, and establish your conjecture by the
Principle of Mathematical Induction.

6. Let $M = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}$.
a) Compute $M^2$, $M^3$, and $M^4$.
b) Conjecture a general formula for $M^n$, $n \in \mathbf{Z}^+$, and establish your conjecture by the
Principle of Mathematical Induction.

7. Determine the points of intersection of the parabola $y = x^2 - 1$ and the hyperbola
$y = 1 + \frac{1}{x}$.

8. Let $\alpha = (1 + \sqrt{5})/2$ and $\beta = (1 - \sqrt{5})/2$.
a) Verify that $\alpha^2 = \alpha + 1$ and $\beta^2 = \beta + 1$.
b) Prove that for all $n \ge 0$, $\sum_{k=0}^{n} \binom{n}{k} F_k = F_{2n}$.
c) Show that $\alpha^3 = 1 + 2\alpha$ and $\beta^3 = 1 + 2\beta$.
d) Prove that for all $n \ge 0$, $\sum_{k=0}^{n} \binom{n}{k} 2^k F_k = F_{3n}$.

9. a) For $\alpha = (1 + \sqrt{5})/2$, verify that $\alpha^2 + 1 = 2 + \alpha$ and $(2 + \alpha)^2 = 5\alpha^2$.
b) Show that for $\beta = (1 - \sqrt{5})/2$, $\beta^2 + 1 = 2 + \beta$ and $(2 + \beta)^2 = 5\beta^2$.
c) If $n, m \in \mathbf{N}$ prove that
$\sum_{k=0}^{2n} \binom{2n}{k} F_{2k+m} = 5^n F_{2n+m}$.

10. Renu wants to sell her laptop for $4000. Narmada offers to buy it for $3000. Renu then
splits the difference and asks for $3500. Narmada likewise splits the difference and makes a new
offer of $3250. (a) If the women continue this process (of asking prices and counteroffers), what
will Narmada be willing to pay on her 5th offer? 10th offer? kth offer, $k \ge 1$? (b) If the
women continue this process (providing many, many new asking prices and counteroffers), what
price will they approach? (c) Suppose that Narmada was willing to buy the laptop for $3200.
What should she have offered to pay Renu the first time?

11. Parts (a) and (b) of Fig. 10.27 provide the Hasse diagrams for two partial orders referred
to as the fences $\mathcal{F}_5$, $\mathcal{F}_6$ [on 5, 6 (distinct) elements, respectively]. If, for instance, $\mathcal{R}$
denotes the
partial order for the fence $\mathcal{F}_5$, then $a_1 \mathcal{R} a_2$, $a_3 \mathcal{R} a_2$, $a_3 \mathcal{R} a_4$, and $a_5 \mathcal{R} a_4$. For each such
fence $\mathcal{F}_n$, $n \ge 1$, we follow the convention that an element with an odd subscript is minimal
and one with an even subscript is maximal. Let $(\{1, 2\}, \le)$ denote the partial order where $\le$
denotes the usual "less than or equal to" relation. As in Exercise 26 of Section 7.3, a function
$f: \mathcal{F}_n \to \{1, 2\}$ is called order-preserving when for all $x, y \in \mathcal{F}_n$, $x \mathcal{R} y \Rightarrow f(x) \le f(y)$.
Let $c_n$ count the number of such order-preserving functions. Find and solve a recurrence
relation for $c_n$.

Figure 10.27 Hasse diagrams of the fences (a) $\mathcal{F}_5$ on $a_1, \ldots, a_5$ and (b) $\mathcal{F}_6$ on $b_1, \ldots, b_6$

12. For $n \ge 0$, let $m = \lfloor (n + 1)/2 \rfloor$. Prove that $F_{n+2} = \sum_{k=0}^{m} \binom{n-k+1}{k}$. (You may want to
look back at Examples 9.17 and 10.11.)

13. a) For $n \in \mathbf{Z}^+$, determine the number of ways one can tile a $1 \times n$ chessboard using
$1 \times 1$ white (square) tiles and $1 \times 2$ blue (rectangular) tiles.
b) How many of the tilings in part (a) use (i) no blue tiles; (ii) exactly one blue tile;
(iii) exactly two blue tiles; (iv) exactly three blue tiles; and (v) exactly $k$ blue tiles, where
$0 \le k \le \lfloor n/2 \rfloor$?
c) How are the results in parts (a) and (b) related?

14. Let $c = \sqrt{1 + \sqrt{1 + \sqrt{1 + \sqrt{1 + \cdots}}}}$. How is $c^2$ related to $c$? What is the value of $c$?

15. For $n \in \mathbf{Z}^+$, $d_n$ denotes the number of derangements of $\{1, 2, 3, \ldots, n\}$, as discussed in
Section 8.3.
a) If $n > 2$, show that $d_n$ satisfies the recurrence relation
$d_n = (n - 1)(d_{n-1} + d_{n-2})$, $d_2 = 1$, $d_1 = 0$.
b) How can we define $d_0$ so that the result in part (a) is valid for $n \ge 2$?
c) Rewrite the result in part (a) as
$d_n - nd_{n-1} = -[d_{n-1} - (n - 1)d_{n-2}]$.
How can $d_n - nd_{n-1}$ be expressed in terms of $d_{n-2}$, $d_{n-3}$?
d) Show that $d_n - nd_{n-1} = (-1)^n$.
e) Let $f(x) = \sum_{n=0}^{\infty} (d_n x^n)/n!$. After multiplying both sides of the equation in part (d) by
$x^n/n!$ and summing for $n \ge 2$, verify that $f(x) = (e^{-x})/(1 - x)$. Hence
$d_n = n!\left[1 - 1 + \frac{1}{2!} - \frac{1}{3!} + \cdots + \frac{(-1)^n}{n!}\right]$.

16. For $n \ge 0$, draw $n$ ovals in the plane so that each oval intersects each of the others in
exactly two points and no three ovals are coincident. If $a_n$ denotes the number of regions in the
plane that results from these $n$ ovals, find and solve a recurrence relation for $a_n$.

17. For $n \ge 0$, let us toss a coin $2n$ times.
a) If $a_n$ is the number of sequences of $2n$ tosses where $n$ heads and $n$ tails occur, find $a_n$ in
terms of $n$.
b) Find constants $r$, $s$, and $t$ so that $(r + sx)^t = f(x) = \sum_{n=0}^{\infty} a_n x^n$.
c) Let $b_n$ denote the number of sequences of $2n$ tosses where the numbers of heads and tails
are equal for the first time only after all $2n$ tosses have been made. (For example, if $n = 3$,
then HHHTTT and HHTHTT are counted in $b_3$, but HTHHTT and HHTTHT are not.)
Define $b_0 = 0$ and show that for all $n \ge 1$,
$a_n = a_0 b_n + a_1 b_{n-1} + \cdots + a_{n-1} b_1 + a_n b_0$.
d) Let $g(x) = \sum_{n=0}^{\infty} b_n x^n$. Show that $g(x) = 1 - 1/f(x)$, and then solve for $b_n$, $n \ge 1$.

18. For $\alpha = (1 + \sqrt{5})/2$ and $\beta = (1 - \sqrt{5})/2$, show that $\sum_{k=0}^{\infty} \beta^k = -\beta = \alpha - 1$ and that
$\sum_{k=0}^{\infty} |\beta|^k = \alpha^2$.

19. Let $a$, $b$, $c$ be fixed real numbers with $ab = 1$ and let $f: \mathbf{R} \times \mathbf{R} \to \mathbf{R}$ be the binary
operation, where $f(x, y) = a + bxy + c(x + y)$. Determine the value(s) of $c$ for which $f$ will
be associative.

20. a) For $\alpha = (1 + \sqrt{5})/2$ and $\beta = (1 - \sqrt{5})/2$, verify that
$\alpha^2 - \alpha^{-2} = \alpha - \beta = \beta^{-2} - \beta^2$.
b) Prove that $F_{2n} = F_{n+1}^2 - F_{n-1}^2$, $n \ge 1$.
c) For $n \ge 1$, let $T$ be an isosceles trapezoid with bases of length $F_{n-1}$ and $F_{n+1}$, and sides of
length $F_n$. Prove that the area of $T$ is $(\sqrt{3}/4)F_{2n}$. [Note that, when $n = 1$, the trapezoid
degenerates into a triangle. However, the formula is still correct.]

21. Let $\mathcal{S}$ be the sample space for an experiment $\mathcal{E}$. If $A$, $B$ are events from $\mathcal{S}$ with
$A \cup B = \mathcal{S}$, $A \cap B = \emptyset$, $Pr(A) = p$, and $Pr(B) = p^2$, determine $p$.

22. De'Jzaun and Sandra toss a loaded coin, where $Pr(H) = p > 0$. The first to obtain a head is
the winner. Sandra goes first but, if she tosses a tail, then De'Jzaun gets two chances. If he
tosses two tails, then Sandra again tosses the coin and, if her toss is a tail, then De'Jzaun
again goes twice (if his first toss is a tail). This continues until someone tosses a head. What
value of $p$ makes this a fair game (that is, a game where both Sandra and De'Jzaun have
probability $\frac{1}{2}$ of winning)?

23. For $n \ge 1$, let $a_n$ count the number of binary strings of length $n$, where there is no run of
1's of odd length. Consequently,
when $n = 6$, for instance, we want to include the strings 110000 (which has a run of two 1's
and a run of four 0's) and 011110 (which has two runs of one 0 and one run of four 1's), but we
do not include either 100011 (which starts with a run of one 1) or 110111 (which ends with a
run of three 1's). Find and solve a recurrence relation for $a_n$.

24. Let $a$, $b$ be fixed nonzero real numbers. Determine $x_n$ if $x_n = x_{n-1}x_{n-2}$, $n \ge 2$, $x_0 = a$,
$x_1 = b$.

25. a) Evaluate $F_{n+1}^2 - F_nF_{n+1} - F_n^2$ for $n = 0, 1, 2, 3$.
b) From the results in part (a), conjecture a formula for $F_{n+1}^2 - F_nF_{n+1} - F_n^2$ for $n \in \mathbf{N}$.
c) Establish the conjecture in part (b) using the Principle of Mathematical Induction.

26. Let $n \in \mathbf{Z}^+$. On a $1 \times n$ chessboard two kings are called nontaking, if they do not occupy
adjacent squares. In how many ways can one place 0 or more nontaking kings on a $1 \times n$
chessboard?

27. a) For $1 \le i \le 6$, determine the rook polynomial $r(C_i, x)$ for the chessboard $C_i$ shown in
Fig. 10.28.
b) For each rook polynomial in part (a), find the sum of the coefficients of the powers of $x$,
that is, determine $r(C_i, 1)$ for $1 \le i \le 6$.

28. (Gambler's Ruin) When Cathy and Jill play checkers, each has probability $\frac{1}{2}$ of winning.
There is never a tie, and the games are independent in the sense that no matter how many games
the girls have played, each girl still has probability $\frac{1}{2}$ of winning the next game. After each
game the loser gives the winner a quarter. If Cathy has $2.00 to play with and Jill has $2.50 and
they play until one of them is broke, what is the probability that Cathy gets wiped out?

29. For $n, m \in \mathbf{Z}^+$, let $f(n, m)$ count the number of partitions of $n$ where the summands form
a nonincreasing sequence of positive integers and no summand exceeds $m$. With $n = 4$ and
$m = 2$, for example, we find that $f(4, 2) = 3$ because here we are concerned with the three
partitions
$4 = 2 + 2$, $4 = 2 + 1 + 1$, $4 = 1 + 1 + 1 + 1$.
a) Verify that for all $n, m \in \mathbf{Z}^+$,
$f(n, m) = f(n - m, m) + f(n, m - 1)$.
b) Write a computer program (or develop an algorithm) to compute $f(n, m)$ for $n, m \in \mathbf{Z}^+$.
c) Write a computer program (or develop an algorithm) to compute $p(n)$, the number of
partitions of a given positive integer $n$.

30. Let $A$, $B$ be sets with $|A| = m \ge n = |B|$, and let $a(m, n)$ count the number of onto
functions from $A$ to $B$. Show that
$a(m, 1) = 1$
$a(m, n) = n^m - \sum_{i=1}^{n-1} \binom{n}{i} a(m, i)$, when $m \ge n > 1$.

31. When one examines the units digit of each Fibonacci number $F_n$, $n \ge 0$, one finds that
these digits form a sequence that repeats after 60 terms. [This was first proved by Joseph-Louis
Lagrange (1736-1813).] Write a computer program (or develop an algorithm) to calculate this
sequence of 60 digits.

Figure 10.28 The chessboards $C_1, \ldots, C_6$
PART 3
GRAPH THEORY AND APPLICATIONS
11
An Introduction to Graph Theory
With this chapter we start to develop another major topic of this text. Unlike other areas
in mathematics, the theory of graphs has a definite starting place, a paper published
in 1736 by the Swiss mathematician Leonhard Euler (1707-1783). The main idea behind
this work grew out of a now-popular problem known as the seven bridges of Königsberg.
We shall examine the solution of this problem, from which Euler developed some of the
fundamental concepts for the theory of graphs.
Unlike the continuous graphs of early algebra courses, the graphs we examine here are
finite in structure and can be used to analyze relationships and applications in many differ-
ent settings. We have seen some examples of applications of graph theory in earlier
chapters (3, 5-8, and 10). However, the development here is independent of these prior dis-
cussions.
11.1
Definitions and Examples
When we use a road map, we are often concerned with seeing how to get from one town
to another by means of the roads indicated on the map. Consequently, we are dealing with
two distinct sets of objects: towns and roads. As we have seen many times before, such sets
of objects can be used to define a relation. If V denotes the set of towns and E the set of
roads, we can define a relation $\mathcal{R}$ on V by a $\mathcal{R}$ b if we can travel from a to b using only the
roads in E. If the roads in E that take us from a to b are all two-way roads, then we also
have b $\mathcal{R}$ a. Should all the roads under consideration be two-way, we have a symmetric
relation.
One way to represent a relation is by listing the ordered pairs that are its elements.
Here, however, it is more convenient to use a picture, as shown in Fig. 11.1. This figure
demonstrates the possible ways of traveling among six towns using the eight roads indicated.
It shows that there is at least one set of roads connecting any two towns (identical or distinct).
This pictorial representation is a lot easier to work with than the 36 ordered pairs of the
relation $\mathcal{R}$.
At the same time, Fig. 11.1 would be appropriate for representing six communication
centers, with the eight “roads” interpreted as communication links. If each link provides
two-way communication, we should be quite concerned about the vulnerability of center a
to such hazards as equipment breakdown or enemy attack. Without center a, neither b nor
c can communicate with any of d, e, or f.
From these observations we consider the following concepts.
Figure 11.1 Figure 11.2
Definition 11.1 Let V be a finite nonempty set, and let $E \subseteq V \times V$. The pair (V, E) is then called a directed
graph (on V), or digraph* (on V), where V is the set of vertices, or nodes, and E is its set
of (directed) edges or arcs. We write G = (V, E) to denote such a graph.
When there is no concern about the direction of any edge, we still write G = (V, E). But
now E is a set of unordered pairs of elements taken from V, and G is called an undirected
graph.
Whether G = (V, E) is directed or undirected, we often call V the vertex set of G and
E the edge set of G.
Figure 11.2 provides an example of a directed graph on V = {a, b, c, d, e} with E =
{(a, a), (a, b), (a, d), (b, c)}. The direction of an edge is indicated by placing a directed
arrow on the edge, as shown here. For any edge, such as (b, c), we say that the edge is
incident with the vertices b, c; b is said to be adjacent to c, whereas c is adjacent from b.
In addition, vertex b is called the origin, or source, of the edge (b, c), and vertex c is the
terminus, or terminating vertex. The edge (a, a) is an example of a loop, and the vertex e
that has no incident edges is called an isolated vertex.
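The directed graph of Fig. 11.2 is small enough to write down explicitly. The following is a minimal sketch, assuming vertices are represented as one-letter strings and directed edges as ordered pairs (tuples); the names `loops`, `isolated`, and `adjacent_to` are illustrative and do not come from the text.

```python
# Vertex set and edge set of the directed graph in Fig. 11.2.
V = {"a", "b", "c", "d", "e"}
E = {("a", "a"), ("a", "b"), ("a", "d"), ("b", "c")}

# E must be a subset of V x V, as Definition 11.1 requires.
is_valid = all(u in V and v in V for (u, v) in E)

# A loop is an edge whose origin and terminus coincide.
loops = {(u, v) for (u, v) in E if u == v}

# An isolated vertex is incident with no edge at all.
incident = {x for edge in E for x in edge}
isolated = V - incident

# adjacent_to[u] collects every v with (u, v) in E, so u is adjacent to v
# and v is adjacent from u.
adjacent_to = {u: {v for (x, v) in E if x == u} for u in V}

print(is_valid, loops, isolated, adjacent_to["b"])
```

Storing E as a set of tuples mirrors the definition $E \subseteq V \times V$ directly; for larger graphs an adjacency list such as `adjacent_to` is usually the more convenient working form.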
An undirected graph is shown in Fig. 11.3(a). This graph is a more compact way of
describing the directed graph given in Fig. 11.3(b). In an undirected graph, there are
undirected edges such as {a, b}, {b, c}, {a, c}, {c, d} in Fig. 11.3(a). An edge such as {a, b} stands
for {(a, b), (b, a)}. Although (a, b) = (b, a) only when a = b, we do have {a, b} = {b, a}
for any a, b. We can write {a, a} to denote a loop in an undirected graph, but {a, a} is
considered the same as (a, a).
Figure 11.3 (a), (b)
* Since the terminology of graph theory is not standard, the reader may find some differences between terms
used here and in other texts.
In general, if a graph G is not specified as directed or undirected, it is assumed to be
undirected. When it contains no loops it is called loop-free.
In the next two definitions we shall not concern ourselves with any loops that may be
present in the undirected graph G.
Definition 11.2 Let x, y be (not necessarily distinct) vertices in an undirected graph G = (V, E). An x-y
walk in G is a (loop-free) finite alternating sequence
$x = x_0, e_1, x_1, e_2, x_2, e_3, \ldots, e_{n-1}, x_{n-1}, e_n, x_n = y$
of vertices and edges from G, starting at vertex x and ending at vertex y and involving the
n edges $e_i = \{x_{i-1}, x_i\}$, where $1 \le i \le n$.
The length of this walk is n, the number of edges in the walk. (When n = 0, there are no
edges, x = y, and the walk is called trivial. These walks are not considered very much in
our work.)
Any x-y walk where x = y (and n > 1) is called a closed walk. Otherwise the walk is
called open.
Note that a walk may repeat both vertices and edges.
EXAMPLE 11.1 For the graph in Fig. 11.4 we find, for example, the following three open walks. We can list
the edges only or the vertices only (if the other is clearly implied).
1) {a, b}, {b, d}, {d, c}, {c, e}, {e, d}, {d, b}: This is an a-b walk of length 6 in which
we find the vertices d and b repeated, as well as the edge {b, d} (= {d, b}).
2) b → c → d → e → c → f: Here we have a b-f walk where the length is 5 and the
vertex c is repeated, but no edge appears more than once.
3) {f, c}, {c, e}, {e, d}, {d, a}: In this case the given f-a walk has length 4 with no
repetition of either vertices or edges.
Figure 11.4
Since the graph of Fig. 11.4 is undirected, the a-b walk in part (1) is also a b-a walk
(we read the edges, if necessary, as {b, d}, {d, e}, {e, c}, {c, d}, {d, b}, and {b, a}). Similar
remarks hold for the walks in parts (2) and (3).
Finally, the edges {b, c}, {c, d}, and {d, b} provide a b-b (closed) walk. These edges
(ordered appropriately) also define (closed) c-c and d-d walks.
Now let us examine special types of walks.
Definition 11.3 Consider any x-y walk in an undirected graph G = (V, E).
a) If no edge in the x-y walk is repeated, then the walk is called an x-y trail. A closed
x-x trail is called a circuit.
b) If no vertex of the x-y walk occurs more than once, then the walk is called an x-y
path. When x = y, the term cycle is used to describe such a closed path.
Convention: In dealing with circuits, we shall always understand the presence of at least
one edge. When there is only one edge, then the circuit is a loop (and the graph is no longer
loop-free). Circuits with two edges arise in multigraphs, a concept we shall define shortly.
The term cycle will always imply the presence of at least three distinct edges (from the
graph).
EXAMPLE 11.2
a) The b-f walk in part (2) of Example 11.1 is a b-f trail, but it is not a b-f path because
of the repetition of vertex c. However, the f-a walk in part (3) of that example is both
an f-a trail (of length 4) and an f-a path (of length 4).
b) In Fig. 11.4, the edges {a, b}, {b, d}, {d, c}, {c, e}, {e, d}, and {d, a} provide an a-a
circuit. The vertex d is repeated, so the edges do not give us an a-a cycle.
c) The edges {a, b}, {b, c}, {c, d}, and {d, a} provide an a-a cycle (of length 4) in
Fig. 11.4. When ordered appropriately these same edges may also define a b-b, c-c, or
d-d cycle. Each of these cycles is also a circuit.
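The distinctions in Definitions 11.2 and 11.3 can be checked mechanically for the walks of Examples 11.1 and 11.2. The following is only a sketch, assuming a walk in a loop-free undirected graph is given by its list of vertices (so the edges are the consecutive pairs) and that the listed pairs really are edges of Fig. 11.4; the function names `edges_of` and `classify` are illustrative.

```python
def edges_of(walk):
    """Undirected edges {x_{i-1}, x_i} traversed by a walk given as a vertex list."""
    return [frozenset(pair) for pair in zip(walk, walk[1:])]

def classify(walk):
    """Name the most specific type of a walk: path, trail, cycle, circuit, or walk."""
    edges = edges_of(walk)
    closed = len(walk) > 1 and walk[0] == walk[-1]
    edge_repeated = len(set(edges)) < len(edges)
    # In a closed walk the common start/end vertex is counted only once.
    inner = walk[:-1] if closed else walk
    vertex_repeated = len(set(inner)) < len(inner)
    if edge_repeated:
        return "closed walk" if closed else "open walk"
    if vertex_repeated:
        return "circuit" if closed else "trail"
    return "cycle" if closed else "path"

walk1 = list("abdcedb")    # Example 11.1, part (1)
walk2 = list("bcdecf")     # Example 11.1, part (2)
walk3 = list("fceda")      # Example 11.1, part (3)
circuit = list("abdceda")  # Example 11.2, part (b)
cycle = list("abcda")      # Example 11.2, part (c)

for w in (walk1, walk2, walk3, circuit, cycle):
    print("-".join(w), "is a", classify(w))
```

Representing each edge as a `frozenset` makes {b, d} and {d, b} compare equal, which is exactly the identification used in part (1) of Example 11.1.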
For a directed graph we shall use the adjective directed, as in, for example, directed
walks, directed paths, and directed cycles.
Before continuing, we summarize (in Table 11.1) for future reference the results of
Definitions 11.2 and 11.3. Each occurrence of “Yes” in the first two columns here should
be interpreted as “Yes, possibly.” Table 11.1 reflects the fact that a path is a trail, which in
turn is an open walk. Furthermore, every cycle is a circuit, and every circuit (with at least
two edges) is a closed walk.
Table 11.1
Repeated Vertex (Vertices)   Repeated Edge(s)   Open   Closed   Name
Yes                          Yes                Yes             Walk (open)
Yes                          Yes                       Yes      Walk (closed)
Yes                          No                 Yes             Trail
Yes                          No                        Yes      Circuit
No                           No                 Yes             Path
No                           No                        Yes      Cycle
Considering how many concepts we have introduced, it is time to prove a first result in
this new theory.
THEOREM 11.1 Let G = (V, E) be an undirected graph, with $a, b \in V$, $a \ne b$. If there exists a trail (in G)
from a to b, then there is a path (in G) from a to b.
Proof: Since there is a trail from a to b, we select one of shortest length, say {a, x;},
{x1, X2},..., {Xn, 5}. If this trail is not a path, we have the situation {a, x)}, {x1, x2},...,
{Xe—1, Xkbs (Xe. Xe bs Xka1s Meg2}, (m1, Xm}, (ms Xm4i}. -- +» {Xn, b}, where
k<m and x, =X», possibly with k = 0 and a (= 2X9) = Xm, or m=n+1 and x =
b (= Xni1). But then we have a contradiction because {a, x;}, (x), X2},..., {xn-1, xx},
{Xm,Xm+i},.--», {Xn, BD} is a shorter trail from a to b.
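Theorem 11.1 is proved by contradiction, but the shortening step used in the proof can also be carried out as a procedure. The Python sketch below (the name shortcut_to_path and the vertex-list representation are assumptions for illustration, not part of the text) cuts out the closed portion between two visits to the same vertex until no vertex is repeated.

```python
def shortcut_to_path(walk):
    """Turn an a-b walk (a list of vertices, a != b) into an a-b path."""
    path = []          # vertices kept so far, all distinct
    position = {}      # vertex -> its index in `path`
    for v in walk:
        if v in position:
            # v = x_k was visited earlier: discard x_(k+1), ..., x_(m-1),
            # exactly the edges removed in the proof of Theorem 11.1.
            for dropped in path[position[v] + 1:]:
                del position[dropped]
            del path[position[v] + 1:]
        else:
            position[v] = len(path)
            path.append(v)
    return path


# A hypothetical a-e trail that revisits vertex b.
print(shortcut_to_path(list("abcdbe")))   # ['a', 'b', 'e']
```

Every consecutive pair in the result is also a consecutive pair in the original walk, so the output uses only edges of the given trail.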
The notion of a path is needed in the following graph property.
Definition 11.4 Let G = (V, E) be an undirected graph. We call G connected if there is a path between
any two distinct vertices of G.
Let G = (V, E) be a directed graph. Its associated undirected graph is the graph obtained
from G by ignoring the directions on the edges. If more than one undirected edge results
for a pair of distinct vertices in G, then only one of these edges is drawn in the associated
undirected graph. When this associated graph is connected, we consider G connected.
A graph that is not connected is called disconnected.
The graphs in Figs. 11.1, 11.3, and 11.4 are connected. In Fig. 11.2 the graph is not
connected because, for example, there is no path from a to e.
EXAMPLE 11.3
In Fig. 11.5 we have an undirected graph on $V = \{a, b, c, d, e, f, g\}$. This graph is not
connected because, for example, there is no path from a to e. However, the graph is composed
of pieces (with vertex sets $V_1 = \{a, b, c, d\}$, $V_2 = \{e, f, g\}$, and edge sets
$E_1 = \{\{a, b\}, \{a, c\}, \{a, d\}, \{b, d\}\}$, $E_2 = \{\{e, f\}, \{f, g\}\}$) that are themselves connected, and
these pieces are called the (connected) components of the graph. Hence an undirected
graph G = (V, E) is disconnected if and only if V can be partitioned into at least two
subsets $V_1$, $V_2$ such that there is no edge in E of the form {x, y}, where $x \in V_1$ and $y \in V_2$.
A graph is connected if and only if it has only one component.
Figure 11.5
Definition 11.5 For any graph G = (V, E), the number of components of G is denoted by $\kappa(G)$.
EXAMPLE 11.4
For the graphs in Figs. 11.1, 11.3, and 11.4, $\kappa(G) = 1$ because these graphs are connected;
$\kappa(G) = 2$ for the graphs in Figs. 11.2 and 11.5.
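The number of components can be computed with a breadth-first search. The following Python sketch uses the vertex and edge sets listed in Example 11.3; the function name components and the adjacency-dictionary representation are assumptions made for this illustration.

```python
from collections import deque


def components(vertices, edges):
    """Return the vertex sets of the connected components of an undirected graph."""
    adj = {v: set() for v in vertices}
    for x, y in edges:
        adj[x].add(y)
        adj[y].add(x)

    seen, comps = set(), []
    for start in vertices:
        if start in seen:
            continue
        # Breadth-first search collects every vertex joined to `start` by a path.
        comp, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    comp.add(w)
                    queue.append(w)
        comps.append(comp)
    return comps


# The graph of Example 11.3 (Fig. 11.5).
V = "abcdefg"
E = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "d"), ("e", "f"), ("f", "g")]
comps = components(V, E)
kappa = len(comps)
print(kappa)   # 2
```

A graph is connected exactly when this count is 1.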
Before closing this first section, we extend our concept of a graph. Thus far we have
allowed at most one edge between two vertices; we now consider an extension.
Definition 11.6 Let V be a finite nonempty set. We say that the pair (V, E) determines a multigraph G with
vertex set V and edge set E† if, for some $x, y \in V$, there are two or more edges in E of the
form (a) (x, y) (for a directed multigraph), or (b) {x, y} (for an undirected multigraph). In
either case, we write G = (V, E) to designate the multigraph, just as we did for graphs.
Figure 11.6 shows an example of a directed multigraph. There are three edges from a to
b, so we say that the edge (a, b) has multiplicity 3. The edges (b, c) and (d, e) both have
multiplicity 2. Also, the edge (e, d) and either one of the edges (d, e) form a (directed)
circuit of length 2 in the multigraph.
Figure 11.6
We shall need the idea of a multigraph later in the chapter when we solve the problem
of the seven bridges of Königsberg. (Note: Whenever we are dealing with a multigraph G,
we shall state explicitly that G is a multigraph.)
†We now allow a set to have repeated elements in order to account for multiple edges. We realize that this is a
change from the way we dealt with sets in Chapter 3. To overcome this the term multiset is often used to describe
E in this case.
EXERCISES 11.1
1. List three situations, different from those in this section, where a graph could prove useful.
2. For the graph in Fig. 11.7, determine (a) a walk from b to d that is not a trail; (b) a b-d trail
that is not a path; (c) a path from b to d; (d) a closed walk from b to b that is not a circuit;
(e) a circuit from b to b that is not a cycle; and (f) a cycle from b to b.
Figure 11.7
3. For the graph in Fig. 11.7, how many paths are there from b to f?
4. For $n \geq 2$, let G = (V, E) be the loop-free undirected graph, where V is the set of binary
n-tuples (of 0's and 1's) and
$E = \{\{v, w\} \mid v, w \in V \text{ and } v, w \text{ differ in (exactly) two positions}\}$. Find $\kappa(G)$.
5. Let G = (V, E) be the undirected graph in Fig. 11.8. How many paths are there in G from a to h?
How many of these paths have length 5?
Figure 11.8
6. If a, b are distinct vertices in a connected undirected graph G, the distance from a to b is
defined to be the length of a shortest path from a to b (when a = b the distance is defined to be
0). For the graph in Fig. 11.9, find the distances from d to (each of) the other vertices in G.
Figure 11.9
7. Seven towns a, b, c, d, e, f, and g are connected by a system of highways as follows: (1) I-22
goes from a to c, passing through b; (2) I-33 goes from c to d and then passes through b as it
continues to f; (3) I-44 goes from d through e to a; (4) I-55 goes from f to b, passing through g;
and (5) I-66 goes from g to d.
a) Using vertices for towns and directed edges for segments of highways between towns, draw a
directed graph that models this situation.
b) List the paths from g to a.
c) What is the smallest number of highway segments that would have to be closed down in order
for travel from b to d to be disrupted?
d) Is it possible to leave town c and return there, visiting each of the other towns only once?
e) What is the answer to part (d) if we are not required to return to c?
f) Is it possible to start at some town and drive over each of these highways exactly once? (You
are allowed to visit a town more than once, and you need not return to the town from which you
started.)
8. Figure 11.10 shows an undirected graph representing a section of a department store. The
vertices indicate where cashiers are located; the edges denote unblocked aisles between cashiers.
The department store wants to set up a security system where (plainclothes) guards are placed at
certain cashier locations so that each cashier either has a guard at his or her location or is
only one aisle away from a cashier who has a guard. What is the smallest number of guards needed?
Figure 11.10
9. Let G = (V, E) be a loop-free connected undirected graph, and let {a, b} be an edge of G. Prove
that {a, b} is part of a cycle if and only if its removal (the vertices a and b are left) does not
disconnect G.
10. Give an example of a connected graph G where removing any edge of G results in a disconnected
graph.
11. Let G be a graph that satisfies the condition in Exercise 10. (a) Must G be loop-free?
(b) Could G be a multigraph? (c) If G has n vertices, can we determine how many edges it has?
12. a) If G = (V, E) is an undirected graph with |V| = v, |E| = e, and no loops, prove that
$2e \leq v^2 - v$.
b) State the corresponding inequality for the case when G is directed.
13. Let G = (V, E) be an undirected graph. Define a relation $\mathcal{R}$ on V by $a \mathcal{R} b$ if a = b or if
there is a path in G from a to b. Prove that $\mathcal{R}$ is an equivalence relation. Describe the partition
of V induced by $\mathcal{R}$.
14. a) Consider the three connected undirected graphs in Fig. 11.11. The graph in part (a) of the
figure consists of a cycle (on the vertices $u_1$, $u_2$, $u_3$) and a vertex $u_4$ with edges (spokes)
drawn from $u_4$ to the other three vertices. This graph is called the wheel with three spokes and is
denoted by $W_3$. In part (b) of the figure we find the graph $W_4$, the wheel with four spokes. The
wheel $W_5$ with five spokes appears in Fig. 11.11(c). Determine how many cycles of length 4 there
are in each of these graphs.
b) In general, if $n \in \mathbf{Z}^+$ and $n \geq 3$, then the wheel with n spokes is the graph made up of a
cycle of length n together with an additional vertex that is adjacent to the n vertices of the
cycle. The graph is denoted by $W_n$. (i) How many cycles of length 4 are there in $W_n$? (ii) How
many cycles in $W_n$ have length n?
Figure 11.11: (a) $W_3$, (b) $W_4$, (c) $W_5$
15. For the undirected graph in Fig. 11.12, find and solve a recurrence relation for the number of
closed v-v walks of length $n \geq 1$, if we allow such a walk, in this case, to contain or consist
of one or more loops.
Figure 11.12
16. Unit-Interval Graphs. For $n \geq 1$, we start with n closed intervals of unit length and draw the
corresponding unit-interval graph on n vertices, as shown in Fig. 11.13. In part (a) of the
figure we have one unit interval. This corresponds to the single vertex u; both the interval and
the unit-interval graph can be represented by the binary sequence 01. In parts (b), (c) of the
figure we have the two unit-interval graphs determined by two unit intervals. When two unit
intervals overlap [as in part (c)] an edge is drawn in the unit-interval graph joining the vertices
corresponding to these unit intervals. Hence the unit-interval graph in part (b) consists of the
two isolated vertices $v_1$, $v_2$ that correspond with the nonoverlapping unit intervals. In part (c)
the unit intervals overlap so the corresponding unit-interval graph consists of a single edge
joining the vertices $v_1$, $v_2$ (that correspond to the given unit intervals). A closer look at the
unit intervals in part (c) reveals how we can represent the positioning of these intervals and the
corresponding unit-interval graph by the binary sequence 0011. In parts (d)-(f) of the figure we
have three of the unit-interval graphs for three unit intervals, together with their corresponding
binary sequences.
a) How many other unit-interval graphs are there for three unit intervals? What are the
corresponding binary sequences for these graphs?
b) How many unit-interval graphs are there for four unit intervals?
c) For $n \geq 1$, how many unit-interval graphs are there for n unit intervals?
Figure 11.13: (a) 01, (b), (c) 0011, (d) 000111, (e) 001101, (f) 010011
11.2 Subgraphs, Complements, and Graph Isomorphism
In this section we shall focus on the following two ideas:
a) What types of substructures are present in a graph?
b) Is it possible to draw two graphs that appear distinct but have the same underlying
structure?
To answer the question in part (a) we introduce the following definition.
Definition 11.7 If G = (V, E) is a graph (directed or undirected), then $G_1 = (V_1, E_1)$ is called a subgraph
of G if $\emptyset \neq V_1 \subseteq V$ and $E_1 \subseteq E$, where each edge in $E_1$ is incident with vertices in $V_1$.
Figure 11.14(a) provides us with an undirected graph G and two of its subgraphs, $G_1$ and
$G_2$. The vertices a, b are isolated in subgraph $G_1$. Part (b) of the figure provides a directed
example. Here vertex w is isolated in the subgraph $G'$.
Figure 11.14: (a) G, $G_1$, $G_2$; (b) G, $G'$
Certain special types of subgraphs arise as follows:
Definition 11.8 Given a (directed or undirected) graph G = (V, E), let $G_1 = (V_1, E_1)$ be a subgraph of G.
If $V_1 = V$, then $G_1$ is called a spanning subgraph of G.
In part (a) of Fig. 11.14 neither $G_1$ nor $G_2$ is a spanning subgraph of G. The subgraphs
$G_3$ and $G_4$, shown in part (a) of Fig. 11.15, are both spanning subgraphs of G. The
directed graph $G'$ in part (b) of Fig. 11.14 is a subgraph, but not a spanning subgraph, of
the directed graph G given in that part of the figure. In part (b) of Fig. 11.15 the directed
graphs $G''$ and $G'''$ are two of the $2^4 = 16$ possible spanning subgraphs.
Figure 11.15: (a) $G_3$, $G_4$; (b) $G''$, $G'''$
Definition 11.9 Let G = (V, E) be a graph (directed or undirected). If $\emptyset \neq U \subseteq V$, the subgraph of G
induced by U is the subgraph whose vertex set is U and which contains all edges (from G)
of either the form (a) (x, y), for $x, y \in U$ (when G is directed), or (b) {x, y}, for $x, y \in U$
(when G is undirected). We denote this subgraph by $\langle U \rangle$.
A subgraph $G'$ of a graph G = (V, E) is called an induced subgraph if there exists
$\emptyset \neq U \subseteq V$, where $G' = \langle U \rangle$.
For the subgraphs in Fig. 11.14(a), we find that $G_2$ is an induced subgraph of G but the
subgraph $G_1$ is not an induced subgraph because edge {a, d} is missing.
EXAMPLE 11.5
Let G = (V, E) denote the graph in Fig. 11.16(a). The subgraphs in parts (b) and (c) of the
figure are induced subgraphs of G. For the connected subgraph in part (b), $G_1 = \langle U_1 \rangle$ for
$U_1 = \{b, c, d, e\}$. In like manner, the disconnected subgraph in part (c) is $G_2 = \langle U_2 \rangle$ for
$U_2 = \{a, b, e, f\}$. Finally, $G_3$ in part (d) of Fig. 11.16 is a subgraph of G. But it is not an
induced subgraph; the vertices c, e are in $G_3$, but the edge {c, e} (of G) is not present.
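Definitions 11.7 through 11.9 translate directly into set comparisons. The sketch below is one possible rendering in Python for loop-free undirected graphs; the function names and the small test graph are hypothetical (the edges of Figs. 11.14 and 11.16 are not reproduced here).

```python
def edge_set(edges):
    """Normalize undirected edges so that {x, y} and {y, x} compare equal."""
    return {frozenset(e) for e in edges}


def is_subgraph(V1, E1, V, E):
    """Definition 11.7: V1 nonempty, V1 and E1 are subsets, edges of E1 lie inside V1."""
    V1, V, E1, E = set(V1), set(V), edge_set(E1), edge_set(E)
    return bool(V1) and V1 <= V and E1 <= E and all(e <= V1 for e in E1)


def is_spanning(V1, E1, V, E):
    """Definition 11.8: a subgraph that keeps every vertex."""
    return is_subgraph(V1, E1, V, E) and set(V1) == set(V)


def is_induced(V1, E1, V, E):
    """Definition 11.9: a subgraph that keeps every edge of G joining vertices of V1."""
    U = set(V1)
    induced = {e for e in edge_set(E) if e <= U}
    return is_subgraph(V1, E1, V, E) and edge_set(E1) == induced


# Hypothetical graph: a 4-cycle a-b-c-d-a with the chord {a, c}.
V = "abcd"
E = ["ab", "bc", "cd", "da", "ac"]

print(is_induced("abc", ["ab", "bc"], V, E))         # False: edge {a, c} is missing
print(is_induced("abc", ["ab", "bc", "ac"], V, E))   # True
print(is_spanning("abcd", ["ab"], V, E))             # True
```

As in Example 11.5, a subgraph fails to be induced exactly when some edge of G between two of its vertices has been left out.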
Figure 11.16: (a) G, (b) $G_1$, (c) $G_2$, (d) $G_3$
Another special type of subgraph comes about when a certain vertex or edge is deleted
from the given graph. We formalize these ideas in the following definition.
Definition 11.10 Let v be a vertex in a directed or an undirected graph G = (V, E). The subgraph of G
denoted by $G - v$ has the vertex set $V_1 = V - \{v\}$ and the edge set $E_1 \subseteq E$, where $E_1$
contains all the edges in E except for those that are incident with the vertex v. (Hence
$G - v$ is the subgraph of G induced by $V_1$.)
In a similar way, if e is an edge of a directed or an undirected graph G = (V, E), we
obtain the subgraph $G - e = (V_1, E_1)$ of G, where the set of edges $E_1 = E - \{e\}$, and the
vertex set is unchanged (that is, $V_1 = V$).
EXAMPLE 11.6
Let G = (V, E) be the undirected graph in Fig. 11.17(a). Part (b) of this figure is the
subgraph $G_1$ (of G), where $G_1 = G - c$. It is also the subgraph of G induced by the set
of vertices $U_1 = \{a, b, d, f, g, h\}$, so $G_1 = \langle V - \{c\} \rangle = \langle U_1 \rangle$. In part (c) of Fig. 11.17
we find the subgraph $G_2$ of G, where $G_2 = G - e$ for e the edge {c, d}. The result in
Fig. 11.17(d) shows how the ideas in Definition 11.10 can be extended to the deletion of
more than one vertex (edge). We may represent this subgraph of G as
$G_3 = (G - b) - f = (G - f) - b = G - \{b, f\} = \langle U_3 \rangle$, for $U_3 = \{a, c, d, g, h\}$.
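The deletions of Definition 11.10 can be sketched in a few lines of Python. The function names and the graph below are hypothetical; the vertex set matches the one named in Example 11.6, but the edges of Fig. 11.17 are not available here, so the edge list is an assumption chosen only to exercise the code.

```python
def delete_vertex(V, E, v):
    """Return G - v: drop v and every edge incident with v."""
    return set(V) - {v}, {frozenset(e) for e in E if v not in e}


def delete_edge(V, E, e):
    """Return G - e: the vertex set is unchanged, only the edge e is removed."""
    return set(V), {frozenset(x) for x in E} - {frozenset(e)}


# Hypothetical undirected graph (assumed edges, not those of Fig. 11.17).
V = "abcdfgh"
E = ["ab", "bc", "cd", "df", "fg", "gh", "ha", "bf"]

# Deleting b and f in either order gives the same subgraph, as in Example 11.6.
G_minus_b_f = delete_vertex(*delete_vertex(V, E, "b"), "f")
G_minus_f_b = delete_vertex(*delete_vertex(V, E, "f"), "b")
print(G_minus_b_f == G_minus_f_b)   # True
```

The result of the two vertex deletions is also the subgraph induced by the remaining vertices, which is the content of the parenthetical remark in Definition 11.10.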
Figure 11.17: (a) G, (b) $G_1$, (c) $G_2$, (d) $G_3$
The idea of a subgraph gives us a way to develop the complement of an undirected
loop-free graph. Before doing so, however, we define a type of graph that is maximal in
size for a given number of vertices.
Definition 11.11 Let V be a set of n vertices. The complete graph on V, denoted $K_n$, is a loop-free undirected
graph, where for all $a, b \in V$, $a \neq b$, there is an edge {a, b}.
Figure 11.18 provides the complete graphs $K_n$, for $1 \leq n \leq 4$. We shall realize, when we
examine the idea of graph isomorphism, that these are the only possible complete graphs
for the given number of vertices.
Figure 11.18: $K_1$, $K_2$, $K_3$, $K_4$
In determining the complement of a set in Chapter 3, we needed to know the universal
set under consideration. The complete graph plays a role similar to a universal set.
Definition 11.12 Let G be a loop-free undirected graph on n vertices. The complement of G, denoted $\overline{G}$, is
the subgraph of $K_n$ consisting of the n vertices in G and all edges that are not in G. (If
$G = K_n$, $\overline{G}$ is a graph consisting of n vertices and no edges. Such a graph is called a null
graph.)
Figure 11.19(a) shows an undirected graph on four vertices. Its complement is shown in
part (b) of the figure. In the complement, vertex a is isolated.
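A short Python sketch of Definition 11.12 follows. The function name complement and the four-vertex graph are assumptions for illustration (the edges of Fig. 11.19 are not reproduced here); the example is chosen so that vertex a is adjacent to every other vertex and is therefore isolated in the complement.

```python
from itertools import combinations


def complement(V, E):
    """Edges of the complement: all pairs of K_n that are not edges of G."""
    complete = {frozenset(p) for p in combinations(V, 2)}   # edge set of K_n
    return complete - {frozenset(e) for e in E}


# Hypothetical loop-free undirected graph on four vertices.
V = "abcd"
E = ["ab", "ac", "ad", "bc"]
E_bar = complement(V, E)
print(sorted("".join(sorted(e)) for e in E_bar))   # ['bd', 'cd']
```

Taking the complement twice returns the original edge set, and the complement of $K_n$ has no edges, which is the null graph mentioned in the definition.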
Once again we have reached a point where many new ideas have been defined. To
demonstrate why some of these ideas are important, we apply them now to the solution of
an interesting puzzle.
Figure 11.19: (a), (b)
EXAMPLE 11.7
Instant Insanity. The game of Instant Insanity is played with four cubes. Each of the six
faces on a cube is painted with one of the colors red (R), white (W), blue (B), or yellow (Y).
The object of the game is to place the cubes in a column of four such that all four (different)
colors appear on each of the four sides of the column.
Consider the cubes in Fig. 11.20 and number them as shown. (These cubes are only one
example of this game. Many others exist.) First we shall estimate the number of arrangements
that are possible here. If we wish to place cube 1 at the bottom of the column, there
are at most three different ways in which we can do this. In Fig. 11.20 cube 1 is unfolded,
and we see that it makes no difference whether we place the red face on the table or the
opposite white face on the table. We are concerned only with the other four faces at the
base of our column. With three pairs of opposite faces there will be at most three ways
to place the first cube for the base of the column. Now consider cube 2. Although some
colors are repeated, no pair of opposite faces has the same color. Hence we have six ways
to place the second cube on top of the first. We can then rotate the second cube without
changing either the face on the top of the first cube or the face on the bottom of the second
cube. With four possible rotations we may place the second cube on top of the first in as
many as 24 different ways. Continuing the argument, we find that there can be as many as
(3)(24) (24) (24) = 41,472 possibilities to consider. And there may not even be a solution!
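Figure 11.20 (the four cubes, unfolded; for each cube: top face, side strip of four faces, bottom face)
(1) top Y; strip W, R, Y, W; bottom B
(2) top R; strip B, B, W, Y; bottom Y
(3) top R; strip R, B, Y, B; bottom W
(4) top W; strip W, R, B, Y; bottom W
Figure 11.21 (labeled multigraph on the vertices R, W, B, Y)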
In solving this puzzle we realize that it is difficult to keep track of (1) colors on opposite
faces of cubes and (2) columns of colors. A graph (actually a labeled multigraph) helps us
to visualize the situation. In Fig. 11.21 we have a graph on four vertices R, W, B, and Y.
As we consider each cube, we examine its three pairs of opposite faces. For example, cube
1 has a pair of opposite faces painted yellow and blue, so we draw an edge connecting Y
and B and label it 1 (for cube 1). The other two edges in the figure that are labeled with 1
account for the pairs of opposite faces that are white and yellow, and red and white. Doing
likewise for the other cubes, we arrive at the graph in the figure. A loop, such as the one at
B, with label 3, indicates a pair of opposite faces with the same color (for cube 3).
In the graph we see a total of 12 edges falling into four sets of 3, according to the labels
for the cubes. At each vertex the number of edges incident to (or from) the vertex counts
the number of faces on the four cubes that have that color. (We count a loop twice.) Hence
Fig. 11.21 tells us that for our four cubes we have five red faces, seven white ones, six blue
ones, and six that are yellow.
With the four cubes stacked in a column, we examine two opposite sides of the column.
This arrangement gives us four edges in the graph of Fig. 11.21, where each label appears
once. Since each color is to appear only once on a side of the column, each color must
appear twice as an endpoint of these four edges. If we can accomplish the same result for
the other two sides of the column, we have solved the puzzle. In Fig. 11.22(a) we see that
each side in one pair of opposite sides of our column has the four colors if the cubes are
arranged according to the information provided by the subgraph shown there. However, to
accomplish this for the other two sides of the column also, we need a second such subgraph
that doesn’t use any edge in part (a). In this case a second such subgraph does exist, as
shown in part (b) of the figure.
(a) (b)
Figure 11.22
Figure 11.23 shows how to arrange the cubes as indicated by the subgraphs in Fig. 11.22.
Figure 11.23 (cubes (1) to (4), unfolded, in the solved arrangement)
In general, for any four cubes we construct a labeled multigraph and try to find two
subgraphs where (1) each subgraph contains all four vertices, and four edges, one for each
label; (2) in each subgraph, each vertex is incident with exactly two edges (a loop is counted
twice); and (3) no (labeled) edge of the labeled multigraph appears in both subgraphs.
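The three conditions above can be searched by brute force. The Python sketch below is only an illustration: the opposite-face pairs are an assumed reading of the unfolded cubes in Fig. 11.20 (the pairs for cube 1 and the loop at B for cube 3 agree with the text, the rest come from the figure), and all names are hypothetical. A choice is a 4-tuple giving, for each cube, which of its three opposite-face pairs supplies the edge.

```python
from collections import Counter
from itertools import product

COLORS = "RWBY"
# Opposite-face pairs of cubes 1..4 (assumed reading of Fig. 11.20).
CUBES = [
    [("Y", "B"), ("W", "Y"), ("R", "W")],
    [("R", "Y"), ("B", "W"), ("B", "Y")],
    [("R", "W"), ("R", "Y"), ("B", "B")],
    [("W", "W"), ("W", "B"), ("R", "Y")],
]


def is_two_regular(choice):
    """Conditions (1) and (2): one edge per cube, every color of degree exactly 2."""
    deg = Counter()
    for cube, idx in zip(CUBES, choice):
        x, y = cube[idx]
        deg[x] += 1
        deg[y] += 1          # a loop (x == y) is counted twice
    return all(deg[c] == 2 for c in COLORS)


candidates = [c for c in product(range(3), repeat=4) if is_two_regular(c)]

# Condition (3): the two subgraphs use a different pair of faces on every cube.
solutions = [(s, t) for s in candidates for t in candidates
             if s < t and all(i != j for i, j in zip(s, t))]

# Degree of each color in the full multigraph = number of faces of that color.
face_counts = Counter(color for cube in CUBES for pair in cube for color in pair)

print(dict(face_counts))
print(solutions)
```

Under the assumed cube data the face counts come out as five red, seven white, six blue, and six yellow, as stated in the example, and the search finds a single pair of edge-disjoint subgraphs.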
Now we turn to the second question posed at the start of the section.
Parts (a) and (b) of Fig. 11.24 show two undirected graphs on four vertices. Since straight
edges and curved edges are considered the same here, each graph represents six adjacent
pairs of vertices. In fact, we probably feel that these graphs are both examples of the graph
K4. We make this feeling mathematically rigorous in the following definition.
Figure 11.24: (a) graph on a, b, c, d; (b) graph on w, x, y, z; (c) graph on m, n, p, q; (d) graph on r, s, t, u
Definition 11.13 Let $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$ be two undirected graphs. A function $f: V_1 \to V_2$
is called a graph isomorphism if (a) f is one-to-one and onto, and (b) for all $a, b \in V_1$,
$\{a, b\} \in E_1$ if and only if $\{f(a), f(b)\} \in E_2$. When such a function exists, $G_1$ and $G_2$ are
called isomorphic graphs.
The vertex correspondence of a graph isomorphism preserves adjacencies. Since which
pairs of vertices are adjacent and which are not is the only essential property of an undirected
graph, in this way the structure of the graphs is preserved.
For the graphs in parts (a) and (b) of Fig. 11.24 the function f defined by
f(a) = w, f(b) = x, f(c) = y, f(d) = z
provides an isomorphism. [In fact, any one-to-one correspondence between {a, b, c, d} and
{w, x, y, z} will be an isomorphism because both of the given graphs are complete graphs.
This would also be true if each of the given graphs had only four isolated vertices (and no
edges).] Consequently, as far as (graph) structure is concerned, these graphs are considered
the same — each is (isomorphic to) the complete graph K4.
For the graphs in parts (c) and (d) of Fig. 11.24 we need to be a little more careful. The
function g defined by
g(m) = r, g(n) = s, g(p) = t, g(q) = u
is one-to-one and onto (for the given vertex sets). However, although {m, q} is an edge in the
graph of part (c), {g(m), g(q)} = {r, u} is not an edge in the graph of part (d). Consequently,
the function g does not define a graph isomorphism. To maintain the correspondence of
edges, we consider the one-to-one onto function h where
h(m) = s, h(n) = r, h(p) = u, h(q) = t.
In this case we have the edge correspondences
$\{m, n\} \leftrightarrow \{h(m), h(n)\} = \{s, r\}$, $\{n, q\} \leftrightarrow \{h(n), h(q)\} = \{r, t\}$,
$\{m, p\} \leftrightarrow \{h(m), h(p)\} = \{s, u\}$, $\{p, q\} \leftrightarrow \{h(p), h(q)\} = \{u, t\}$,
$\{m, q\} \leftrightarrow \{h(m), h(q)\} = \{s, t\}$,
so h is a graph isomorphism. [We also notice how, for example, the cycle $m \to n \to q \to m$
corresponds with the cycle $s\ (= h(m)) \to r\ (= h(n)) \to t\ (= h(q)) \to s\ (= h(m))$.]
Finally, since the graph in part (a) of Fig. 11.24 has six edges and that in part (c) has
only five edges, these two graphs cannot be isomorphic.
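The checks made above for g and h can be mechanized. The following Python sketch tests a proposed vertex map against Definition 11.13 and, for small graphs, searches all bijections. The edge lists for parts (c) and (d) of Fig. 11.24 are taken from the edge correspondences listed in the text; the function names are assumptions, and the exhaustive search is only practical for a handful of vertices.

```python
from itertools import combinations, permutations


def edge_set(edges):
    return {frozenset(e) for e in edges}


def is_isomorphism(f, V1, E1, V2, E2):
    """Definition 11.13: f is a bijection V1 -> V2 that maps E1 exactly onto E2."""
    if sorted(f) != sorted(V1) or sorted(f.values()) != sorted(V2):
        return False
    image = {frozenset(f[x] for x in e) for e in edge_set(E1)}
    return image == edge_set(E2)


def find_isomorphism(V1, E1, V2, E2):
    """Try every bijection; return one isomorphism or None."""
    if len(V1) != len(V2) or len(edge_set(E1)) != len(edge_set(E2)):
        return None
    for perm in permutations(V2):
        f = dict(zip(V1, perm))
        if is_isomorphism(f, V1, E1, V2, E2):
            return f
    return None


Vc, Ec = "mnpq", ["mn", "nq", "mp", "pq", "mq"]     # Fig. 11.24(c)
Vd, Ed = "rstu", ["sr", "rt", "su", "ut", "st"]     # Fig. 11.24(d)
g = {"m": "r", "n": "s", "p": "t", "q": "u"}
h = {"m": "s", "n": "r", "p": "u", "q": "t"}
K4 = list(combinations("abcd", 2))                  # Fig. 11.24(a)

print(is_isomorphism(g, Vc, Ec, Vd, Ed))            # False: {m, q} maps to {r, u}
print(is_isomorphism(h, Vc, Ec, Vd, Ed))            # True
print(find_isomorphism("abcd", K4, Vc, Ec))         # None: six edges versus five
```

Because f is a bijection, comparing the image of the edge set with the second edge set is equivalent to the "if and only if" condition in part (b) of the definition.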
Now let us examine the idea of graph isomorphism in a more difficult situation.
EXAMPLE 11.8
In Fig. 11.25 we have two graphs, each on ten vertices. Unlike the graphs in Fig. 11.24, it
is not immediately apparent whether or not these graphs are isomorphic.
Figure 11.25: (a), (b)
One finds that the correspondence given by
$a \to q$, $c \to u$, $e \to r$, $g \to x$, $i \to z$,
$b \to v$, $d \to y$, $f \to w$, $h \to t$, $j \to s$
preserves all adjacencies. For example, {f, h} is an edge in graph (a) with {w, t} the
corresponding edge in graph (b). But how did we come up with the correspondence? The
following discussion provides some clues.
We note that because an isomorphism preserves adjacencies, it preserves graph
substructures such as paths and cycles. In graph (a) the edges {a, f}, {f, i}, {i, d}, {d, e},
and {e, a} constitute a cycle of length 5. Hence we must preserve this as we try to find an
isomorphism. One possibility for the corresponding edges in graph (b) is {q, w}, {w, z},
{z, y}, {y, r}, and {r, q}, which also provides a cycle of length 5. (A second possible
choice is given by the edges in the cycle $y \to r \to s \to t \to u \to y$.) In addition,
starting at vertex a in graph (a), we find a path that will "visit" each vertex only once. We
express this path by $a \to f \to h \to c \to b \to g \to j \to e \to d \to i$. For the graphs to be
isomorphic there must be a corresponding path in graph (b). Here the path described by
$q \to w \to t \to u \to v \to x \to s \to r \to y \to z$ is the counterpart.
These are some of the ideas we can use to try to develop an isomorphism and determine
whether two graphs are isomorphic. Other considerations will be discussed throughout
the chapter. However, there is no simple, foolproof method, especially when we are
confronted with larger graphs $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$, where $|V_1| = |V_2|$ and
$|E_1| = |E_2|$.
We close this section with one more example involving graph isomorphism.
EXAMPLE 11.9
Each of the two graphs in Fig. 11.26 has six vertices and nine edges. Therefore it is
reasonable to ask whether they are isomorphic.
In graph (a), vertex a is adjacent to two other vertices of the graph. Consequently, if
we try to construct an isomorphism between these graphs, we should associate vertex a
with a comparable vertex in graph (b), say vertex u. A similar situation exists for vertex d
and either vertex x or vertex z. But no matter which of the vertices x or z we use, there
remains one vertex in graph (b) that is adjacent to two other vertices. And there is no other
such vertex in graph (a) to continue our one-to-one structure preserving correspondence.
Consequently, these graphs are not isomorphic.
Furthermore, in graph (b) it is possible to start at any vertex and find a circuit that includes
every edge of the graph. For example, if we start at vertex u, the circuit
$u \to w \to v \to y \to w \to z \to y \to x \to v \to u$ exhibits this property. This does not happen in graph (a)
where the only trails that include each edge start at either b or f and then terminate at f or
b, respectively.
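The circuit quoted for graph (b) can be checked mechanically. Since it is said to include every one of the nine edges, the edge list of graph (b) can be read off from it; the Python sketch below does so and tallies how many edges meet each vertex. The variable names are assumptions for illustration.

```python
from collections import Counter

# The circuit given in Example 11.9 for graph (b), as a vertex sequence.
circuit = list("uwvywzyxvu")
edges = [frozenset(pair) for pair in zip(circuit, circuit[1:])]

is_closed = circuit[0] == circuit[-1]
no_repeated_edge = len(set(edges)) == len(edges)

# Number of edges incident with each vertex of graph (b).
degree = Counter(v for e in edges for v in e)

print(len(edges), is_closed, no_repeated_edge)   # 9 True True
print(sorted(degree.items()))
```

The tally shows three vertices (u, x, and z) that are adjacent to exactly two others, matching the argument in the example, and every vertex of graph (b) meets an even number of edges.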
(a) (b)
Figure 11.26
EXERCISES 11.2
1. Let G be the undirected graph in Fig. 11.27(a).
a) How many connected subgraphs of G have four vertices and include a cycle?
b) Describe the subgraph $G_1$ (of G) in part (b) of the figure first, as an induced subgraph and
second, in terms of deleting a vertex of G.
c) Describe the subgraph $G_2$ (of G) in part (c) of the figure first, as an induced subgraph and
second, in terms of the deletion of vertices of G.
d) Draw the subgraph of G induced by the set of vertices U = {b, c, d, f, i, j}.
e) For the graph G, let the edge e = {c, f}. Draw the subgraph $G - e$.
Figure 11.27: (a) G, (b) $G_1$, (c) $G_2$
2. a) Let G = (V, E) be an undirected graph, with $G_1 = (V_1, E_1)$ a subgraph of G. Under what
condition(s) is $G_1$ not an induced subgraph of G?
b) For the graph G in Fig. 11.27(a), find a subgraph that is not an induced subgraph.
3. a) How many spanning subgraphs are there for the graph G in Fig. 11.27(a)?
b) How many connected spanning subgraphs are there in part (a)?
c) How many of the spanning subgraphs in part (a) have vertex a as an isolated vertex?
4. If G = (V, E) is an undirected graph, how many spanning subgraphs of G are also induced
subgraphs?
5. Let G = (V, E) be an undirected graph, where $|V| \geq 2$. If every induced subgraph of G is
connected, can we identify the graph G?
6. Find all (loop-free) nonisomorphic undirected graphs with four vertices. How many of these
graphs are connected?
7. Each of the labeled multigraphs in Fig. 11.28 arises in the analysis of a set of four blocks for
the game of Instant Insanity. In each case determine a solution to the puzzle, if possible.
Figure 11.28
8. a) How many paths of length 4 are there in the complete graph $K_7$? (Remember that a path such
as $v_1 \to v_2 \to v_3 \to v_4 \to v_5$ is considered to be the same as the path
$v_5 \to v_4 \to v_3 \to v_2 \to v_1$.)
b) Let $m, n \in \mathbf{Z}^+$ with $m < n$. How many paths of length m are there in the complete graph $K_n$?
9. For each pair of graphs in Fig. 11.29, determine whether or not the graphs are isomorphic.
Figure 11.29: (a), (b)
10. Let G be an undirected (loop-free) graph with v vertices and e edges. How many edges are there
in $\overline{G}$?
11. a) If $G_1$, $G_2$ are (loop-free) undirected graphs, prove that $G_1$, $G_2$ are isomorphic if and
only if $\overline{G_1}$, $\overline{G_2}$ are isomorphic.
b) Determine whether the graphs in Fig. 11.30 are isomorphic.
Figure 11.30
12. a) Let G be an undirected graph with n vertices. If G is isomorphic to its own complement $\overline{G}$,
how many edges must G have? (Such a graph is called self-complementary.)
b) Find an example of a self-complementary graph on four vertices and one on five vertices.
c) If G is a self-complementary graph on n vertices, where $n > 1$, prove that $n = 4k$ or
$n = 4k + 1$, for some $k \in \mathbf{Z}^+$.
13. Let G be a cycle on n vertices. Prove that G is self-complementary if and only if n = 5.
14. a) Find a graph G where both G and $\overline{G}$ are connected.
b) If G is a graph on n vertices, for $n \geq 2$, and G is not connected, prove that $\overline{G}$ is connected.
15. a) Extend Definition 11.13 to directed graphs.
b) Determine whether the directed graphs in Fig. 11.31 are isomorphic.
Figure 11.31
16. a) How many subgraphs H = (V, E) of $K_6$ satisfy |V| = 3? (If two subgraphs are isomorphic but
have different vertex sets, consider them distinct.)
b) How many subgraphs H = (V, E) of $K_6$ satisfy |V| = 4?
c) How many subgraphs does $K_6$ have?
d) For $n \geq 3$, how many subgraphs does $K_n$ have?
17. Let v, w be two vertices in $K_n$, $n \geq 3$. How many walks of length 3 are there from v to w?
11.3 Vertex Degree: Euler Trails and Circuits
In Example 11.9 the number of edges incident with a vertex was used to show that two
undirected graphs were not isomorphic. We now find this idea even more helpful.
Definition 11.14 Let G be an undirected graph or multigraph. For each vertex v of G, the degree of v, written
deg(v), is the number of edges in G that are incident with v. Here a loop at a vertex v is
considered as two incident edges for v.
EXAMPLE 11.10
For the graph in Fig. 11.32, deg(b) = deg(d) = deg(f) = deg(g) = 2, deg(c) = 4,
deg(e) = 0, and deg(h) = 1. For vertex a we have deg(a) = 3 because we count a loop
twice. Since h has degree 1, it is called a pendant vertex.
Figure 11.32
Using the idea of vertex degree, we have the following result.
THEOREM 11.2 If G = (V, E) is an undirected graph or multigraph, then $\sum_{v \in V} \deg(v) = 2|E|$.
Proof: As we consider each edge {a, b} in graph G, we find that the edge contributes a count
of 1 to each of deg(a), deg(b), and consequently a count of 2 to $\sum_{v \in V} \deg(v)$. Thus $2|E|$
accounts for deg(v), for all $v \in V$, and $\sum_{v \in V} \deg(v) = 2|E|$.
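Theorem 11.2 is easy to confirm numerically. In the Python sketch below the edge list is hypothetical: Fig. 11.32 is not reproduced here, so the edges were chosen only so that the degrees agree with those listed in Example 11.10 (including the loop at a). The function name degrees is likewise an assumption.

```python
from collections import Counter


def degrees(vertices, edges):
    """Degree of every vertex; a loop (x, x) adds 2 to deg(x)."""
    deg = Counter({v: 0 for v in vertices})
    for x, y in edges:
        deg[x] += 1
        deg[y] += 1
    return deg


V = "abcdefgh"
# Assumed edges, consistent with the degrees stated in Example 11.10.
E = [("a", "a"), ("a", "b"), ("b", "c"), ("c", "d"),
     ("c", "f"), ("c", "g"), ("f", "g"), ("d", "h")]

deg = degrees(V, E)
odd = [v for v in V if deg[v] % 2 == 1]

print(sum(deg.values()), 2 * len(E))   # 16 16
print(odd)                             # ['a', 'h']
```

The two odd-degree vertices illustrate the statement that the number of vertices of odd degree is even.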
This theorem provides some insight into the number of odd-degree vertices that can exist
in a graph.
COROLLARY 11.1 For any undirected graph or multigraph, the number of vertices of odd degree must be even.
Proof: We leave the proof for the reader.
We apply Theorem 11.2 in the following example.
EXAMPLE 11.11
An undirected graph (or multigraph) where each vertex has the same degree is called a
regular graph. If deg(v) = k for all vertices v, then the graph is called k-regular. Is it
possible to have a 4-regular graph with 10 edges?
From Theorem 11.2, 2|E| = 20 = 4|V|, so we have five vertices of degree 4. Figure
11.33 provides two nonisomorphic examples that satisfy the requirements.
(a) (b)
Figure 11.33
If we want each vertex to have degree 4, with 15 edges in the graph, we find that
2|E| = 30 = 4|V|, which has no integer solution for |V|, so no such graph is possible.
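The counting argument of this example amounts to a divisibility test. The Python sketch below (function name hypothetical) returns the number of vertices forced by Theorem 11.2, or None when no k-regular graph with the given number of edges can exist. Divisibility is only a necessary condition, so the sketch also exhibits $K_5$ as one concrete 4-regular graph with 10 edges; $K_5$ is used here as a convenient example and is not claimed to be one of the graphs in Fig. 11.33.

```python
from collections import Counter
from itertools import combinations


def regular_vertex_count(k, num_edges):
    """|V| for a k-regular graph with num_edges edges (k >= 1), or None if impossible."""
    total = 2 * num_edges                 # sum of all degrees, by Theorem 11.2
    return total // k if total % k == 0 else None


print(regular_vertex_count(4, 10))   # 5
print(regular_vertex_count(4, 15))   # None

# The complete graph K_5: 10 edges, every vertex of degree 4.
K5 = list(combinations(range(5), 2))
deg = Counter(v for e in K5 for v in e)
print(len(K5), set(deg.values()))    # 10 {4}
```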
Our next example introduces a regular graph that arises in the study of computer archi-
tecture.
EXAMPLE 11.12
The Hypercube. In order to build a parallel computer one needs to have multiple CPUs
(central processing units), where each such processor works on part of a problem. But often
we cannot actually decompose a problem completely, so at some point the processors (each
with its own memory) have to be able to communicate with one another.
We envisage this situation as follows. The accumulated data for a given problem are
taken from a central storage location and divided up among the processors. The processors
go through a phase where each computes on its own for a certain period of time and then
some intercommunication takes place. Then the processors return to computing on their
own and continue back and forth between operating individually and communicating with
one another. This situation adequately describes how parallel algorithms work in practice.
To model the communication between the processors we use a loop-free connected
undirected graph where each processor is assigned a vertex. When two processors, say p1,
p2, are able to communicate directly with one another we draw the edge {p1, p2} to represent
this (line of) possible communication. How can we decide on a model (that is, a graph) to
speed up the processing time? The complete graph (on all of our processors as vertices)
532 Chapter 11 An Introduction to Graph Theory
would be ideal — but prohibitively expensive because of all the necessary connections. On
the other hand, one can connect n processors along a path with n - 1 edges or on a cycle
with n edges. Another possible model is a grid (or, mesh) graph, examples of which are
shown in Fig. 11.34.
(a) Two-by-four grid (processors p1, ..., p15) (b) Three-by-three grid (processors p1, ..., p16)
Figure 11.34
But in these last three models the distances (as measured by the number of edges in
the shortest paths) between pairs of processors get longer and longer as the number of
processors increases. A compromise that weighs the number of edges (direct connections)
against the distance between pairs of vertices (processors) is embodied in the regular graph
called the hypercube.
For n ∈ N, the n-dimensional hypercube (or n-cube) is denoted by Q_n. It is a loop-free
connected undirected graph with 2^n vertices. For n ≥ 1, these vertices are labeled by the
2^n n-bit sequences representing 0, 1, 2, ..., 2^n - 1. For instance, Q3 has eight vertices,
labeled 000, 001, 010, 011, 100, 101, 110, and 111. Two vertices v1, v2 of Q_n are joined
by the edge {v1, v2} when the binary labels for v1, v2 differ in exactly one position. Then
for any vertices u, w in Q_n there is a shortest path of length d, where d is the number of
positions where the binary labels for u, w differ. [This ensures that Q_n is connected.]
Figure 11.35 shows Q_n for n = 0, 1, 2, 3. In general, for n ≥ 0, Q_{n+1} can be constructed
recursively from two copies of Q_n as follows. Prefix the vertex labels of one copy of Q_n
with 0 (call the result Q_{0,n}) and those of the other copy with 1 (call this result Q_{1,n}). For x in
Q_{0,n} and y in Q_{1,n} draw the edge {x, y} if the (newly prefixed) binary labels for x, y differ
only in the first (newly prefixed) position. The case for n = 3 (so n + 1 = 4) is demonstrated
in Fig. 11.36. The blue edges are the new edges described above for constructing Q4 from
two copies of Q3.
Q0 Q1 Q2 Q3
Figure 11.35
Figure 11.36
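The recursive construction of the hypercube from two prefixed copies can be sketched as follows. The representation (labels as strings, edges as ordered pairs of labels) and the function name `hypercube` are assumptions made for illustration.

```python
def hypercube(n):
    """Return (vertices, edges) of the n-cube, built recursively from two
    copies of the (n-1)-cube. Labels are n-bit strings."""
    if n == 0:
        return [""], set()              # one vertex with the empty label, no edges
    verts, edges = hypercube(n - 1)
    new_verts = ["0" + v for v in verts] + ["1" + v for v in verts]
    new_edges = set()
    for u, w in edges:                  # copy every old edge into both prefixed copies
        new_edges.add(("0" + u, "0" + w))
        new_edges.add(("1" + u, "1" + w))
    for v in verts:                     # join labels that differ only in the new prefix
        new_edges.add(("0" + v, "1" + v))
    return new_verts, new_edges

verts, edges = hypercube(4)
print(len(verts), len(edges))
```

Each recursive step doubles the previous edge set and adds one new edge per old vertex, so the edge count e(n) satisfies e(n) = 2e(n-1) + 2^(n-1), which agrees with the closed form n2^(n-1).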
In summary, we reiterate that for n ∈ N, the hypercube Q_n is an n-regular loop-free
undirected graph with 2^n vertices. Further, it is connected with the distance between any
two vertices at most n. From Theorem 11.2 it follows that Q_n has (1/2)n2^n = n2^(n-1) edges.
[Referring back to Example 10.33, we find that n2^(n-1) is likewise the number of edges for the
Hasse diagram of the partial order (P(X_n), ⊆), where X_n = {1, 2, 3, ..., n} and P(X_n)
is the power set of X_n. This is no mere coincidence! If we use the Gray code of Example
3.9 to label the vertices of this Hasse diagram, we find we have the hypercube Q_n.]
Finally, note that in Q4 there are 16 vertices (processors) and the longest distance between
vertices is 4. Contrast this with the grids in Fig. 11.34, where there are 15 vertices in part
(a) and 16 in part (b), yet the longest distance is 6 in both grids.
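The comparison between the hypercube and the two grids can be reproduced with a breadth-first search. The sketch below assumes that the two-by-four grid has 3 x 5 vertices and the three-by-three grid has 4 x 4 vertices, matching the vertex counts quoted above; the helper names are illustrative.

```python
from collections import deque
from itertools import product

def eccentricity(adj, source):
    """Largest breadth-first-search distance from source (connected graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return max(dist.values())

def diameter(adj):
    """Longest shortest-path distance between any pair of vertices."""
    return max(eccentricity(adj, v) for v in adj)

def grid(rows, cols):
    """Grid graph on rows x cols vertices, as an adjacency dictionary."""
    adj = {(r, c): [] for r in range(rows) for c in range(cols)}
    for r, c in adj:
        for dr, dc in ((0, 1), (1, 0)):      # neighbor to the right and below
            nb = (r + dr, c + dc)
            if nb in adj:
                adj[(r, c)].append(nb)
                adj[nb].append((r, c))
    return adj

def hypercube_adj(n):
    """n-cube as an adjacency dictionary; neighbors differ in one bit."""
    flip = {"0": "1", "1": "0"}
    return {v: [v[:i] + (flip[v[i]],) + v[i + 1:] for i in range(n)]
            for v in product("01", repeat=n)}

for name, g in [("Q4", hypercube_adj(4)), ("3x5 grid", grid(3, 5)), ("4x4 grid", grid(4, 4))]:
    num_edges = sum(len(nbrs) for nbrs in g.values()) // 2
    print(name, len(g), num_edges, diameter(g))
```

The trade-off is visible in the edge counts: the hypercube pays for its shorter distances with more connections than either grid.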
We turn now to the reason why Euler developed the idea of the degree of a vertex: to
solve the problem dealing with the seven bridges of Königsberg.
EXAMPLE 11.13 The Seven Bridges of Königsberg. During the eighteenth century, the city of Königsberg
(in East Prussia) was divided into four sections (including the island of Kneiphof) by the
Pregel River. Seven bridges connected these regions, as shown in Fig. 11.37(a). It was said
that residents spent their Sunday walks trying to find a way to walk about the city so as to
cross each bridge exactly once and then return to the starting point.
(a) (b)
Figure 11.37
In order to determine whether or not such a circuit existed, Euler represented the four
sections of the city and the seven bridges by the multigraph shown in Fig. 11.37(b). Here
he found four vertices with deg(a) = deg(c) = deg(d) = 3 and deg(b) = 5. He also found
that the existence of such a circuit depended on the number of vertices of odd degree in the
graph.
Before proving the general result, we give the following definition.
Definition 11.15 Let G = (V, E) be an undirected graph or multigraph with no isolated vertices. Then G is
said to have an Euler circuit if there is a circuit in G that traverses every edge of the graph
exactly once. If there is an open trail from a to b in G and this trail traverses each edge in
G exactly once, the trail is called an Euler trail.
The problem of the seven bridges is now settled as we characterize the graphs that have
an Euler circuit.
THEOREM 11.3 Let G = (V, E) be an undirected graph or multigraph with no isolated vertices. Then G
has an Euler circuit if and only if G is connected and every vertex in G has even degree.
Proof: If G has an Euler circuit, then for all a, b ∈ V there is a trail from a to b, namely, that
part of the circuit that starts at a and terminates at b. Therefore, it follows from Theorem 11.1
that G is connected.
Let s be the starting vertex of the Euler circuit. For any other vertex v of G, each time
the circuit comes to v it then departs from the vertex. Thus the circuit has traversed either
two (new) edges that are incident with v or a (new) loop at v. In either case a count of
2 is contributed to deg(v). Since v is not the starting point and each edge incident to v
is traversed only once, a count of 2 is obtained each time the circuit passes through v, so
deg(v) is even. As for the starting vertex s, the first edge of the circuit must be distinct from
the last edge, and because any other visit to s results in a count of 2 for deg(s), we have
deg(s) even.
Conversely, let G be connected with every vertex of even degree. If the number of edges
in G is 1 or 2, then G must be as shown in Fig. 11.38. Euler circuits are immediate in these
cases. We proceed now by induction and assume the result true for all situations where there
are fewer than n edges. If G has n edges, select a vertex s in G as a starting point to build an
Euler circuit. The graph (or multigraph) G is connected and each vertex has even degree,
so we can at least construct a circuit C containing s. (Verify this by considering the longest
trail in G that starts at s.) Should the circuit contain every edge of G, we are finished. If
not, remove the edges of the circuit from G, making sure to remove any vertex that would
become isolated. The remaining subgraph K has all vertices of even degree, but it may not
be connected. However, each component of K is connected and will have an Euler circuit.
(Why?) In addition, each of these Euler circuits has a vertex that is on C. Consequently,
starting at s we travel on C until we arrive at a vertex s1 that is on the Euler circuit of a
component C1 of K. Then we traverse this Euler circuit and, returning to s1, continue on
C until we reach a vertex s2 that is on the Euler circuit of component C2 of K. Since the
graph G is finite, as we continue this process we construct an Euler circuit for G.
Figure 11.38
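The constructive half of this proof, which builds a circuit and then splices in circuits through the edges that remain, can be sketched in code. The version below is the standard stack-based procedure usually attributed to Hierholzer rather than a literal transcription of the induction; the function name `euler_circuit` and the edge-list input format are assumptions for illustration.

```python
from collections import defaultdict

def euler_circuit(edges, start):
    """Return a list of vertices tracing an Euler circuit from `start`,
    or None if no Euler circuit exists.

    `edges` is a list of (a, b) pairs of an undirected multigraph with
    no isolated vertices; a pair (a, a) is a loop.
    """
    adj = defaultdict(list)                  # vertex -> list of (neighbor, edge index)
    for idx, (a, b) in enumerate(edges):
        adj[a].append((b, idx))
        adj[b].append((a, idx))
    if any(len(nbrs) % 2 for nbrs in adj.values()):
        return None                          # some vertex has odd degree
    used = [False] * len(edges)
    stack, circuit = [start], []
    while stack:
        v = stack[-1]
        while adj[v] and used[adj[v][-1][1]]:
            adj[v].pop()                     # skip edges already traversed
        if adj[v]:
            w, idx = adj[v].pop()
            used[idx] = True
            stack.append(w)                  # extend the current trail
        else:
            circuit.append(stack.pop())      # v has no unused edges: emit it
    if len(circuit) != len(edges) + 1:
        return None                          # some edge was unreachable: not connected
    return circuit

# Illustrative graph: two triangles sharing the vertex 3.
bowtie = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5), (5, 3)]
print(euler_circuit(bowtie, 1))
```

The stack holds the trail currently being extended; a vertex is emitted only when all its edges are used, which has the same effect as splicing the circuits of the remaining components into C.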
Should G be connected and not have too many vertices of odd degree, we can at least
find an Euler trail in G.
COROLLARY 11.2 If G is an undirected graph or multigraph with no isolated vertices, then we can construct
an Euler trail in G if and only if G is connected and has exactly two vertices of odd degree.
Proof: If G is connected and a and b are the vertices of G that have odd degree, add an
additional edge {a, b} to G. We now have a graph G1 that is connected and has every vertex
of even degree. Hence G1 has an Euler circuit C, and when the edge {a, b} is removed from
C, we obtain an Euler trail for G. (Thus the Euler trail starts at one of the vertices of odd
degree and terminates at the other odd vertex.) We leave the details of the converse for the
reader.
Returning now to the seven bridges of Königsberg, we realize that Fig. 11.37(b) is a
connected multigraph, but it has four vertices of odd degree. Consequently, it has no Euler
trail or Euler circuit.
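Theorem 11.3 and Corollary 11.2 together give a simple test, sketched below. The edge list for the Königsberg multigraph is an assumption: the figure is not reproduced here, so the seven edges were chosen only to match the degrees stated earlier (deg(a) = deg(c) = deg(d) = 3 and deg(b) = 5). The function name `euler_status` is likewise hypothetical.

```python
from collections import Counter

def euler_status(edges):
    """Classify a nonempty multigraph with no isolated vertices as having
    an Euler 'circuit', an Euler 'trail', or 'neither'."""
    deg = Counter()
    adj = {}
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    # Depth-first search from an arbitrary vertex to test connectivity.
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    if len(seen) != len(adj):
        return "neither"
    odd = sum(d % 2 for d in deg.values())
    return {0: "circuit", 2: "trail"}.get(odd, "neither")

# Assumed layout: b is the island, joined twice to a, twice to c, once to d.
konigsberg = [("a", "b"), ("a", "b"), ("b", "c"), ("b", "c"),
              ("a", "d"), ("b", "d"), ("c", "d")]
print(euler_status(konigsberg))
```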
Now that we have seen how the solution of an eighteenth-century problem led to the
start of graph theory, is there a somewhat more contemporary context in which we might
be able to apply what we have learned?
To answer this question (in the affirmative), we shall state the directed version of Theo-
rem 11.3. But first we need to refine the concept of the degree of a vertex.
Definition 11.16 Let G = (V, E) be a directed graph or multigraph. For each v ∈ V,
a) The incoming, or in, degree of v is the number of edges in G that are incident into v,
and this is denoted by id(v).
b) The outgoing, or out, degree of v is the number of edges in G that are incident from
v, and this is denoted by od(v).
For the case where the directed graph or multigraph contains one or more loops, each
loop at a given vertex v contributes a count of 1 to each of id(v) and od(v).
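The two degree counts, including the convention for loops, can be illustrated with a short sketch. The representation of a directed edge as a (tail, head) pair and the name `in_out_degrees` are assumptions made for this example.

```python
from collections import Counter

def in_out_degrees(arcs):
    """Return (indeg, outdeg) Counters for a directed multigraph whose
    edges are given as (tail, head) pairs."""
    indeg, outdeg = Counter(), Counter()
    for tail, head in arcs:
        outdeg[tail] += 1    # the edge is incident from its tail
        indeg[head] += 1     # and incident into its head
    return indeg, outdeg

# Illustrative directed graph: a directed triangle plus a loop at x.
arcs = [("x", "y"), ("y", "z"), ("z", "x"), ("x", "x")]
indeg, outdeg = in_out_degrees(arcs)
print(dict(indeg), dict(outdeg))
```

Since a loop (x, x) has x as both tail and head, it adds 1 to id(x) and 1 to od(x) without any special case in the code.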
The concepts of the in degree and the out degree for vertices now lead us to the following
theorem.
THEOREM 11.4 Let G = (V, E) be a directed graph or multigraph with no isolated vertices. The graph G
has a directed Euler circuit if and only if G is connected and id(v) = od(v) for all v ∈ V.
Proof: The proof of this theorem is left for the reader.
At this time we consider an application of Theorem 11.4. This example is based on a
telecommunication problem given by C. L. Liu on pages 176-178 of reference [23].
EXAMPLE 11.14 In Fig. 11.39(a) we have the surface of a rotating drum that is divided into eight sectors of
equal area. In part (b) of the figure we have placed conducting (shaded sectors and inner circle)
and nonconducting (unshaded sectors) material on the drum. When the three terminals
(shown in the figure) make contact with the three designated sectors, the nonconducting
material results in no flow of current and a 1 appears on the display of a digital device.
For the sectors with the conducting material, a flow of current takes place and a 0 appears
on the display in each case. If the drum were rotated 45 degrees (clockwise), the screen
would read 110 (from top to bottom). So we can obtain at least two (namely, 100 and 110)
of the eight binary representations from 000 (for 0) to 111 (for 7). But can we represent all
eight of them as the drum continues to rotate? And could we extend the problem to the 16
four-bit binary representations from 0000 through 1111, and perhaps generalize the results
even further?
(a) (b)
Figure 11.39
To answer the question for the problem in the figure, we construct a directed graph
G = (V, E), where V = {00, 01, 10, 11} and E is constructed as follows: If b1b2, b2b3 ∈ V,
draw the edge (b1b2, b2b3). This results in the directed graph of Fig. 11.40(a), where |E| = 8.
We see that this graph is connected and that for all v ∈ V, id(v) = od(v). Consequently,
by Theorem 11.4, it has a directed Euler circuit. One such circuit is given by
$10 \xrightarrow{100} 00 \xrightarrow{000} 00 \xrightarrow{001} 01 \xrightarrow{010} 10 \xrightarrow{101} 01 \xrightarrow{011} 11 \xrightarrow{111} 11 \xrightarrow{110} 10$
Here the label on each edge e = (a, c), as shown in part (b) of Fig. 11.40, is the three-bit
sequence x1x2x3, where a = x1x2 and c = x2x3. Since the vertices of G are the four distinct
two-bit sequences 00, 01, 10, and 11, the labels on the eight edges of G determine the eight
distinct three-bit sequences. Also, any two consecutive edge labels in the Euler circuit are
of the form y1y2y3 and y2y3y4.
Starting with the edge label 100, in order to get the next label, 000, we concatenate the
last bit in 000, namely 0, to the string 100. The resulting string 1000 then provides 100
(its first three bits) and 000 (its last three bits). The next edge label is 001, so we concatenate the 1 (the last bit in
001) to our present string 1000 and get 10001, which provides the three distinct three-bit
sequences 100, 000, and 001 as its consecutive three-bit windows. Continuing in this way, we arrive at
the eight-bit sequence 10001011 (where the last 1 is wrapped around), and these eight bits
are then arranged in the sectors of the rotating drum as in Fig. 11.41. It is from this figure
that the result in Fig. 11.39(b) is obtained. And as the drum in Fig. 11.39(b) rotates, all of
the eight three-bit sequences 100, 110, 111, 011, 101, 010, 001, and 000 are obtained.
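The claims of this example can be verified mechanically. The sketch below rebuilds the labeled directed graph for three-bit windows, confirms that id(v) = od(v) at every vertex, and checks that the cyclic sequence 10001011 yields all eight windows in an order that follows an Euler circuit. The helper names `drum_graph` and `cyclic_windows` are illustrative assumptions.

```python
from itertools import product

def drum_graph(k):
    """Map each k-bit label b1...bk to its directed edge
    (b1...b(k-1), b2...bk) on the (k-1)-bit vertices."""
    return {"".join(bits): ("".join(bits[:-1]), "".join(bits[1:]))
            for bits in product("01", repeat=k)}

def cyclic_windows(seq, k):
    """All length-k windows of seq, reading around the drum cyclically."""
    doubled = seq + seq[:k - 1]
    return [doubled[i:i + k] for i in range(len(seq))]

arcs = drum_graph(3)
vertices = ["00", "01", "10", "11"]
indeg = {v: sum(1 for _, head in arcs.values() if head == v) for v in vertices}
outdeg = {v: sum(1 for tail, _ in arcs.values() if tail == v) for v in vertices}

windows = cyclic_windows("10001011", 3)
# Consecutive windows must be consecutive edges: head of one is tail of the next.
follows_circuit = all(arcs[a][1] == arcs[b][0]
                      for a, b in zip(windows, windows[1:] + windows[:1]))
print(windows, follows_circuit)
```

Replacing 3 by 4 in `drum_graph` gives the graph on three-bit vertices with sixteen edges, which is the setting for the four-bit version of the problem raised above.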
Figure 11.40 Figure 11.41
In closing this section, we wish to call the reader’s attention to reference [24] by Anthony
Ralston. This article is a good source for more ideas and generalizations related to the
problem discussed in Example 11.14.
1. Determine |V| for the following graphs or multigraphs G.
a) G has nine edges and all vertices have degree 3.
b) G is regular with 15 edges.
c) G has 10 edges with two vertices of degree 4 and all others of degree 3.
2. If G = (V, E) is a connected graph with |E| = 17 and deg(v) ≥ 3 for all v ∈ V, what is the maximum value for |V|?
3. Let G = (V, E) be a connected undirected graph.
a) What is the largest possible value for |V| if |E| = 19 and deg(v) ≥ 4 for all v ∈ V?
b) Draw a graph to demonstrate each possible case in part (a).
4. a) Let G = (V, E) be a loop-free undirected graph, where |V| = 6 and deg(v) = 2 for all v ∈ V. Up to isomorphism how many such graphs G are there?
b) Answer part (a) for |V| = 7.
c) Let G1 = (V1, E1) be a loop-free undirected 3-regular graph with |V1| = 6. Up to isomorphism how many such graphs G1 are there?
d) Answer part (c) for |V1| = 7 and G1 4-regular.
e) Generalize the results in parts (c) and (d).
5. Let G1 = (V1, E1) and G2 = (V2, E2) be the loop-free undirected connected graphs in Fig. 11.42.
a) Determine |V1|, |E1|, |V2|, and |E2|.
b) Find the degree of each vertex in V1. Do likewise for each vertex in V2.
c) Are the graphs G1 and G2 isomorphic?
G1 = (V1, E1) G2 = (V2, E2)
Figure 11.42
6. Let V = {a, b, c, d, e, f}. Draw three nonisomorphic loop-free undirected graphs G1 = (V, E1), G2 = (V, E2), and G3 = (V, E3), where, in all three graphs, we have deg(a) = 3, deg(b) = deg(c) = 2, and deg(d) = deg(e) = deg(f) = 1.
7. a) How many different paths of length 2 are there in the undirected graph G in Fig. 11.43?
b) Let G = (V, E) be a loop-free undirected graph, where V = {v1, v2, ..., vn} and deg(vi) = di, for all 1 ≤ i ≤ n. How many different paths of length 2 are there in G?
Figure 11.43
8. a) Find the number of edges in Q8.
b) Find the maximum distance between pairs of vertices in Q8. Give an example of one such pair that achieves this distance.
c) Find the length of a longest path in Q8.
9. a) What is the dimension of the hypercube with 524,288 edges?
b) How many vertices are there for a hypercube with 4,980,736 edges?
10. For n ∈ Z+, how many distinct (though isomorphic) paths of length 2 are there in the n-dimensional hypercube Q_n?
11. Let n ∈ Z+, with n ≥ 9. Prove that if the edges of K_n can be partitioned into subgraphs isomorphic to cycles of length 4 (where any two such cycles share no common edge), then n = 8k + 1 for some k ∈ Z+.
12. a) For n ≥ 2, let V denote the vertices in Q_n. For 1 ≤ k < ℓ ≤ n, define the relation R on V as follows: If w, x ∈ V, then w R x if w and x have the same bit (0, or 1) in position k and the same bit (0, or 1) in position ℓ of their binary labels. [For example, if n = 7 and k = 3, ℓ = 6, then 1100010 R 0000011.] Show that R is an equivalence relation. How many blocks are there for this equivalence relation? How many vertices are there in each block? Describe the subgraph of Q_n induced by the vertices in each block.
b) Generalize the results of part (a).
13. If G is an undirected graph with n vertices and e edges, let δ = min_{v ∈ V} {deg(v)} and let Δ = max_{v ∈ V} {deg(v)}. Prove that δ ≤ 2(e/n) ≤ Δ.
14. Let G = (V, E), H = (V’, E’) be undirected graphs with f: V → V’ establishing an isomorphism between the graphs. (a) Prove that f^(-1): V’ → V is also an isomorphism for G and H. (b) If a ∈ V, prove that deg(a) (in G) = deg(f(a)) (in H).
15. For all k ∈ Z+ where k ≥ 2, prove that there exists a loop-free connected undirected graph G = (V, E), where |V| = 2k and deg(v) = 3 for all v ∈ V.
16. Prove that for each n ∈ Z+ there exists a loop-free connected undirected graph G = (V, E), where |V| = 2n and which has two vertices of degree i for every 1 ≤ i ≤ n.
17. Complete the proofs of Corollaries 11.1 and 11.2.
18. Let k be a fixed positive integer and let G = (V, E) be a loop-free undirected graph, where deg(v) ≥ k for all v ∈ V. Prove that G contains a path of length k.
19. a) Explain why it is not possible to draw a loop-free connected undirected graph with eight vertices, where the degrees of the vertices are 1, 1, 1, 2, 3, 4, 5, and 7.
b) Give an example of a loop-free connected undirected multigraph with eight vertices, where the degrees of the vertices are 1, 1, 1, 2, 3, 4, 5, and 7.
20. a) Find an Euler circuit for the graph in Fig. 11.44.
b) If the edge {d, e} is removed from this graph, find an Euler trail for the resulting subgraph.
Figure 11.44
21. Determine the value(s) of n for which the complete graph K_n has an Euler circuit. For which n does K_n have an Euler trail but not an Euler circuit?
22. For the graph in Fig. 11.37(b), what is the smallest number of bridges that must be removed so that the resulting subgraph has an Euler trail but not an Euler circuit? Which bridge(s) should we remove?
23. When visiting a chamber of horrors, Paul and David try to figure out whether they can travel through the seven rooms and surrounding corridor of the attraction without passing through any door more than once. If they must start from the starred position in the corridor shown in Fig. 11.45, can they accomplish their goal?
24. Let G = (V, E) be a directed graph, where |V| = n and |E| = e. What are the values for $\sum_{v \in V} id(v)$ and $\sum_{v \in V} od(v)$?
25. a) Find the maximum length of a trail in
i) K6 ii) K8 iii) K10 iv) K_{2n}, n ∈ Z+
Figure 11.45
b) Find the maximum length of a circuit in
i) K6 ii) K8 iii) K10 iv) K_{2n}, n ∈ Z+
26. a) Let G = (V, E) be a directed graph or multigraph with no isolated vertices. Prove that G has a directed Euler circuit if and only if G is connected and od(v) = id(v) for all v ∈ V.
b) A directed graph is called strongly connected if there is a directed path from a to b for all vertices a, b, where a ≠ b. Prove that if a directed graph has a directed Euler circuit, then it is strongly connected. Is the converse true?
27. Let G be a directed graph on n vertices. If the associated undirected graph for G is K_n, prove that $\sum_{v \in V} [od(v)]^2 = \sum_{v \in V} [id(v)]^2$.
28. If G = (V, E) is a directed graph or multigraph with no isolated vertices, prove that G has a directed Euler trail if and only if (i) G is connected; (ii) od(v) = id(v) for all but two vertices x, y in V; and (iii) od(x) = id(x) + 1, id(y) = od(y) + 1.
29. Let V = {000, 001, 010, ..., 110, 111}. For each four-bit sequence b1b2b3b4 draw an edge from the element b1b2b3 to the element b2b3b4 in V. (a) Draw the graph G = (V, E) as described. (b) Find a directed Euler circuit for G. (c) Equally space eight 0’s and eight 1’s around the edge of a rotating (clockwise) drum so that these 16 bits form a circular sequence where the (consecutive) subsequences of length 4 provide the binary representations of 0, 1, 2, ..., 14, 15 in some order.
30. Carolyn and Richard attended a party with three other married couples. At this party a good deal of handshaking took place, but (1) no one shook hands with her or his spouse; (2) no one shook hands with herself or himself; and (3) no one shook hands with anyone more than once. Before leaving the party, Carolyn asked the other seven people how many hands she or he had shaken. She received a different answer from each of the seven. How many times did Carolyn shake hands at this party? How many times did Richard?
31. Let G = (V, E) be a loop-free connected undirected graph with |V| ≥ 2. Prove that G contains two vertices v, w, where deg(v) = deg(w).
32. If G = (V, E) is an undirected graph with |V| = n and |E| = k, the following matrices are used to represent G.
Let V = {v1, v2, ..., vn}. Define the adjacency matrix A = (a_ij)_{n×n} where a_ij = 1 if {v_i, v_j} ∈ E, otherwise a_ij = 0.
If E = {e1, e2, ..., ek}, the incidence matrix I is the n × k matrix (b_ij)_{n×k} where b_ij = 1 if v_i is a vertex on the edge e_j, otherwise b_ij = 0.
a) Find the adjacency and incidence matrices associated with the graph in Fig. 11.46.
b) Calculating A^2 and using the Boolean operations where 0 + 0 = 0, 0 + 1 = 1 + 0 = 1 + 1 = 1, and 0 · 0 = 0 · 1 = 1 · 0 = 0, 1 · 1 = 1, prove that the entry in row i and column j of A^2 is 1 if and only if there is a walk of length 2 between the ith and jth vertices of V.
c) If we calculate A^2 using ordinary addition and multiplication, what do the entries in the matrix reveal about G?
d) What is the column sum for each column of A? Why?
e) What is the column sum for each column of I? Why?
Figure 11.46
33. Determine whether or not the loop-free undirected graphs with the following adjacency matrices are isomorphic.
a) $\begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix}$, $\begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}$
b) $\begin{bmatrix} 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{bmatrix}$, $\begin{bmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \end{bmatrix}$
c) $\begin{bmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}$, $\begin{bmatrix} 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \end{bmatrix}$
34. Determine whether or not the loop-free undirected graphs with the following incidence matrices are isomorphic.
a) $\begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 0 \end{bmatrix}$, $\begin{bmatrix} 0 & 1 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}$
b) $\begin{bmatrix} 1 & 0 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$, $\begin{bmatrix} 1 & 0 & 0 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix}$
c)
0 0 0 ] 1 1 0 0
1 1 0 1 0 1 1 =O
1 0 1 0 |? 0 0 0 1
0 1 1 0 10 1 1 0
35. There are 15 people at a party. Is it possible for each of these people to shake hands with (exactly) three others?
36. Consider the two-by-four grid in Fig. 11.34. Assign the partial Gray code A = {00, 01, 11} to the three horizontal levels: top (00), middle (01), and bottom (11). Now assign the partial Gray code B = {000, 001, 011, 010, 110} to the five vertical levels: left, or first (000), second (001), third (011), fourth (010), and right, or fifth (110). Use the elements of A × B to label the 15 processors of this grid; for example, p1 is labeled (00, 000), p2 is labeled (00, 001), p8 is labeled (01, 011), p14 is labeled (11, 010), and p15 is labeled (11, 110). Show that the two-by-four grid is isomorphic to a subgraph of the hypercube Q5. (Thus we can consider the two-by-four grid to be embedded in the hypercube Q5.)
37. Prove that the three-by-three grid of Fig. 11.34 is isomorphic to a subgraph of the hypercube Q4.
11.4
Planar Graphs
On a road map the lines indicating the roads and highways usually intersect only at junctions
or towns. But sometimes roads seem to intersect when one road is located above another,
as in the case of an overpass. In this case the two roads are at different levels, or planes.
This type of situation leads us to the following definition.
Definition 11.17 A graph (or multigraph) G is called planar if G can be drawn in the plane with its edges
intersecting only at vertices of G. Such a drawing of G is called an embedding of G in the
plane.
EXAMPLE 11.15 The graphs in Fig. 11.47 are planar. The first is a 3-regular graph, because each vertex has
degree 3; it is planar because no edges intersect except at the vertices. In graph (b) it appears
that we have a nonplanar graph; the edges {x, z} and {w, y} overlap at a point other than a
vertex. However, we can redraw this graph as shown in part (c) of the figure. Consequently,
K4 is planar.
(a) (b) (c)
Figure 11.47
Just as K4 is planar, so are the graphs K1, K2, and K3.
EXAMPLE 11.16 An attempt to embed K5 in the plane is shown in Fig. 11.48. If K5 were planar, then any
embedding would have to contain the pentagon in part (a) of the figure. Since a complete
graph contains an edge for every pair of distinct vertices, we add edge {a, c} as shown in
part (b). This edge is contained entirely within the interior of the pentagon in part (a). (We
could have drawn the edge in the exterior region determined by the pentagon. The reader
will be asked in the exercises to show that the same conclusion arises in this case.) Moving
to part (c), we add in the edges {a, d}, {c, e}, and {b, e}. Now we consider the vertices b and
d. We need the edge {b, d} in order to have K5. Vertex d is inside the region formed by the
cycle edges {a, c}, {c, e}, and {e, a}, whereas b is outside the region. Thus in drawing the
edge {b, d}, we must intersect one of the existing edges at least once, as shown by the dotted
edges in part (d). Consequently, K5 is nonplanar. (Since this proof appeals to a diagram, it
definitely lacks rigor. However, later in the section we shall prove that K5 is nonplanar by
another method.)
(a) (b) (c) (d)
Figure 11.48
Before we can characterize all nonplanar graphs we need to examine another class of
graphs.
Definition 11.18 A graph G = (V, E) is called bipartite if V = V1 ∪ V2 with V1 ∩ V2 = ∅, and every edge
of G is of the form {a, b} with a ∈ V1 and b ∈ V2. If each vertex in V1 is joined with every
vertex in V2, we have a complete bipartite graph. In this case, if |V1| = m, |V2| = n, the
graph is denoted by Km,n.
EXAMPLE 11.17 Figure 11.49 indicates how we may partition the vertices of the hypercubes Q1, Q2, Q3 to
demonstrate that these graphs are bipartite. In general, for each n ≥ 1, partition the vertices
of Q_n as V1 ∪ V2, where V1 consists of all vertices whose binary labels have an even number
of 1’s, while V2 consists of those whose binary labels have an odd number of 1’s. Could
there exist an edge {x, y} in Q_n where x, y ∈ V1? Recall that edges in Q_n connect vertices
that differ in exactly one of the n positions in their binary labels. Suppose that the binary
labels of x, y differ only in position i, for some 1 ≤ i ≤ n. Then the total number of 1’s
in the binary labels for x, y is 2 · [the number of 1’s in x (or y) in all positions other than
position i] + 1, an odd total. But with x, y ∈ V1, their binary labels each contain an even
number of 1’s, so the total number of 1’s in these binary labels is even! This contradiction
tells us that there is no edge {x, y} in Q_n where x, y ∈ V1. A similar argument can be given
to rule out the possibility of an edge {u, w}, where u, w ∈ V2. Consequently, Q_n is bipartite
for all n ≥ 1.
(Q1) V1 = {0}, V2 = {1}
(Q2) V1 = {00, 11}, V2 = {01, 10}
(Q3) V1 = {000, 011, 101, 110}, V2 = {001, 010, 100, 111}
Figure 11.49
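The parity argument for the hypercube can be confirmed by brute force for small n. This sketch assumes vertices are bit strings and regenerates the neighbors of each vertex by flipping one bit; the function name `parity_partition_is_bipartition` is illustrative.

```python
from itertools import product

def parity_partition_is_bipartition(n):
    """Check that every edge of the n-cube joins a label with an even
    number of 1's to a label with an odd number of 1's."""
    labels = ["".join(bits) for bits in product("01", repeat=n)]
    side = {v: v.count("1") % 2 for v in labels}     # 0 for V1, 1 for V2
    for v in labels:
        for i in range(n):
            # The neighbor of v obtained by flipping the bit in position i.
            w = v[:i] + ("1" if v[i] == "0" else "0") + v[i + 1:]
            if side[v] == side[w]:
                return False
    return True

# The set V1 for the 3-cube: labels with an even number of 1's.
v1_for_q3 = sorted("".join(b) for b in product("01", repeat=3)
                   if b.count("1") % 2 == 0)
print(v1_for_q3, all(parity_partition_is_bipartition(n) for n in range(1, 7)))
```

Flipping a single bit changes the number of 1's by exactly one, so the check can never fail; the code simply makes that observation concrete.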
Figure 11.50 shows two bipartite graphs. The graph in part (a) satisfies the definition
for V1 = {a, b} and V2 = {c, d, e}. If we add the edges {b, d} and {b, c}, the result is
the complete bipartite graph K2,3, which is planar. Graph (b) of the figure is K3,3. Let
V1 = {h1, h2, h3} and V2 = {u1, u2, u3}, and interpret V1 as a set of houses and V2 as a set
of utilities. Then K3,3 is called the utility graph. Can we hook up each of the houses with
each of the utilities and avoid having overlapping utility lines? In Fig. 11.50(b) it appears
that this is not possible and that K3,3 is nonplanar. (Once again we deduce the nonplanarity
of a graph from a diagram. However, we shall verify that K3,3 is nonplanar by another
method, later in Example 11.21 of this section.)
(a) (b)
Figure 11.50
We shall see that when we are dealing with nonplanar graphs, either K5 or K3,3 will be
the source of the problem. Before stating the general result, however, we need to develop
one final new idea.
Definition 11.19 Let G = (V, E) be a loop-free undirected graph, where E ≠ ∅. An elementary subdivision
of G results when an edge e = {u, w} is removed from G and then the edges {u, v}, {v, w}
are added to G - e, where v ∉ V.
The loop-free undirected graphs G1 = (V1, E1) and G2 = (V2, E2) are called homeomorphic
if they are isomorphic or if they can both be obtained from the same loop-free
undirected graph H by a sequence of elementary subdivisions.
EXAMPLE 11.18 a) Let G = (V, E) be a loop-free undirected graph with |E| ≥ 1. If G’ is obtained from
G by an elementary subdivision, then the graph G’ = (V’, E’) satisfies |V’| = |V| + 1
and |E’| = |E| + 1.
b) Consider the graphs G, G1, G2, and G3 in Fig. 11.51. Here G1 is obtained from G
by means of one elementary subdivision: Delete edge {a, b} from G and then add
the edges {a, w} and {w, b}. The graph G2 is obtained from G by two elementary
subdivisions. Hence G1 and G2 are homeomorphic. Also, G3 can be obtained from G
by four elementary subdivisions, so G3 is homeomorphic to both G1 and G2.
(a) G (b) G1 (c) G2 (d) G3
Figure 11.51
However, we cannot obtain G1 from G2 (or G2 from G1) by a sequence of elementary
subdivisions. Furthermore, the graph G3 can be obtained from either G1 or G2
by a sequence of elementary subdivisions: six (such sequences of three elementary
subdivisions) for G1 and two for G2. But neither G1 nor G2 can be obtained from G3
by a sequence of elementary subdivisions.
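An elementary subdivision is easy to express as an operation on vertex and edge sets, which also makes the counts in part (a) of the example immediate. In this sketch the starting graph is a hypothetical 4-cycle on a, b, d, e (the graph G of Fig. 11.51 is not reproduced here), edges are stored as two-element frozensets, and the function name `subdivide` is an assumption.

```python
def subdivide(vertices, edges, edge, new_vertex):
    """Return (V', E') after one elementary subdivision: remove `edge`
    = {u, w} and add {u, new_vertex} and {new_vertex, w}."""
    u, w = edge
    if frozenset(edge) not in edges:
        raise ValueError("edge to subdivide is not in the graph")
    if new_vertex in vertices:
        raise ValueError("the new vertex must not already be in V")
    new_edges = (edges - {frozenset(edge)}) | {
        frozenset((u, new_vertex)),
        frozenset((new_vertex, w)),
    }
    return vertices | {new_vertex}, new_edges

# Hypothetical starting graph: the cycle a - b - d - e - a.
V = {"a", "b", "d", "e"}
E = {frozenset(p) for p in [("a", "b"), ("b", "d"), ("d", "e"), ("e", "a")]}

V1, E1 = subdivide(V, E, ("a", "b"), "w")
print(len(V), len(E), len(V1), len(E1))
```

The new vertex always has degree 2, which is the sense in which homeomorphic graphs agree except, possibly, for vertices of degree 2.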
One may think of homeomorphic graphs as being isomorphic except, possibly, for ver-
tices of degree 2. In particular, if two graphs are homeomorphic, they are either both planar
or they are both nonplanar.
These preliminaries lead us to the following result.
THEOREM 11.5 Kuratowski’s Theorem. A graph is nonplanar if and only if it contains a subgraph that is
homeomorphic to either K5 or K3,3.
Proof: (This theorem was first proved by the Polish mathematician Kasimir Kuratowski in
1930.) If a graph G has a subgraph homeomorphic to either K5 or K3,3, it is clear that G
is nonplanar. The converse of this theorem, however, is much more difficult to prove. (A
proof can be found in Chapter 8 of C. L. Liu [23] or Chapter 6 of D. B. West [32].)
We demonstrate the use of Kuratowski’s Theorem in the following example.
EXAMPLE 11.19 a) Figure 11.52(a) is a familiar graph called the Petersen graph. Part (b) of the figure
provides a subgraph of the Petersen graph that is homeomorphic to K3,3. (Figure 11.53
shows how the subgraph is obtained from K3,3 by a sequence of four elementary
subdivisions.) Hence the Petersen graph is nonplanar.
b) In part (a) of Fig. 11.54 we find the 3-regular graph G, which is isomorphic to the 3-dimensional
hypercube Q3. The 4-regular complement of G is shown in Fig. 11.54(b),
where the edges {a, g} and {d, f} suggest that the complement of G may be nonplanar. Figure 11.54(c)
depicts a subgraph H of the complement of G that is homeomorphic to K5, so by Kuratowski’s Theorem
it follows that the complement of G is nonplanar.
(a) (b)
Figure 11.52
(i) (ii) (iii) (iv) (v)
Figure 11.53
(a) G (isomorphic to Q3) (b) The complement of G (c) H
Figure 11.54
When a graph or multigraph is planar and connected, we find the following relation,
which was discovered by Euler. For this relation we need to be able to count the number
of regions determined by a planar connected graph or multigraph — the number (of these
regions) being defined only when we have a planar embedding of the graph. For instance,
the planar embedding of K4 in part (a) of Fig. 11.55 demonstrates how this depiction of K4
determines four regions in the plane: three of finite area (namely, R1, R2, and R3) and
the infinite region R4. When we look at Fig. 11.55(b) we might think that here K4 determines
five regions, but this depiction does not present a planar embedding of K4. So the result in
Fig. 11.55(a) is the only one we actually want to deal with here.
Figure 11.55 (panels (a) and (b); regions R1 through R4 are marked in panel (a))
THEOREM 11.6 Let G = (V, E) be a connected planar graph or multigraph with |V| = v and |E| = e. Let r
be the number of regions in the plane determined by a planar embedding (or, depiction) of
G; one of these regions has infinite area and is called the infinite region. Then v - e + r = 2.
Proof: The proof is by induction on e. If e = 0 or 1, then G is isomorphic to one of the graphs in
Fig. 11.56. The graph in part (a) has v = 1, e = 0, and r = 1; so, v - e + r = 1 - 0 + 1 = 2.
For graph (b), v = 1, e = 1, and r = 2. Graph (c) has v = 2, e = 1, and r = 1. In both cases,
v - e + r = 2.
Figure 11.56 (panels (a), (b), and (c))
Now let k ∈ N and assume that the result is true for every connected planar graph or
multigraph with e edges, where 0 ≤ e ≤ k. If G = (V, E) is a connected planar graph or
multigraph with v vertices, r regions, and e = k + 1 edges, let a, b ∈ V with {a, b} ∈ E.
Consider the subgraph H of G obtained by deleting the edge {a, b} from G. (If G is a
multigraph and {a, b} is one of a set of edges between a and b, then we remove it only
once.) Consequently, we may write H = G - {a, b} or G = H + {a, b}. We consider the
following two cases, depending on whether H is connected or disconnected.
Case 1: The results in parts (a), (b), (c), and (d) of Fig. 11.57 show us how a graph G may be
obtained from a connected graph H when the (new) loop {a, a} is drawn as in parts (a) and
(b) or when the (new) edge {a, b} joins two distinct vertices in H as in parts (c) and (d). In all
of these situations, H has v vertices, k edges, and r - 1 regions because one of the regions
for H is split into two regions for G. The induction hypothesis applied to graph H tells us
that v - k + (r - 1) = 2, and from this it follows that 2 = v - (k + 1) + r = v - e + r.
So Euler’s Theorem is true for G in this case.
Figure 11.57
Case 2: Now we consider the case where G - {a, b} = H is a disconnected graph [as
demonstrated in Fig. 11.57(e) and (f)]. Here H has v vertices, k edges, and r regions. Also,
H has two components H1 and H2, where Hi has vi vertices, ei edges, and ri regions,
for i = 1, 2. [Part (e) of Fig. 11.57 indicates that one component could consist of just
an isolated vertex.] Furthermore, v1 + v2 = v, e1 + e2 = k (= e - 1), and r1 + r2 = r + 1
because each of H1 and H2 determines an infinite region. When we apply the induction
hypothesis to each of H1 and H2 we learn that
v1 - e1 + r1 = 2 and v2 - e2 + r2 = 2.
Consequently, (v1 + v2) - (e1 + e2) + (r1 + r2) = v - (e - 1) + (r + 1) = 4, and from
this it follows that v - e + r = 2, thus establishing Euler’s Theorem for G in this case.
The following corollary for Theorem 11.6 provides two inequalities relating the number
of edges in a loop-free connected planar graph G with (1) the number of regions determined
by a planar embedding of G; and (2) the number of vertices in G. Before we examine this
corollary, however, let us look at the following helpful idea. For each region R in a planar
embedding of a (planar) graph or multigraph, the degree of R, denoted deg(R), is the number
of edges traversed in a (shortest) closed walk about (the edges in) the boundary of R. If
G = (V, E) is the graph of Fig. 11.58(a), then this planar embedding of G has four regions
where
deg(R1) = 5, deg(R2) = 3, deg(R3) = 3, deg(R4) = 7.
[Here deg(R4) = 7, as determined by the closed walk: a → b → g → h → g → f → d → a.]
Part (b) of the figure shows a second planar embedding of G, again with four regions,
and here
deg(R5) = 4, deg(R6) = 3, deg(R7) = 5, deg(R8) = 6.
[The closed walk b → g → h → g → f → b gives us deg(R7) = 5.]
We see that deg(R1) + deg(R2) + deg(R3) + deg(R4) = 18 = deg(R5) + deg(R6) + deg(R7) + deg(R8)
= 2 · 9 = 2|E|. This is true in general
because each edge of the planar embedding is either part of the boundary of two regions
[like {b, c} in parts (a) and (b)] or occurs twice in the closed walk about the edges in the
boundary for one region [like {g, h} in parts (a) and (b)].
Figure 11.58 (panels (a) and (b): two planar embeddings of G, with regions R1 through R4 and R5 through R8)
Now let us consider the following.
COROLLARY 11.3 Let G = (V, E) be a loop-free connected planar graph with |V| = v, |E| = e > 2, and r
regions. Then 3r ≤ 2e and e ≤ 3v - 6.
Proof: Since G is loop-free and is not a multigraph, the boundary of each region (including
the infinite region) contains at least three edges; hence, each region has degree ≥ 3.
Consequently, 2e = 2|E| = the sum of the degrees of the r regions determined by G and
2e ≥ 3r. From Euler’s Theorem, 2 = v - e + r ≤ v - e + (2/3)e = v - (1/3)e, so
6 ≤ 3v - e, or e ≤ 3v - 6.
We now consider what this corollary does and does not imply. If G = (V, E) is a loop-free
connected graph with |E| > 2, then if e > 3v - 6, it follows that G is not planar.
However, if e ≤ 3v - 6, we cannot conclude that G is planar.
EXAMPLE 11.20 The graph K5 is loop-free and connected with ten edges and five vertices. Consequently,
3v - 6 = 15 - 6 = 9 < 10 = e. Therefore, by Corollary 11.3, we find that K5 is nonplanar.
EXAMPLE 11.21 The graph K3,3 is loop-free and connected with nine edges and six vertices. Here 3v - 6 =
18 - 6 = 12 > 9 = e. It would be a mistake to conclude from this that K3,3 is planar. It
would be the mistake of arguing by the converse.
However, K3,3 is nonplanar. If K3,3 were planar, then since each region in the graph is
bounded by at least four edges (K3,3 is bipartite, so it has no cycle of length 3), we have
4r ≤ 2e. (We found a similar situation in the proof of
Corollary 11.3.) From Euler’s Theorem, v - e + r = 2, or r = e - v + 2 = 9 - 6 + 2 = 5,
so 20 = 4r ≤ 2e = 18. From this contradiction we have K3,3 being nonplanar.
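The edge-count arguments of Examples 11.20 and 11.21 can be sketched in a few lines of Python. This is only an illustration of the inequalities, not a planarity algorithm: the function names `euler_regions` and `passes_edge_count_test` and the parameter `min_region_degree` are our own hypothetical names, and a result of True means only that the inequality fails to rule planarity out.

```python
def euler_regions(v, e):
    """Number of regions r = e - v + 2 forced by Theorem 11.6
    for a connected planar graph with v vertices and e edges."""
    return e - v + 2


def passes_edge_count_test(v, e, min_region_degree=3):
    """Necessary condition for planarity: min_region_degree * r <= 2e.

    With min_region_degree = 3 this is the inequality e <= 3v - 6 of
    Corollary 11.3 (which assumes e > 2). True means "not ruled out";
    it never proves that the graph is planar.
    """
    r = euler_regions(v, e)
    return min_region_degree * r <= 2 * e


print(passes_edge_count_test(5, 10))    # K5: False, so K5 is nonplanar
print(passes_edge_count_test(6, 9))     # K3,3 with bound 3: True (inconclusive)
print(passes_edge_count_test(6, 9, 4))  # K3,3 with bound 4: False, so nonplanar
```

The third call uses the bound 4 because every region of a bipartite graph is bounded by at least four edges, which is exactly the refinement used in Example 11.21.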
EXAMPLE 11.22 We use Euler’s Theorem to characterize the Platonic solids. [For these solids all faces are
congruent and all (interior) solid angles are equal.] In Fig. 11.59 we have two of these
solids. Part (a) of the figure shows the regular tetrahedron, which has four faces, each an
equilateral triangle. Concentrating on the edges of the tetrahedron, we focus on its underlying
framework. As we view this framework from a point directly above the center of one of the
faces, we picture the planar representation in part (b). This planar graph determines four
regions (corresponding to the four faces); three regions meet at each of the four vertices.
Part (c) of the figure provides another Platonic solid, the cube. Its associated planar graph
is given in part (d). In this graph there are six regions with three regions meeting at each
vertex.
(a) (b) (c) (d)
Figure 11.59
On the basis of our observations for the regular tetrahedron and the cube, we shall
determine the other Platonic solids by means of their associated planar graphs. In these
graphs G = (V, E) we have v = |V|; e = |E|; r = the number of planar regions determined
by G; m = the number of edges in the boundary of each region; and n = the number of
regions that meet at each vertex. Thus the constants m, n ≥ 3. Since each edge is used in the
boundary of two regions and there are r regions, each with m edges, it follows that 2e = mr.
Counting the endpoints of the edges, we get 2e. But all these endpoints can also be counted
by considering what happens at each vertex. Since n regions meet at each vertex, n edges
meet there, so there are n endpoints of edges to count at each of the v vertices. This totals
nv endpoints of edges, so 2e = nv. From Euler’s Theorem we have
0 < 2 = v - e + r = 2e/n - e + 2e/m = e (2m - mn + 2n)/(mn).
With e, m, n > 0, we find that
2m - mn + 2n > 0 ⇒ mn - 2m - 2n < 0
⇒ mn - 2m - 2n + 4 < 4 ⇒ (m - 2)(n - 2) < 4.
Since m, n ≥ 3, we have (m - 2), (n - 2) ∈ Z^+, and there are only five cases to consider:
1) (m - 2) = (n - 2) = 1; m = n = 3 (The regular tetrahedron)
2) (m - 2) = 2, (n - 2) = 1; m = 4, n = 3 (The cube)
3) (m - 2) = 1, (n - 2) = 2; m = 3, n = 4 (The octahedron)
4) (m - 2) = 3, (n - 2) = 1; m = 5, n = 3 (The dodecahedron)
5) (m - 2) = 1, (n - 2) = 3; m = 3, n = 5 (The icosahedron)
The planar graphs for cases 3-5 are shown in Fig. 11.60.
Octahedron Dodecahedron Icosahedron
Figure 11.60
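The case analysis of Example 11.22 can be checked by a short enumeration. The sketch below, with the hypothetical function name `platonic_parameters` and an arbitrary search bound `limit`, lists every pair m, n ≥ 3 with (m - 2)(n - 2) < 4 and then solves 2e = mr = nv together with v - e + r = 2 for v, e, and r.

```python
def platonic_parameters(limit=10):
    """Return (m, n, v, e, r) for each pair m, n >= 3 with (m-2)(n-2) < 4.

    From 2 = e(2m - mn + 2n)/(mn) we get e = 2mn/(2m - mn + 2n),
    and then v = 2e/n and r = 2e/m.
    """
    solids = []
    for m in range(3, limit + 1):
        for n in range(3, limit + 1):
            if (m - 2) * (n - 2) < 4:
                denominator = 2 * m - m * n + 2 * n  # positive in these cases
                e = (2 * m * n) // denominator
                v = (2 * e) // n
                r = (2 * e) // m
                assert v - e + r == 2  # Euler's Theorem must hold
                solids.append((m, n, v, e, r))
    return solids


for m, n, v, e, r in platonic_parameters():
    print(f"m={m}, n={n}: v={v}, e={e}, r={r}")
```

Only five lines are printed, one for each Platonic solid; for instance m = 5, n = 3 gives the dodecahedron with v = 20, e = 30, and r = 12.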
The last idea we shall discuss for planar graphs is the notion of a dual graph. This
concept is also valid for planar graphs with loops and for planar multigraphs. To construct
a dual (relative to a particular embedding) for a planar graph or multigraph G with V =
{a, b, c,d, e, f}, place a point (vertex) inside each region, including the infinite region,
determined by the graph, as in Fig. 11.61(a). For each edge shared by two regions, draw
an edge connecting the vertices inside these regions. For an edge that is traversed twice in
the closed walk about the edges of one region, draw a loop at the vertex for this region.
In Fig. 11.61(b), G^d is a dual for the graph G = (V, E). From this example we make the
following observations:
1) An edge in G corresponds with an edge in G^d, and conversely.
2) A vertex of degree 2 in G yields a pair of edges in G^d that connect the same two
vertices. Hence G^d may be a multigraph. (Here vertex e provides the edges {a, e},
{e, f} in G that brought about the two edges connecting v and z in G^d.)
3) Given a loop in G, if the interior of the (finite area) region determined by the loop
contains no other vertex or edge of G, then the loop yields a pendant vertex in G^d.
(It is also true that a pendant vertex in G yields a loop in G^d.)
4) The degree of a vertex in G^d is the number of edges in the boundary of the closed
walk about the region in G that contains that vertex.
(a) G = (V, E) (b) G^d
Figure 11.61
(Why is G^d called a dual of G instead of the dual of G? The Section Exercises will show
that it is possible to have isomorphic graphs G1 and G2 with respective duals G1^d, G2^d that
are not isomorphic.)
In order to examine further the relationship between a graph G and a dual G^d of G, we
introduce the following idea. [Here we recall (from Definition 11.5) that κ(G) counts the
number of components of G.]
Definition 11.20 Let G = (V, E) be an undirected graph or multigraph. A subset E′ of E is called a cut-set
of G if by removing the edges (but not the vertices) in E′ from G, we have κ(G) < κ(G′),
where G′ = (V, E - E′); but when we remove (from E) any proper subset E″ of E′, we
have κ(G) = κ(G″), for G″ = (V, E - E″).
EXAMPLE 11.23 For a given connected graph, a cut-set is a minimal disconnecting set of edges. In the graph
in Fig. 11.62(a), note that each of the sets {{a, b}, {a, c}}, {{a, b}, {c, d}}, {{e, h}, {f, h},
{g, h}}, and {{d, f}} is a cut-set. For the graph in part (b) of the figure, the edge set {{n, p},
{r, p}, {r, s}} is a cut-set. Note that the edges in this cut-set are not all incident to some
single vertex. Here the cut-set separates the vertices m, n, r from the vertices p, s, t. The
edge set {{s, t}} is also a cut-set for this graph: the removal of the edge {s, t} from this
connected graph results in a subgraph with two components, one of which is the isolated
vertex t.
Figure 11.62
Whenever a cut-set for a connected graph consists of only one edge, that edge is called
a bridge for the graph. For the graph in Fig. 11.62(a), the edge {d, f} is the only bridge;
the edge {s, t} is the only bridge in part (b) of the figure.
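Definition 11.20 translates directly into a brute-force test. The sketch below is written for small loop-free graphs only (not multigraphs); the function names are our own, and the four-vertex graph at the end is a hypothetical example, not one of the graphs in Fig. 11.62.

```python
from itertools import combinations


def count_components(vertices, edges):
    """Return kappa(G), the number of components, by depth-first search."""
    adjacent = {v: set() for v in vertices}
    for a, b in edges:
        adjacent[a].add(b)
        adjacent[b].add(a)
    seen, count = set(), 0
    for start in vertices:
        if start in seen:
            continue
        count += 1
        seen.add(start)
        stack = [start]
        while stack:
            x = stack.pop()
            for y in adjacent[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
    return count


def is_cut_set(vertices, edges, candidate):
    """Check Definition 11.20: removing `candidate` raises the component
    count, but removing any proper subset of it does not."""
    base = count_components(vertices, edges)
    remaining = [e for e in edges if e not in candidate]
    if count_components(vertices, remaining) <= base:
        return False
    for size in range(len(candidate)):
        for subset in combinations(candidate, size):
            rest = [e for e in edges if e not in subset]
            if count_components(vertices, rest) > base:
                return False  # a proper subset already disconnects
    return True


# Hypothetical graph: a triangle a, b, c with a pendant edge {c, d}.
V = ["a", "b", "c", "d"]
E = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
print(is_cut_set(V, E, [("a", "b"), ("a", "c")]))  # True: isolates a
print([e for e in E if is_cut_set(V, E, [e])])     # the bridges: [('c', 'd')]
```

A bridge is simply a cut-set of size one, so the last line finds the bridges by testing each edge on its own.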
We return now to the graphs in Fig. 11.61, redrawing them as shown in Fig. 11.63 in
order to emphasize the correspondence between their edges.
Figure 11.63
Here the edges in G are labeled 1, 2, ..., 10. The numbering scheme for G^d is obtained
as follows: The edge labeled 4*, for example, connects the vertices w and z in G^d. We drew
this edge because edge 4 in G was a common edge of the regions containing these vertices.
Likewise, edge 7 is common to the region containing x and the infinite region containing
v. Hence we label the edge in G^d that connects x and v with 7*.
In graph G the set of edges labeled 6, 7, 8 constitutes a cycle. What about the edges
labeled 6*, 7*, 8* in G^d? If they are removed from G^d, then vertex x becomes isolated
and G^d is disconnected. Since we cannot disconnect G^d by removing any proper subset
of {6*, 7*, 8*}, these edges form a cut-set in G^d. In similar fashion, edges 2, 4, 10 form a
cut-set in G, whereas in G^d the edges 2*, 4*, 10* yield a cycle.
We also have the two-edge cut-set {3, 10} in G, and we find that the edges 3*, 10* provide
a two-edge circuit in G^d. Another observation: The one-edge cut-set {1*} in G^d comes about
from edge 1, a loop in G.
In general, there is a one-to-one correspondence between the following sets of edges in
a planar graph G and a dual G^d of G.
1) Cycles (cut-sets) of n (≥ 3) edges in G correspond with cut-sets (cycles) of n edges
in G^d.
2) A loop in G corresponds with a one-edge cut-set in G^d.
3) A one-edge cut-set in G corresponds with a loop in G^d.
4) A two-edge cut-set in G corresponds with a two-edge circuit in G^d.
5) If G is a planar multigraph, then each two-edge circuit in G determines a two-edge
cut-set of G^d.
All these theoretical observations are interesting, but let us stop here and see how we
might apply the idea of a dual.
EXAMPLE 11.24 If we consider the five finite regions in Fig. 11.64(a) as countries on a map, and we construct
the subgraph (because we do not use the infinite region) of a dual as shown in part (b), then
we find the following relationship.
Suppose we are confronted with the “mapmaker’s problem” whereby we want to color
the five regions of the map in part (a) so that two countries that share a common border are
colored with different colors. This type of coloring can be translated into the dual notion of
coloring the vertices in part (b) so that adjacent vertices are colored with different colors.
(Such coloring problems will be examined further in Section 11.6.)
(a) (b)
Figure 11.64
The final result for this section provides us with an application for an electrical network.
This material is based on Example 8.6 on pp. 227-230 of the text by C. L. Liu [23].
EXAMPLE 11.25 In Fig. 11.65 we see an electrical network with nine contacts (switches) that control the
excitation of a light. We want to construct a dual network where a second light will go on
(off) whenever the light in our given network is off (on).
The contacts (switches) are of two types: normally open (as shown in Fig. 11.65) and
normally closed. We use a and a’ as in Fig. 11.66 to represent the normally open and
normally closed contacts, respectively.
Figure 11.65
Figure 11.66
In Fig. 11.67(a) a one-terminal-pair-graph represents the network in Fig. 11.65. Here
the special pair of vertices is labeled 1 and 2. These vertices are called the terminals of
the graph. Also each edge is labeled according to its corresponding contact in Fig. 11.65.
Figure 11.67 (panels (a) through (d))
A one-terminal-pair-graph G is called a planar-one-terminal-pair-graph if G is planar,
and the resulting graph is also planar when an edge connecting the terminals is added to G.
Figure 11.67(b) shows this situation. Constructing a dual of part (b), we obtain the graph in
part (c) of the figure. Removal of the dotted edge results in the terminals 1*, 2* for this dual,
which is a one-terminal-pair-graph. This graph provides the dual network in Fig. 11.67(d).
We make two observations in closing.
1) When the contacts at a, b, c are closed in the original network (Fig. 11.65), the light
is on. In Fig. 11.67(b) the edges a, b, c, j form a cycle that includes the terminals.
In part (c) of the figure, the edges a*, b*, c*, j* form a cut-set disconnecting the
terminals 1*, 2*. Finally, with a′, b′, c′ open in part (d) of the figure, no current gets
past the first level of contacts (switches) and the light is off.
2) In like manner, the edges c, d, e, g, j form a cut-set that separates the terminals in
Fig. 11.67(b). (When the contacts at c, d, e, g are open in Fig. 11.65, the light is off.)
Figure 11.67(c) shows how c*, d*, e*, g*, j* form a cycle that includes 1*, 2*. If c′,
d′, e′, g′ are closed in part (d), current flows through the dual network and the light
is on.
EXERCISES 11.4
1. Verify that the conclusion in Example 11.16 is unchanged
if Fig. 11.48(b) has edge {a, c} drawn in the exterior of the
pentagon.
2. Show that when any edge is removed from K5, the resulting
subgraph is planar. Is this true for the graph K3,3?
3. a) How many vertices and how many edges are there in
the complete bipartite graphs K4,7, K7,11, and K_{m,n}, where
m, n ∈ Z^+?
b) If the graph K_{m,12} has 72 edges, what is m?
4. Prove that any subgraph of a bipartite graph is bipartite.
5. For each graph in Fig. 11.68 determine whether or not the
graph is bipartite.
6. Let n ∈ Z^+ with n ≥ 4. How many subgraphs of K_n are
isomorphic to the complete bipartite graph K1,3?
7. Let m, n ∈ Z^+ with m ≥ n ≥ 2. (a) Determine how many
distinct cycles of length 4 there are in K_{m,n}. (b) How many
different paths of length 2 are there in K_{m,n}? (c) How many
different paths of length 3 are there in K_{m,n}?
8. What is the length of a longest path in each of the following
graphs?
a) K1,4 b) K3,7 c) K7,12
d) K_{m,n}, where m, n ∈ Z^+ with m < n.
9. How many paths of longest length are there in each of the
following graphs? (Remember that a path such as v1 → v2 → v3
is considered to be the same as the path v3 → v2 → v1.)
a) K1,4 b) K3,7 c) K7,12
d) K_{m,n}, where m, n ∈ Z^+ with m < n.
10. Can a bipartite graph contain a cycle of odd length? Explain.
11. Let G = (V, E) be a loop-free connected graph with |V| =
v. If |E| > (v/2)^2, prove that G cannot be bipartite.
12. a) Find all the nonisomorphic complete bipartite graphs
G = (V, E), where |V| = 6.
b) How many nonisomorphic complete bipartite graphs
G = (V, E) satisfy |V| = n ≥ 2?
13. a) Let X = {1, 2, 3, 4, 5}. Construct the loop-free undirected
graph G = (V, E) as follows:
• (V): Let each two-element subset of X represent a vertex
in G.
• (E): If v1, v2 ∈ V correspond to subsets {a, b} and
{c, d}, respectively, of X, then draw the edge {v1, v2}
in G if {a, b} ∩ {c, d} = ∅.
b) To what graph is G isomorphic?
14. Determine which of the graphs in Fig. 11.69 are planar. If
a graph is planar, redraw it with no edges overlapping. If it is
nonplanar, find a subgraph homeomorphic to either K5 or K3,3.
15. Let m, n ∈ Z^+ with m < n. Under what condition(s) on
m, n will every edge in K_{m,n} be in exactly one of two isomorphic
subgraphs of K_{m,n}?
Figure 11.68
Figure 11.69 (panels (a) through (f))
16. Prove that the Petersen graph is isomorphic to the graph in
Fig. 11.70.
Figure 11.70
17. Determine the number of vertices, the number of edges, and
the number of regions for each of the planar graphs in Fig. 11.71.
Then show that your answers satisfy Euler’s Theorem for connected
planar graphs.
Figure 11.71 (panels (a) and (b))
18. Let G = (V, E) be an undirected connected loop-free
graph. Suppose further that G is planar and determines 53 regions.
If, for some planar embedding of G, each region has at
least five edges in its boundary, prove that |V| ≥ 82.
19. Let G = (V, E) be a loop-free connected 4-regular planar
graph. If |E| = 16, how many regions are there in a planar
depiction of G?
20. Suppose that G = (V, E) is a loop-free planar graph with
|V| = v, |E| = e, and κ(G) = the number of components of G.
(a) State and prove an extension of Euler’s Theorem for such
a graph. (b) Prove that Corollary 11.3 remains valid if G is
loop-free and planar but not connected.
21. Prove that every loop-free connected planar graph has a
vertex v with deg(v) < 6.
22. a) Let G = (V, E) be a loop-free connected graph with
|V| ≥ 11. Prove that either G or its complement must be
nonplanar.
b) The result in part (a) is actually true for |V| ≥ 9, but the
proof for |V| = 9, 10, is much harder. Find a counterexample
to part (a) for |V| = 8.
23. a) Let k ∈ Z^+, k ≥ 3. If G = (V, E) is a connected planar
graph with |V| = v, |E| = e, and each cycle of length at
least k, prove that e ≤ (k/(k - 2))(v - 2).
b) What is the minimal cycle length in K3,3?
c) Use parts (a) and (b) to conclude that K3,3 is nonplanar.
d) Use part (a) to prove that the Petersen graph is nonplanar.
24. a) Find a dual graph for each of the two planar graphs and
the one planar multigraph in Fig. 11.72.
b) Does the dual for the multigraph in part (c) have any
pendant vertices? If not, does this contradict the third
observation made prior to Definition 11.20?
Figure 11.72 (panels (a), (b), and (c))
25. a) Find duals for the planar graphs that correspond with
the five Platonic solids.
b) Find the dual of the graph W_n, the wheel with n spokes
(as defined in Exercise 14 of Section 11.1).
26. a) Show that the graphs in Fig. 11.73 are isomorphic.
b) Draw a dual for each graph.
c) Show that the duals obtained in part (b) are not isomorphic.
d) Two graphs G and H are called 2-isomorphic if one can
be obtained from the other by applying either or both of the
following procedures a finite number of times.
1) In Fig. 11.74 we split a vertex, namely r, of G and
obtain the graph H, which is disconnected.
2) In Fig. 11.75 we obtain graph (d) from graph (a) by
i) first splitting the two distinct vertices j and
q, disconnecting the graph,
ii) then reflecting one subgraph about the horizontal
axis, and
iii) then identifying vertex j (q) in one subgraph
with vertex q (j) in the other subgraph.
Prove that the dual graphs obtained in part (c) are 2-isomorphic.
e) For the cut-set {{a, b}, {c, b}, {d, b}} in part (a) of
Fig. 11.73, find the corresponding cycle in its dual. In the
dual of the graph in Fig. 11.73(b), find the cut-set that
corresponds with the cycle {w, z}, {z, x}, {x, y}, {y, w} in the
given graph.
Figure 11.73 (panels (a) and (b))
Figure 11.74 (graphs G and H)
Figure 11.75 (panels (a) through (d))
27. Find the dual network for the electrical network shown in
Fig. 11.76.
Figure 11.76
28. Let G = (V, E) be a loop-free connected planar graph. If
G is isomorphic to its dual and |V| = n, what is |E|?
29. Let G1, G2 be two loop-free connected undirected graphs.
If G1, G2 are homeomorphic, prove that (a) G1, G2 have the
same number of vertices of odd degree; (b) G1 has an Euler
trail if and only if G2 has an Euler trail; and (c) G1 has an Euler
circuit if and only if G2 has an Euler circuit.
11.5
Hamilton Paths and Cycles
In 1859 the Irish mathematician Sir William Rowan Hamilton (1805-1865) developed a
game that he sold to a Dublin toy manufacturer. The game consisted of a wooden regular
dodecahedron with the 20 corner points (vertices) labeled with the names of prominent
cities. The objective of the game was to find a cycle along the edges of the solid so that each
city was on the cycle (exactly once). Figure 11.77 is the planar graph for this Platonic solid:
such a cycle is designated by the darkened edges. This illustration leads us to the following
definition.
Figure 11.77
Definition 11.21 If G = (V, E) is a graph or multigraph with |V| ≥ 3, we say that G has a Hamilton cycle
if there is a cycle in G that contains every vertex in V. A Hamilton path is a path (and not
a cycle) in G that contains each vertex.
Given a graph with a Hamilton cycle, we find that the deletion of any edge in the cycle
results in a Hamilton path. It is possible, however, for a graph to have a Hamilton path
without having a Hamilton cycle.
It may seem that the existence of a Hamilton cycle (path) and the existence of an Euler
circuit (trail) for a graph are similar problems. The Hamilton cycle (path) is designed to
visit each vertex in a graph only once; the Euler circuit (trail) traverses the graph so that
each edge is traveled exactly once. Unfortunately, there is no helpful connection between
the two ideas, and unlike the situation for Euler circuits (trails), there do not exist necessary
and sufficient conditions on a graph G that guarantee the existence of a Hamilton cycle
(path). If a graph has a Hamilton cycle, then it will at least be connected. Many theorems
exist that establish either necessary or sufficient conditions for a connected graph to have a
Hamilton cycle or path. We shall investigate several of these results later. When confronted
with particular graphs, however, we shall often resort to trial and error, with a few helpful
observations.
EXAMPLE 11.26 Referring back to the hypercubes in Fig. 11.35 we find in Q2 the cycle
00 → 10 → 11 → 01 → 00
and in Q3 the cycle
000 → 100 → 110 → 010 → 011 → 111 → 101 → 001 → 000.
Hence Q2 and Q3 have Hamilton cycles (and paths). In fact, for all n ≥ 2, we find that Q_n
has a Hamilton cycle. (The reader is asked to establish this in the Section Exercises.) [Note,
in addition, that the listings: 00, 10, 11, 01 and 000, 100, 110, 010, 011, 111, 101, 001 are
examples of Gray codes (which were introduced in Example 3.9).]
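One way to see that Q_n has a Hamilton cycle for every n ≥ 2 is to generate a Gray code recursively. The sketch below uses the standard reflected construction, which is an assumption on our part: it yields a valid Hamilton cycle, though not necessarily the same listing as the one given above. The function names are our own.

```python
def gray_cycle(n):
    """List the 2**n vertices of Q_n in reflected Gray-code order (n >= 2)."""
    if n < 2:
        raise ValueError("a Hamilton cycle needs n >= 2")
    codes = ["0", "1"]
    for _ in range(n - 1):
        # Prefix the list with 0, then the reversed list with 1.
        codes = ["0" + c for c in codes] + ["1" + c for c in reversed(codes)]
    return codes


def is_hamilton_cycle_in_hypercube(codes, n):
    """Check that `codes` visits every vertex of Q_n exactly once and that
    cyclically consecutive labels differ in exactly one position."""
    if len(codes) != 2 ** n or len(set(codes)) != 2 ** n:
        return False
    for i, current in enumerate(codes):
        following = codes[(i + 1) % len(codes)]
        if sum(x != y for x, y in zip(current, following)) != 1:
            return False
    return True


print(gray_cycle(2))  # ['00', '01', '11', '10']
print(is_hamilton_cycle_in_hypercube(gray_cycle(3), 3))  # True
# The listing from the example is also accepted by the checker:
book_q3 = ["000", "100", "110", "010", "011", "111", "101", "001"]
print(is_hamilton_cycle_in_hypercube(book_q3, 3))  # True
```

The checker treats the list as cyclic, so the final label must also differ from the first in exactly one position.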
EXAMPLE 11.27 If G is the graph in Fig. 11.78, the edges {a, b}, {b, c}, {c, f}, {f, e}, {e, d}, {d, g}, {g, h},
{h, i} yield a Hamilton path for G. But does G have a Hamilton cycle?
Figure 11.78
Since G has nine vertices, if there is a Hamilton cycle in G it must contain nine edges.
Let us start at vertex b and try to build a Hamilton cycle. Because of the symmetry in the
graph, it doesn’t matter whether we go from b to c or to a. We’ll go to c. At c we can go
either to f or to i. Using symmetry again, we go to f. Then we delete edge {c, i} from
further consideration because we cannot return to vertex c. In order to include vertex i in
our cycle, we must now go from f to i (to h to g). With edges {c, f} and {f, i} in the
cycle, we cannot have edge {e, f} in the cycle. [Otherwise, in the cycle we would have
deg(f) > 2.] But then once we get to e we are stuck. Hence there is no Hamilton cycle for
the graph.
Example 11.27 indicates a few helpful hints for trying to find a Hamilton cycle in a graph
G =(V, E).
1) If G has a Hamilton cycle, then for all v ∈ V, deg(v) ≥ 2.
2) If a ∈ V and deg(a) = 2, then the two edges incident with vertex a must appear in
every Hamilton cycle for G.
3) If a ∈ V and deg(a) > 2, then as we try to build a Hamilton cycle, once we pass
through vertex a, any unused edges incident with a are deleted from further consideration.
4) In building a Hamilton cycle for G, we cannot obtain a cycle for a subgraph of G
unless it contains all the vertices of G.
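The trial-and-error search guided by these hints can be organized as a backtracking procedure. The sketch below is one possible arrangement, not a method taken from the text: hint 1 is used as an immediate rejection test, a visited set plays the role of hint 3, and the cycle is closed only when every vertex is on the path (hint 4). The function names are our own, and the 3 × 3 grid used for illustration is a hypothetical graph, not the graph of Fig. 11.78.

```python
def hamilton_cycle(adjacent):
    """Return a Hamilton cycle as a list of vertices (first = last),
    or None if none exists. `adjacent` maps each vertex to a set of
    neighbours; vertices must be sortable so the search is deterministic."""
    vertices = sorted(adjacent)
    n = len(vertices)
    # Hint 1: every vertex on a Hamilton cycle has degree at least 2.
    if n < 3 or any(len(adjacent[v]) < 2 for v in vertices):
        return None
    start = vertices[0]
    path, used = [start], {start}

    def extend():
        if len(path) == n:
            # Hint 4: close the cycle only when all vertices are on it.
            return start in adjacent[path[-1]]
        for w in sorted(adjacent[path[-1]]):
            if w not in used:  # hint 3: a vertex already passed is not revisited
                used.add(w)
                path.append(w)
                if extend():
                    return True
                used.discard(w)
                path.pop()
        return False

    return path + [start] if extend() else None


def grid_3x3():
    """Hypothetical example: vertices a..i arranged in a 3 x 3 grid."""
    names = "abcdefghi"
    adjacent = {v: set() for v in names}
    for row in range(3):
        for col in range(3):
            v = names[3 * row + col]
            if col < 2:
                w = names[3 * row + col + 1]
                adjacent[v].add(w)
                adjacent[w].add(v)
            if row < 2:
                w = names[3 * (row + 1) + col]
                adjacent[v].add(w)
                adjacent[w].add(v)
    return adjacent


print(hamilton_cycle(grid_3x3()))  # None: this grid has no Hamilton cycle
k4 = {v: {w for w in "abcd" if w != v} for v in "abcd"}
print(hamilton_cycle(k4))          # ['a', 'b', 'c', 'd', 'a']
```

The search is exhaustive, so its running time can grow very quickly with the number of vertices; it is meant for graphs of the size treated by hand in this section.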
Our next example provides an interesting technique for showing that a special type of
graph has no Hamilton path.
EXAMPLE 11.28 In Fig. 11.79(a) we have a connected graph G, and we wish to know whether G contains
a Hamilton path. Part (b) of the figure provides the same graph with a set of labels x, y.
This labeling is accomplished as follows: First we label vertex a with the letter x. Those
vertices adjacent to a (namely, b, c, and d) are then labeled with the letter y. Then we label
the unlabeled vertices adjacent to b, c, or d with x. This results in the label x on the vertices
e, g, and i. Finally, we label the unlabeled vertices adjacent to e, g, or i with the label y. At
this point, all the vertices in G are labeled. Now, since |V| = 10, if G is to have a Hamilton
path there must be an alternating sequence of five x’s and five y’s. Only four vertices are
labeled with x, so this is impossible. Hence G has no Hamilton path (or cycle).
Figure 11.79
But why does this argument work here? In part (c) of Fig. 11.79 we have redrawn the
given graph, and we see that it is bipartite. From Exercise 10 in the previous section we
know that a bipartite graph cannot have a cycle of odd length. It is also true that if a graph
has no cycle of odd length, then it is bipartite. (The proof is requested of the reader in
Exercise 9 of this section.) Consequently, whenever a connected graph has no odd cycle
(and is bipartite), the method described above may be helpful in determining when the graph
does not have a Hamilton path. (Exercise 10 in this section examines this idea further.)
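The labeling method can be sketched as a breadth-first search. In the code below the function names are our own, the input is assumed to be a connected graph given as a dictionary of neighbour sets, and the complete bipartite graph K4,6 stands in as a hypothetical example with four x’s and six y’s (it is not the graph of Fig. 11.79). A Hamilton path in a bipartite graph must alternate between the two labels, so the two counts can differ by at most 1.

```python
from collections import deque


def two_label(adjacent):
    """Label a connected graph with 'x' and 'y' so that adjacent vertices
    get different labels. Returns None if an odd cycle makes this impossible."""
    start = next(iter(adjacent))
    label = {start: "x"}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adjacent[u]:
            if w not in label:
                label[w] = "y" if label[u] == "x" else "x"
                queue.append(w)
            elif label[w] == label[u]:
                return None  # two adjacent vertices share a label
    return label


def labeling_rules_out_hamilton_path(adjacent):
    """True when the graph is bipartite and the label counts differ by
    more than 1, so no alternating sequence through all vertices exists.
    False means only that this test is inconclusive."""
    label = two_label(adjacent)
    if label is None:
        return False  # not bipartite: the method does not apply
    x_count = sum(1 for value in label.values() if value == "x")
    y_count = len(label) - x_count
    return abs(x_count - y_count) > 1


def complete_bipartite(m, n):
    """Hypothetical helper: K_{m,n} on vertices ('x', i) and ('y', j)."""
    left = [("x", i) for i in range(m)]
    right = [("y", j) for j in range(n)]
    adjacent = {v: set(right) for v in left}
    adjacent.update({v: set(left) for v in right})
    return adjacent


print(labeling_rules_out_hamilton_path(complete_bipartite(4, 6)))  # True
print(labeling_rules_out_hamilton_path(complete_bipartite(3, 4)))  # False
```

For a Hamilton cycle the same reasoning is stricter: the two label counts would have to be equal.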
Our next example provides an application that calls for Hamilton cycles in a complete
graph.
EXAMPLE 11.29 At Professor Alfred’s science camp, 17 students have lunch together each day at a circular
table. They are trying to get to know one another better, so they make an effort to sit next to
two different colleagues each afternoon. For how many afternoons can they do this? How
can they arrange themselves on these occasions?
To solve this problem we consider the graph K_n, where n ≥ 3 and is odd. This graph
has n vertices (one for each student) and C(n, 2) = n(n - 1)/2 edges. A Hamilton cycle in K_n
corresponds to a seating arrangement. Each of these cycles has n edges, so we can have at
most (1/n) C(n, 2) = (n - 1)/2 Hamilton cycles with no two having an edge in common.
Consider the circle in Fig. 11.80 and the subgraph of K_n consisting of the n vertices and
the n edges {1, 2}, {2, 3}, ..., {n - 1, n}, {n, 1}. Keep the vertices on the circumference
fixed and rotate this Hamilton cycle clockwise through the angle [1/(n - 1)](2π). This
gives us the Hamilton cycle (Fig. 11.81) made up of edges {1, 3}, {3, 5}, {5, 2}, {2, 7}, ...,
{n, n - 3}, {n - 3, n - 1}, {n - 1, 1}. This Hamilton cycle has no edge in common with
the first cycle. When n ≥ 7 and we continue to rotate the cycle in Fig. 11.80 in this way
through angles [k/(n - 1)](2π), where 2 ≤ k ≤ (n - 3)/2, we obtain a total of (n - 1)/2
Hamilton cycles, no two of which have an edge in common.
Figure 11.80 Figure 11.81
Therefore the 17 students at the science camp can dine for [(17 — 1)/2] = 8 days before
some student will have to sit next to another student for a second time. Using Fig. 11.80
with n = 17, we can obtain eight such possible arrangements.
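The eight seating arrangements can be generated mechanically. The sketch below uses a standard rotating zigzag construction in which n - 1 vertices are placed on a circle and one extra vertex (called `hub` here) is joined to the two ends of each zigzag. This is in the spirit of the rotation described above, but the vertex numbering is our own assumption and need not match the labels in Figs. 11.80 and 11.81; the function names are also our own.

```python
def edge_disjoint_hamilton_cycles(n):
    """For odd n >= 3, return (n - 1)/2 Hamilton cycles of K_n, no two of
    which share an edge. Vertices are 0, 1, ..., n - 1."""
    if n < 3 or n % 2 == 0:
        raise ValueError("n must be odd and at least 3")
    m = (n - 1) // 2
    ring = 2 * m   # vertices 0 .. 2m - 1 are placed around a circle
    hub = ring     # one extra vertex, numbered 2m
    cycles = []
    for i in range(m):
        # Zigzag across the circle: i, i-1, i+1, i-2, i+2, ..., i-m.
        path = [i]
        for step in range(1, m):
            path.append((i - step) % ring)
            path.append((i + step) % ring)
        path.append((i - m) % ring)
        cycles.append([hub] + path)  # the hub closes the zigzag into a cycle
    return cycles


def cycle_edges(cycle):
    """The set of edges used when `cycle` is read as a closed walk."""
    size = len(cycle)
    return {frozenset((cycle[k], cycle[(k + 1) % size])) for k in range(size)}


seatings = edge_disjoint_hamilton_cycles(17)
print(len(seatings))  # 8 afternoons
all_edges = set().union(*(cycle_edges(c) for c in seatings))
print(len(all_edges))  # 136 = C(17, 2): every pair sits together exactly once
```

Since 8 cycles of 17 edges each give 136 distinct edges, the cycles are pairwise edge-disjoint and together use every edge of K17.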
We turn now to some further results on Hamilton paths and cycles. Our first result was
established in 1934 by L. Redei.
THEOREM 11.7 Let K_n^* be a complete directed graph; that is, K_n^* has n vertices and for each distinct pair
x, y of vertices, exactly one of the edges (x, y) or (y, x) is in K_n^*. Such a graph (called a
tournament) always contains a (directed) Hamilton path.
Proof: Let m ≥ 2 with p_m a path containing the m - 1 edges (v1, v2), (v2, v3), ...,
(v_{m-1}, v_m). If m = n, we’re finished. If not, let v be a vertex that doesn’t appear in p_m.
If (v, v1) is an edge in K_n^*, we can extend p_m by adjoining this edge. If not, then (v1, v)
must be an edge. Now suppose that (v, v2) is in the graph. Then we have the larger path:
(v1, v), (v, v2), (v2, v3), ..., (v_{m-1}, v_m). If (v, v2) is not an edge in K_n^*, then (v2, v) must
be. As we continue this process there are only two possibilities: (a) For some 1 ≤ k ≤ m - 1
the edges (v_k, v), (v, v_{k+1}) are in K_n^* and we replace (v_k, v_{k+1}) with this pair of edges; or
(b) (v_m, v) is in K_n^* and we add this edge to p_m. Either case results in a path p_{m+1} that
includes m + 1 vertices and has m edges. This process can be repeated until we have such
a path p_n on n vertices.
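The proof is constructive, and the insertion step it describes can be sketched directly. In the code below the function name and the `beats` callback are our own hypothetical choices: `beats(x, y)` is assumed to return True exactly when (x, y) is an edge of the tournament.

```python
def tournament_hamilton_path(players, beats):
    """Build a directed Hamilton path by inserting one vertex at a time.

    Each new vertex v is placed just before the first path vertex it beats;
    every earlier path vertex then beats v, so the path stays directed.
    If v beats no vertex on the path, v is appended at the end.
    """
    path = []
    for v in players:
        for position, u in enumerate(path):
            if beats(v, u):
                path.insert(position, v)
                break
        else:
            path.append(v)
    return path


# A tournament with a cycle: a beats b, b beats c, c beats a.
results = {("a", "b"), ("b", "c"), ("c", "a")}
ranking = tournament_hamilton_path("abc", lambda x, y: (x, y) in results)
print(ranking)  # ['c', 'a', 'b']: c beats a, and a beats b
```

The listing is not a ranking in which each player has beaten everyone below; it guarantees only that each player has beaten the next one on the list.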
EXAMPLE 11.30 In a round-robin tournament each player plays every other player exactly once. We want to
somehow rank the players according to the results of the tournament. Since we could have
players a, b, and c where a beats b and b beats c, but c beats a, it is not always possible
to have a ranking where a player in a certain position has beaten all of the opponents in
later positions. Representing the players by vertices, construct a directed graph G on these
vertices by drawing edge (x, y) if x beats y. Then by Theorem 11.7, it is possible to list the
players such that each has beaten the next player on the list.
THEOREM 11.8 Let $G = (V, E)$ be a loop-free graph with $|V| = n \ge 2$. If $\deg(x) + \deg(y) \ge n - 1$ for all
$x, y \in V$, $x \ne y$, then $G$ has a Hamilton path.
Proof: First we prove that $G$ is connected. If not, let $C_1$, $C_2$ be two components of $G$ and
let $x, y \in V$ with $x$ a vertex in $C_1$ and $y$ a vertex in $C_2$. Let $C_i$ have $n_i$ vertices, $i = 1, 2$.
Then $\deg(x) \le n_1 - 1$, $\deg(y) \le n_2 - 1$, and $\deg(x) + \deg(y) \le (n_1 + n_2) - 2 \le n - 2$,
contradicting the condition given in the theorem. Consequently, $G$ is connected.
Now we build a Hamilton path for $G$. For $m \ge 2$, let $p_m$ be the path $\{v_1, v_2\}, \{v_2, v_3\},$
$\ldots, \{v_{m-1}, v_m\}$ of length $m - 1$. (We relabel vertices if necessary.) Such a path exists,
because for $m = 2$ all that is needed is one edge. If $v_1$ is adjacent to any vertex $v$ other
than $v_2, v_3, \ldots, v_m$, we add the edge $\{v, v_1\}$ to $p_m$ to get $p_{m+1}$. The same type of procedure
is carried out if $v_m$ is adjacent to a vertex other than $v_1, v_2, \ldots, v_{m-1}$. If we
are able to enlarge $p_m$ to $p_n$ in this way, we get a Hamilton path. Otherwise the path
$p_m$: $\{v_1, v_2\}, \ldots, \{v_{m-1}, v_m\}$ has $v_1$, $v_m$ adjacent only to vertices in $p_m$, and $m < n$. When
this happens we claim that $G$ contains a cycle on these vertices. If $v_1$ and $v_m$ are adjacent,
then the cycle is $\{v_1, v_2\}, \{v_2, v_3\}, \ldots, \{v_{m-1}, v_m\}, \{v_m, v_1\}$. If $v_1$ and $v_m$ are not
adjacent, then $v_1$ is adjacent to a subset $S$ of the vertices in $\{v_2, v_3, \ldots, v_{m-1}\}$. If there
is a vertex $v_t \in S$ such that $v_m$ is adjacent to $v_{t-1}$, then we can get the cycle by adding
$\{v_1, v_t\}, \{v_{t-1}, v_m\}$ to $p_m$ and deleting $\{v_{t-1}, v_t\}$ as shown in Fig. 11.82. If not, let $|S| =$
$k < m - 1$. Then $\deg(v_1) = k$ and $\deg(v_m) \le (m - 1) - k$, and we have the contradiction
$\deg(v_1) + \deg(v_m) \le m - 1 < n - 1$. Hence there is a cycle connecting $v_1, v_2, \ldots, v_m$.
Figure 11.82 Figure 11.83
Now consider a vertex $v \in V$ that is not found on this cycle. The graph $G$ is connected,
so there is a path from $v$ to a first vertex $v_r$ in the cycle, as shown in Fig. 11.83(a). Removing
the edge $\{v_{r-1}, v_r\}$ (or $\{v_1, v_t\}$ if $r = t$), we get the path (longer than the original $p_m$) shown
in Fig. 11.83(b). Repeating this process (applied to $p_m$) for the path in Fig. 11.83(b), we
continue to increase the length of the path until it includes every vertex of $G$.
COROLLARY 11.4 Let $G = (V, E)$ be a loop-free graph with $n$ ($\ge 2$) vertices. If $\deg(v) \ge (n - 1)/2$ for all
$v \in V$, then $G$ has a Hamilton path.
Proof: The proof is left as an exercise for the reader.
Our last theorem for this section provides a sufficient condition for the existence of a
Hamilton cycle in a loop-free graph. This was first proved by Oystein Ore in 1960.
THEOREM 11.9 Let $G = (V, E)$ be a loop-free undirected graph with $|V| = n \ge 3$. If $\deg(x) + \deg(y) \ge n$
for all nonadjacent $x, y \in V$, then $G$ contains a Hamilton cycle.
Proof: Assume that $G$ does not contain a Hamilton cycle. We add edges to $G$ until we arrive
at a subgraph $H$ of $K_n$, where $H$ has no Hamilton cycle, but, for any edge $e$ (of $K_n$) not in
$H$, $H + e$ does have a Hamilton cycle.
Since $H \ne K_n$, there are vertices $a, b \in V$, where $\{a, b\}$ is not an edge of $H$ but $H +
\{a, b\}$ has a Hamilton cycle $C$. The graph $H$ has no such cycle, so the edge $\{a, b\}$ is a part
of cycle $C$. Let us list the vertices of $H$ (and $G$) on cycle $C$ as follows:
$$a\,(= v_1) \to b\,(= v_2) \to v_3 \to v_4 \to \cdots \to v_{n-1} \to v_n.$$
For each $3 \le i \le n$, if the edge $\{b, v_i\}$ is in the graph $H$, then we claim that the edge $\{a, v_{i-1}\}$
cannot be an edge of $H$. For if both of these edges are in $H$, for some $3 \le i \le n$, then we
get the Hamilton cycle
$$b \to v_i \to v_{i+1} \to \cdots \to v_{n-1} \to v_n \to a \to v_{i-1} \to v_{i-2} \to \cdots \to v_4 \to v_3 \to b$$
for the graph $H$ (which has no Hamilton cycle). Therefore, for each $3 \le i \le n$, at most one
of the edges $\{b, v_i\}$, $\{a, v_{i-1}\}$ is in $H$. Consequently,
$$\deg_H(a) + \deg_H(b) < n,$$
where $\deg_H(v)$ denotes the degree of vertex $v$ in graph $H$. For all $v \in V$, $\deg_H(v) \ge
\deg_G(v) = \deg(v)$, so we have nonadjacent (in $G$) vertices $a, b$, where
$$\deg(a) + \deg(b) < n.$$
This contradicts the hypothesis that $\deg(x) + \deg(y) \ge n$ for all nonadjacent $x, y \in V$,
so we reject our assumption and find that $G$ contains a Hamilton cycle.
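The hypothesis of Theorem 11.9 is easy to test mechanically, and for small graphs the conclusion can be confirmed by exhaustive search. The sketch below is only an illustration: the helper names are hypothetical, vertices are assumed to be numbered $0, \ldots, n-1$ with $n \ge 3$, and the search tries all $(n-1)!$ orderings, so it is practical only for small $n$.

```python
from itertools import combinations, permutations


def adjacency(n, edges):
    """Adjacency sets for a loop-free undirected graph on vertices 0..n-1."""
    adj = {v: set() for v in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def satisfies_ore(n, edges):
    """True when deg(x) + deg(y) >= n for every nonadjacent pair x, y."""
    adj = adjacency(n, edges)
    return all(len(adj[x]) + len(adj[y]) >= n
               for x, y in combinations(range(n), 2)
               if y not in adj[x])


def find_hamilton_cycle(n, edges):
    """Exhaustive search (n >= 3): a Hamilton cycle as a vertex list, or None."""
    adj = adjacency(n, edges)
    for rest in permutations(range(1, n)):
        cycle = (0,) + rest
        if all(cycle[(k + 1) % n] in adj[cycle[k]] for k in range(n)):
            return list(cycle)
    return None
```

For instance, $K_{3,3}$ satisfies the condition and has a Hamilton cycle, while the cycle on five vertices has a Hamilton cycle although it fails the condition, so the condition is sufficient but not necessary.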
Now we shall obtain the following two results from Theorem 11.9. Each will give us a
sufficient condition for a loop-free undirected graph G = (V, E) to have a Hamilton cycle.
The first result is similar to Corollary 11.4 and is concerned with the degree of each vertex
v in V. The second result examines the size of the edge set E.
COROLLARY 11.5 If $G = (V, E)$ is a loop-free undirected graph with $|V| = n \ge 3$, and if $\deg(v) \ge n/2$ for
all $v \in V$, then $G$ has a Hamilton cycle.
Proof: We shall leave the proof of this result for the Section Exercises.
COROLLARY 11.6 If $G = (V, E)$ is a loop-free undirected graph with $|V| = n \ge 3$, and if $|E| \ge \binom{n-1}{2} + 2$,
then $G$ has a Hamilton cycle.
Proof: Let $a, b \in V$, where $\{a, b\} \notin E$. [Since $a, b$ are nonadjacent, we want to show that
$\deg(a) + \deg(b) \ge n$.] Remove the following from the graph $G$: (i) all edges of the form
$\{a, x\}$, where $x \in V$; (ii) all edges of the form $\{y, b\}$, where $y \in V$; and (iii) the vertices $a$
and $b$. Let $H = (V', E')$ denote the resulting subgraph. Then $|E| = |E'| + \deg(a) + \deg(b)$
because $\{a, b\} \notin E$.
Since $|V'| = n - 2$, $H$ is a subgraph of the complete graph $K_{n-2}$, so $|E'| \le \binom{n-2}{2}$.
Consequently, $\binom{n-1}{2} + 2 \le |E| = |E'| + \deg(a) + \deg(b) \le \binom{n-2}{2} + \deg(a) + \deg(b)$,
and we find that
$$\begin{aligned}
\deg(a) + \deg(b) &\ge \binom{n-1}{2} + 2 - \binom{n-2}{2} \\
&= \tfrac{1}{2}(n-1)(n-2) + 2 - \tfrac{1}{2}(n-2)(n-3) \\
&= \tfrac{1}{2}(n-2)[(n-1) - (n-3)] + 2 \\
&= \tfrac{1}{2}(n-2)(2) + 2 = (n-2) + 2 = n.
\end{aligned}$$
Therefore it follows from Theorem 11.9 that the given graph $G$ has a Hamilton cycle.
A problem that is related to the search for Hamilton cycles in a graph is the traveling
salesman problem. (An article dealing with this problem was published by Thomas P. Kirkman
in 1855.) Here a traveling salesperson leaves his or her home and must visit certain
locations before returning. The objective is to find an order in which to visit the locations
that is most efficient (perhaps in terms of total distance traveled or total cost). The problem
can be modeled with a labeled (edges have distances or costs associated with them) graph
where the most efficient Hamilton cycle is sought.
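For a handful of locations the model can be explored by brute force. The following sketch assumes the labeled graph is complete and is given as a hypothetical cost matrix `cost[i][j]`; it examines all $(n-1)!$ Hamilton cycles through location 0, so it illustrates the model rather than a practical solution method.

```python
from itertools import permutations


def cheapest_hamilton_cycle(cost):
    """Return (total_cost, tour) for a cheapest Hamilton cycle starting at 0.

    cost[i][j] is the cost of traveling from location i to location j.
    """
    n = len(cost)
    best_cost, best_tour = None, None
    for rest in permutations(range(1, n)):
        tour = (0,) + rest
        # Sum the edge costs around the cycle, including the return to 0.
        total = sum(cost[tour[k]][tour[(k + 1) % n]] for k in range(n))
        if best_cost is None or total < best_cost:
            best_cost, best_tour = total, list(tour)
    return best_cost, best_tour
```

The factorial growth of this search is one reason the problem is regarded as a hard optimization problem.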
The references by R. Bellman, K. L. Cooke, and J. A. Lockett [7]; M. Bellmore and
G. L. Nemhauser [8]; E. A. Elsayed [15]; E. A. Elsayed and R. G. Stern [16]; and L. R. Foulds
[17] should prove interesting to the reader who wants to learn more about this important
optimization problem. Also, the text edited by E. L. Lawler, J. K. Lenstra, A. H. G. Rinnooy
Kan, and D. B. Shmoys [22] presents 12 papers on various facets of this problem.
Even more on the traveling salesman problem and its applications can be found in the
handbooks edited by M. O. Ball, T. L. Magnanti, C. L. Monma, and G. L. Nemhauser, in
particular the articles by R. K. Ahuja, T. L. Magnanti, J. B. Orlin, and M. R. Reddy [2],
and by M. Jünger, G. Reinelt, and G. Rinaldi [21].
EXERCISES 11.5
1. Give an example of a connected graph that has (a) Neither an Euler circuit nor a Hamilton cycle. (b) An Euler circuit but no Hamilton cycle. (c) A Hamilton cycle but no Euler circuit. (d) Both a Hamilton cycle and an Euler circuit.
2. Characterize the type of graph in which an Euler trail (circuit) is also a Hamilton path (cycle).
3. Find a Hamilton cycle, if one exists, for each of the graphs or multigraphs in Fig. 11.84. If the graph has no Hamilton cycle, determine whether it has a Hamilton path.
4. a) Show that the Petersen graph [Fig. 11.52(a)] has no Hamilton cycle but that it has a Hamilton path.
b) Show that if any vertex (and the edges incident to it) is removed from the Petersen graph, then the resulting subgraph has a Hamilton cycle.
Figure 11.84
5. Consider the graphs in parts (d) and (e) of Fig. 11.84. Is it possible to remove one vertex from each of these graphs so that each of the resulting subgraphs has a Hamilton cycle?
6. If $n \ge 3$, how many different Hamilton cycles are there in the wheel graph $W_n$? (The graph $W_n$ was defined in Exercise 14 of Section 11.1.)
7. a) For $n \ge 3$, how many different Hamilton cycles are there in the complete graph $K_n$?
b) How many edge-disjoint Hamilton cycles are there in
c) Nineteen students in a nursery school play a game each day where they hold hands to form a circle. For how many days can they do this with no student holding hands with the same playmate twice?
8. a) For $n \in \mathbf{Z}^+$, $n \ge 2$, show that the number of distinct Hamilton cycles in the graph $K_{n,n}$ is $(1/2)(n - 1)!\,n!$
b) How many different Hamilton paths are there for $K_{n,n}$, $n \ge 1$?
9. Let $G = (V, E)$ be a loop-free undirected graph. Prove that if $G$ contains no cycle of odd length, then $G$ is bipartite.
10. a) Let $G = (V, E)$ be a connected bipartite undirected graph with $V$ partitioned as $V_1 \cup V_2$. Prove that if $|V_1| \ne |V_2|$, then $G$ cannot have a Hamilton cycle.
b) Prove that if the graph $G$ in part (a) has a Hamilton path, then $|V_1| - |V_2| = \pm 1$.
c) Give an example of a connected bipartite undirected graph $G = (V, E)$, where $V$ is partitioned as $V_1 \cup V_2$ and $|V_1| = |V_2| - 1$, but $G$ has no Hamilton path.
11. a) Determine all nonisomorphic tournaments with three vertices.
b) Find all of the nonisomorphic tournaments with four vertices. List the in degree and the out degree for each vertex, in each of these tournaments.
12. Prove that for $n \ge 2$, the hypercube $Q_n$ has a Hamilton cycle.
13. Let $T = (V, E)$ be a tournament with $v \in V$ of maximum out degree. If $w \in V$ and $w \ne v$, prove that either $(v, w) \in E$ or there is a vertex $y$ in $V$ where $y \ne v, w$, and $(v, y), (y, w) \in E$. (Such a vertex $v$ is called a king for the tournament.)
14. Find a counterexample to the converse of Theorem 11.8.
15. Give an example of a loop-free connected undirected multigraph $G = (V, E)$ such that $|V| = n$ and $\deg(x) + \deg(y) \ge n - 1$ for all $x, y \in V$, but $G$ has no Hamilton path.
16. Prove Corollaries 11.4 and 11.5.
17. Give an example to show that the converse of Corollary 11.5 need not be true.
18. Helen and Dominic invite 10 friends to dinner. In this group of 12 people everyone knows at least 6 others. Prove that the 12 can be seated around a circular table in such a way that each person is acquainted with the persons sitting on either side.
19. Let $G = (V, E)$ be a loop-free undirected graph that is 6-regular. Prove that if $|V| = 11$, then $G$ contains a Hamilton cycle.
20. Let $G = (V, E)$ be a loop-free undirected $n$-regular graph with $|V| \ge 2n + 2$. Prove that $\overline{G}$ (the complement of $G$) has a Hamilton cycle.
21. For $n \ge 3$, let $C_n$ denote the undirected cycle on $n$ vertices. The graph $\overline{C_n}$, the complement of $C_n$, is often called the cocycle on $n$ vertices. Prove that for $n \ge 5$ the cocycle $\overline{C_n}$ has a Hamilton cycle.
22. Let $n \in \mathbf{Z}^+$ with $n \ge 4$, and let the vertex set $V'$ for the complete graph $K_{n-1}$ be $\{v_1, v_2, v_3, \ldots, v_{n-1}\}$. Now construct the loop-free undirected graph $G_n = (V, E)$ from $K_{n-1}$ as follows: $V = V' \cup \{v\}$, and $E$ consists of all the edges in $K_{n-1}$ except for the edge $\{v_1, v_2\}$, which is replaced by the pair of edges $\{v_1, v\}$ and $\{v, v_2\}$.
a) Determine $\deg(x) + \deg(y)$ for all nonadjacent vertices $x$ and $y$ in $V$.
b) Does $G_n$ have a Hamilton cycle?
c) How large is the edge set $E$?
d) Do the results in parts (b) and (c) contradict Corollary 11.6?
23. For $n \in \mathbf{Z}^+$ where $n \ge 4$, let $V' = \{v_1, v_2, v_3, \ldots, v_{n-1}\}$ be the vertex set for the complete graph $K_{n-1}$. Construct the loop-free undirected graph $H_n = (V, E)$ from $K_{n-1}$ as follows: $V = V' \cup \{v\}$, and $E$ consists of all the edges in $K_{n-1}$ together with the new edge $\{v, v_1\}$.
a) Show that $H_n$ has a Hamilton path but no Hamilton cycle.
b) How large is the edge set $E$?
24. Let $n = 2^k$ for $k \in \mathbf{Z}^+$. We use the $n$ $k$-bit sequences (of 0's and 1's) to represent $1, 2, 3, \ldots, n$, so that for two consecutive integers $i$, $i + 1$, the corresponding $k$-bit sequences differ in exactly one component. This representation is called a Gray code (comparable to what we saw in Example 3.9).
a) For $k = 3$, use a graph model with $V = \{000, 001, 010, \ldots, 111\}$ to find such a code for $1, 2, 3, \ldots, 8$. How is this related to the concept of a Hamilton path?
b) Answer part (a) for $k = 4$.
25. If $G = (V, E)$ is an undirected graph, a subset $I$ of $V$ is called independent if no two vertices in $I$ are adjacent. An independent set $I$ is called maximal if no vertex $v$ can be added to $I$ with $I \cup \{v\}$ independent. The independence number of $G$, denoted $\beta(G)$, is the size of a largest independent set in $G$.
a) For each graph in Fig. 11.85 find two maximal independent sets with different sizes.
b) Find $\beta(G)$ for each graph in part (a).
c) Determine $\beta(G)$ for each of the following graphs: (i) $K_{1,3}$; (ii) $K_{2,3}$; (iii) $K_{3,2}$; (iv) $K_{4,4}$; (v) $K_{4,6}$; (vi) $K_{m,n}$, $m, n \in \mathbf{Z}^+$.
d) Let $I$ be an independent set in $G = (V, E)$. What type of subgraph does $I$ induce in $G$?
Figure 11.85
26. Let $G = (V, E)$ be an undirected graph with subset $I$ of $V$ an independent set. For each $a \in I$ and each Hamilton cycle $C$ for $G$, there will be $\deg(a) - 2$ edges in $E$ that are incident with $a$ and not in $C$. Therefore there are at least $\sum_{a \in I} [\deg(a) - 2] = \sum_{a \in I} \deg(a) - 2|I|$ edges in $E$ that do not appear in $C$.
a) Why are these $\sum_{a \in I} \deg(a) - 2|I|$ edges distinct?
b) Let $v = |V|$, $e = |E|$. Prove that if
$$e - \sum_{a \in I} \deg(a) + 2|I| < v,$$
then $G$ has no Hamilton cycle.
c) Select a suitable independent set $I$ and use part (b) to show that the graph in Fig. 11.86 (known as the Herschel graph) has no Hamilton cycle.
Figure 11.86
11.6 Graph Coloring and Chromatic Polynomials
At the J. & J. Chemical Company, Jeannette is in charge of the storage of chemical compounds
in the company warehouse. Since certain types of compounds (such as acids and
bases) should not be kept in the same vicinity, she decides to have her partner Jack partition
the warehouse into separate storage areas so that incompatible chemical reagents
can be stored in separate compartments. How can she determine the number of storage
compartments that Jack will have to build?
If this company sells 25 chemical compounds, let $\{c_1, c_2, \ldots, c_{25}\} = V$, a set of vertices.
For all $1 \le i < j \le 25$, we draw the edge $\{c_i, c_j\}$ if $c_i$ and $c_j$ must be stored in separate
compartments. This gives us an undirected graph $G = (V, E)$.
We now introduce the following concept.
Definition 11.22 If $G = (V, E)$ is an undirected graph, a proper coloring of $G$ occurs when we color the
vertices of $G$ so that if $\{a, b\}$ is an edge in $G$, then $a$ and $b$ are colored with different colors.
(Hence adjacent vertices have different colors.) The minimum number of colors needed to
properly color $G$ is called the chromatic number of $G$ and is written $\chi(G)$.
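For very small graphs the definition can be applied directly by trying $1, 2, 3, \ldots$ colors in turn. The sketch below is an illustration only: the function names are hypothetical, vertices are assumed to be numbered $0, \ldots, n-1$, and the search over all $k^n$ colorings grows far too quickly to be used on large graphs.

```python
from itertools import product


def is_proper(coloring, edges):
    """True when every edge joins two vertices with different colors."""
    return all(coloring[a] != coloring[b] for a, b in edges)


def chromatic_number(n, edges):
    """Smallest k for which some coloring with k colors is proper (brute force)."""
    for k in range(1, n + 1):
        for coloring in product(range(k), repeat=n):
            if is_proper(coloring, edges):
                return k
    return 0  # only reached for the graph with no vertices
```

With this sketch a triangle needs three colors, a cycle on four vertices needs two, and a cycle on five vertices needs three.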
Returning to assist Jeannette at the warehouse, we find that the number of storage
compartments Jack must build is equal to $\chi(G)$ for the graph we constructed on $V =
\{c_1, c_2, \ldots, c_{25}\}$. But how do we compute $\chi(G)$? Before we present any work on how to
determine the chromatic number of a graph, we turn to the following related idea.
In Example 11.24 we mentioned the connection between coloring the regions in a planar
map (with neighboring regions having different colors) and properly coloring the vertices
in an associated graph. Determining the smallest number of colors needed to color planar
maps in this way has been a problem of interest for over a century.
In about 1850, Francis Guthrie (1831-1899) became interested in the general problem
after showing how to color the counties on a map of England with only four colors. Shortly
thereafter, he showed the “Four-color Problem” to his younger brother Frederick (1833-1886),
who was then a student of Augustus DeMorgan (1806-1871). DeMorgan communicated
the problem (in 1852) to William Hamilton (1805-1865). The problem did not interest
Hamilton and lay dormant for about 25 years. Then, in 1878, the scientific community was
made aware of the problem through an announcement by Arthur Cayley (1821-1895) at a
meeting of the London Mathematical Society. In 1879 Cayley stated the problem in the first
volume of the Proceedings of the Royal Geographical Society. Shortly thereafter, the British
barrister (and keen amateur mathematician) Sir Alfred Kempe (1849-1922) devised a proof
that remained unquestioned for over a decade. In 1890, however, the British mathematician
Percy John Heawood (1861-1955) found a mistake in Kempe’s work.
The problem remained unsolved until 1976, when it was finally settled by Kenneth
Appel and Wolfgang Haken. Their proof employs a very intricate computer analysis of
1936 (reducible) configurations.
Although only four colors are needed to properly color the regions in a planar map, we
need more than four colors to properly color the vertices of some nonplanar graphs.
We start with some small examples. Then we shall find a way to determine $\chi(G)$ from
smaller subgraphs of $G$, in certain situations. [In general, computing $\chi(G)$ is a very
difficult problem.] We shall also obtain what is called the chromatic polynomial for $G$ and
see how it can be used in computing $\chi(G)$.
EXAMPLE 11.31 For the graph $G$ in Fig. 11.87, we start at vertex $a$ and next to each vertex write the number
of a color needed to properly color the vertices of $G$ that have been considered up to that
point. Going to vertex $b$, the 2 indicates the need for a second color because vertices $a$
and $b$ are adjacent. Proceeding alphabetically to $f$, we find that two colors are needed to
properly color $\{a, b, c, d, e, f\}$. For vertex $g$ a third color is needed; this third color can
also be used for vertex $h$ because $\{g, h\}$ is not an edge in $G$. Thus this sequential coloring
(labeling) method gives us a proper coloring for $G$, so $\chi(G) \le 3$. Since $K_3$ is a subgraph
of $G$ [for example, the subgraph induced by $a$, $b$ and $g$ is (isomorphic to) $K_3$], we have
$\chi(G) \ge 3$, so $\chi(G) = 3$.
Figure 11.87
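The sequential coloring (labeling) method of Example 11.31 can be sketched as follows. The graph of Fig. 11.87 is not reproduced here, so the function takes a hypothetical adjacency dictionary `adj` and a vertex order; each vertex receives the smallest color number not already used by a colored neighbor.

```python
def sequential_coloring(order, adj):
    """Color vertices in the given order with the smallest available color.

    adj maps each vertex to an iterable of its neighbors. Colors are 1, 2, 3, ...
    The result is always a proper coloring, so its largest color is an upper
    bound on the chromatic number (not necessarily the chromatic number itself).
    """
    color = {}
    for v in order:
        used = {color[u] for u in adj[v] if u in color}
        c = 1
        while c in used:
            c += 1
        color[v] = c
    return color
```

The number of colors used depends on the order in which the vertices are visited, which is why the example pairs the upper bound from the coloring with a lower bound from a subgraph $K_3$.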
EXAMPLE 11.32
a) For all $n \ge 1$, $\chi(K_n) = n$.
b) The chromatic number of the Herschel graph (Fig. 11.86) is 2.
c) If $G$ is the Petersen graph [see Fig. 11.52(a)], then $\chi(G) = 3$.
EXAMPLE 11.33 Let $G$ be the graph shown in Fig. 11.88. For $U = \{b, f, h, i\}$, the induced subgraph $\langle U \rangle$
of $G$ is isomorphic to $K_4$, so $\chi(G) \ge \chi(K_4) = 4$. Therefore, if we can determine a way to
properly color the vertices of $G$ with four colors, then we shall know that $\chi(G) = 4$. One
way to accomplish this is to color the vertices $e, f, g$ blue; the vertices $b, j$ red; the vertices
$c, h$ white; and the vertices $a, d, i$ green.
Figure 11.88
We turn now to a method for determining $\chi(G)$. Our coverage follows the development
in the survey article [25] by R. C. Read.
Let $G$ be an undirected graph, and let $\lambda$ be the number of colors that we have available
for properly coloring the vertices of $G$. Our objective is to find a polynomial function
$P(G, \lambda)$, in the variable $\lambda$, called the chromatic polynomial of $G$, that will tell us in how
many different ways we can properly color the vertices of $G$, using at most $\lambda$ colors.
Throughout this discussion, the vertices in an undirected graph $G = (V, E)$ are distinguished
by labels. Consequently, two proper colorings of such a graph will be considered
different in the following sense: A proper coloring (of the vertices of $G$) that uses at most $\lambda$
colors is a function $f$, with domain $V$ and codomain $\{1, 2, 3, \ldots, \lambda\}$, where $f(u) \ne f(v)$,
for adjacent vertices $u, v \in V$. Proper colorings are then different in the same way that these
functions are different.
EXAMPLE 11.34
a) If $G = (V, E)$ with $|V| = n$ and $E = \emptyset$, then $G$ consists of $n$ isolated points, and by
the rule of product, $P(G, \lambda) = \lambda^n$.
b) If $G = K_n$, then at least $n$ colors must be available for us to color $G$ properly. Here,
by the rule of product, $P(G, \lambda) = \lambda(\lambda - 1)(\lambda - 2) \cdots (\lambda - n + 1)$, which we denote
by $\lambda^{(n)}$. For $\lambda < n$, $P(G, \lambda) = 0$ and there are no ways to properly color $K_n$.
$P(G, \lambda) > 0$ for the first time when $\lambda = n = \chi(G)$.
c) For each path in Fig. 11.89, we consider the number of choices (of the $\lambda$ colors) at
each successive vertex. Proceeding alphabetically, we find that $P(G_1, \lambda) = \lambda(\lambda - 1)^3$
and $P(G_2, \lambda) = \lambda(\lambda - 1)^4$. Since $P(G_1, 1) = 0 = P(G_2, 1)$, but $P(G_1, 2) = 2 =
P(G_2, 2)$, it follows that $\chi(G_1) = \chi(G_2) = 2$. If five colors are available we can
properly color $G_1$ in $5(4)^3 = 320$ ways; $G_2$ can be so colored in $5(4)^4 = 1280$ ways.
Figure 11.89 (the paths $G_1$ and $G_2$, with the number of color choices, $\lambda$ or $\lambda - 1$, written at each vertex)
In general, if $G$ is a path on $n$ vertices, then $P(G, \lambda) = \lambda(\lambda - 1)^{n-1}$.
d) If $G$ is made up of components $G_1, G_2, \ldots, G_k$, then again by the rule of product, it
follows that $P(G, \lambda) = P(G_1, \lambda) \cdot P(G_2, \lambda) \cdots P(G_k, \lambda)$.
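The counts in Example 11.34 can be confirmed by enumerating every function from the vertex set to the set of colors and keeping the proper ones. This is a small verification sketch, assuming vertices numbered $0, \ldots, n-1$; the function name is hypothetical.

```python
from itertools import product


def count_proper_colorings(n, edges, num_colors):
    """Number of proper colorings of a graph on vertices 0..n-1
    that use at most num_colors colors (brute-force enumeration)."""
    return sum(
        all(f[a] != f[b] for a, b in edges)
        for f in product(range(num_colors), repeat=n)
    )
```

Evaluated on the paths with four and five vertices and five colors, it gives the values 320 and 1280 computed in part (c), and for $K_4$ it gives 0 with three colors and $4! = 24$ with four.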
As a result of Example 11.34(d), we shall concentrate on connected graphs. In many
instances in discrete mathematics, methods have been employed to solve problems in large
cases by breaking these down into two or more smaller cases. Once again we use this method
of attack. To do so, we need the following ideas and notation.
Let $G = (V, E)$ be an undirected graph. For $e = \{a, b\} \in E$, let $G_e$ denote the subgraph
of $G$ obtained by deleting $e$ from $G$, without removing vertices $a$ and $b$; that is, $G_e = G - e$
as defined in Section 11.2. From $G_e$ a second subgraph of $G$ is obtained by coalescing (or
identifying) the vertices $a$ and $b$. This second subgraph is denoted by $G'_e$.
EXAMPLE 11.35 Figure 11.90 shows $G_e$ and $G'_e$ for graph $G$ with the edge $e$ as specified. Note how the
coalescing of $a$ and $b$ in $G'_e$ results in the coalescing of the two pairs of edges $\{d, b\}$, $\{d, a\}$
and $\{a, c\}$, $\{b, c\}$.
Figure 11.90 (the graphs $G$, $G_e$, and $G'_e$)
Using these special subgraphs, we turn now to the main result.
THEOREM 11.10 Decomposition Theorem for Chromatic Polynomials. If $G = (V, E)$ is a connected graph
and $e \in E$, then
$$P(G_e, \lambda) = P(G, \lambda) + P(G'_e, \lambda).$$
Proof: Let $e = \{a, b\}$. The number of ways to properly color the vertices in $G_e$ with (at
most) $\lambda$ colors is $P(G_e, \lambda)$. Those colorings where $a$ and $b$ have different colors are proper
colorings of $G$. The colorings of $G_e$ that are not proper colorings of $G$ occur when $a$ and
$b$ have the same color. But each of these colorings corresponds with a proper coloring for
$G'_e$. This partition of the $P(G_e, \lambda)$ proper colorings of $G_e$ into two disjoint subsets results
in the formula $P(G_e, \lambda) = P(G, \lambda) + P(G'_e, \lambda)$.
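Rewritten as $P(G, \lambda) = P(G_e, \lambda) - P(G'_e, \lambda)$, the theorem gives a recursion that ends at graphs with no edges. The sketch below is one possible rendering, not an efficient algorithm (its running time is exponential in the number of edges). The representation is an assumption: a polynomial is a list of integer coefficients indexed by power of $\lambda$, and the graph is assumed loop-free with edges given as pairs.

```python
def chromatic_polynomial(vertices, edges):
    """Coefficients [c0, c1, ..., cn] of P(G, lambda) for a loop-free graph.

    Uses P(G) = P(G_e) - P(G'_e): delete an edge e, then coalesce its endpoints.
    """
    vertices = frozenset(vertices)
    edges = {frozenset(e) for e in edges}
    if not edges:
        # n isolated vertices: P(G, lambda) = lambda^n.
        return [0] * len(vertices) + [1]
    e = next(iter(edges))
    a, b = tuple(e)
    deleted = edges - {e}                                  # edge set of G_e
    # Coalesce b into a; storing edges in a set merges parallel edges.
    coalesced = {frozenset(a if x == b else x for x in f) for f in deleted}
    p_del = chromatic_polynomial(vertices, deleted)
    p_coal = chromatic_polynomial(vertices - {b}, coalesced)
    p_coal = p_coal + [0] * (len(p_del) - len(p_coal))     # pad to equal length
    return [x - y for x, y in zip(p_del, p_coal)]


def evaluate(coefficients, lam):
    """Value of the polynomial at lambda = lam."""
    return sum(c * lam ** k for k, c in enumerate(coefficients))
```

For a single edge the recursion returns `[0, -1, 1]`, that is, $\lambda^2 - \lambda = \lambda(\lambda - 1)$, in agreement with the rule of product.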
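When calculating chromatic polynomials, we shall place brackets about a graph to indicate
its chromatic polynomial.
EXAMPLE 11.36 The following calculations yield $P(G, \lambda)$ for $G$ a cycle of length 4.
[Figure: bracket diagrams showing $P(G, \lambda) = P(G_e, \lambda) - P(G'_e, \lambda)$ for the cycle $G$ on the vertices $a, b, c, d$, with $b$ and $d$ coalesced in $G'_e$.]
From Example 11.34(c) it follows that $P(G_e, \lambda) = \lambda(\lambda - 1)^3$. With $G'_e = K_3$ we have
$P(G'_e, \lambda) = \lambda^{(3)}$. Therefore,
$$\begin{aligned}
P(G, \lambda) &= \lambda(\lambda - 1)^3 - \lambda(\lambda - 1)(\lambda - 2) = \lambda(\lambda - 1)[(\lambda - 1)^2 - (\lambda - 2)] \\
&= \lambda(\lambda - 1)[\lambda^2 - 3\lambda + 3] = \lambda^4 - 4\lambda^3 + 6\lambda^2 - 3\lambda.
\end{aligned}$$
Since $P(G, 1) = 0$ while $P(G, 2) = 2 > 0$, we know that $\chi(G) = 2$.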
EXAMPLE 11.37 Here we find a second application of Theorem 11.10.
[Figure: bracket diagrams reducing the graph $G$ (on the vertices $v, w, x, y, z$, with the edge $e$) by Theorem 11.10; one of the resulting graphs is the disconnected graph with the components $K_1$, $K_4$.]
$$P(G, \lambda) = \lambda \cdot \lambda^{(4)} - 2\lambda^{(4)} = \lambda^{(4)}(\lambda - 2) = \lambda(\lambda - 1)(\lambda - 2)^2(\lambda - 3)$$
For each $1 \le \lambda \le 3$, $P(G, \lambda) = 0$, but $P(G, \lambda) > 0$ for all $\lambda \ge 4$. Consequently, the given
graph has chromatic number 4.
The chromatic polynomials given in Examples 11.36 and 11.37 suggest the following
results.
THEOREM 11.11 For each graph $G$, the constant term in $P(G, \lambda)$ is 0.
Proof: For each graph $G$, $\chi(G) > 0$ because $V \ne \emptyset$. If $P(G, \lambda)$ has constant term $a$, then
$P(G, 0) = a$. If $a \ne 0$, this implies that there are $a$ ways to color $G$ properly with 0 colors, a
contradiction.
THEOREM 11.12 Let $G = (V, E)$ with $|E| > 0$. Then the sum of the coefficients in $P(G, \lambda)$ is 0.
Proof: Since $|E| \ge 1$, we have $\chi(G) \ge 2$, so we cannot properly color $G$ with only one color.
Consequently, $P(G, 1) = 0 =$ the sum of the coefficients in $P(G, \lambda)$.
Since the chromatic polynomial of a complete graph is easy to determine, an alternative
method for finding P(G, 4) can be obtained. Theorem 11.10 reduced the problem to smaller
graphs. Here we add edges to a given graph until we reach complete graphs.
THEOREM 11.13 Let $G = (V, E)$, with $a, b \in V$ but $\{a, b\} = e \notin E$. We write $G_e^+$ for the graph we obtain
from $G$ by adding the edge $e = \{a, b\}$. Coalescing the vertices $a$ and $b$ in $G$ gives us the
subgraph $G_e^{++}$ of $G$. Under these circumstances $P(G, \lambda) = P(G_e^+, \lambda) + P(G_e^{++}, \lambda)$.
Proof: This result follows as in Theorem 11.10 because $P(G_e^+, \lambda) = P(G, \lambda) - P(G_e^{++}, \lambda)$.
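EXAMPLE 11.38 Let us now apply Theorem 11.13.
[Figure: bracket diagrams showing $P(G, \lambda) = P(G_e^+, \lambda) + P(G_e^{++}, \lambda)$, with the vertices $a$ and $b$ coalesced in $G_e^{++}$.]
Here $P(G, \lambda) = \lambda^{(4)} + \lambda^{(3)} = \lambda(\lambda - 1)(\lambda - 2)^2$, so $\chi(G) = 3$. In addition, if six colors
are available, the vertices in $G$ can be properly colored in $6(5)(4)^2 = 480$ ways.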
Our next result again uses complete graphs, along with the following concepts.
For all graphs $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$,
i) the union of $G_1$ and $G_2$, denoted $G_1 \cup G_2$, is the graph with vertex set $V_1 \cup V_2$ and
edge set $E_1 \cup E_2$; and
ii) when $V_1 \cap V_2 \ne \emptyset$, the intersection of $G_1$ and $G_2$, denoted $G_1 \cap G_2$, is the graph
with vertex set $V_1 \cap V_2$ and edge set $E_1 \cap E_2$.
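THEOREM 11.14 Let $G$ be an undirected graph with subgraphs $G_1$, $G_2$. If $G = G_1 \cup G_2$ and $G_1 \cap G_2 = K_n$,
for some $n \in \mathbf{Z}^+$, then
$$P(G, \lambda) = \frac{P(G_1, \lambda) \cdot P(G_2, \lambda)}{\lambda^{(n)}}.$$
Proof: Since $G_1 \cap G_2 = K_n$, it follows that $K_n$ is a subgraph of both $G_1$ and $G_2$ and that
$\chi(G_1), \chi(G_2) \ge n$. Given $\lambda$ colors, there are $\lambda^{(n)}$ proper colorings of $K_n$. For each of these
$\lambda^{(n)}$ colorings there are $P(G_1, \lambda)/\lambda^{(n)}$ ways to properly color the remaining vertices in $G_1$.
Likewise, there are $P(G_2, \lambda)/\lambda^{(n)}$ ways to properly color the remaining vertices in $G_2$. By
the rule of product,
$$P(G, \lambda) = P(K_n, \lambda) \cdot \frac{P(G_1, \lambda)}{\lambda^{(n)}} \cdot \frac{P(G_2, \lambda)}{\lambda^{(n)}} = \frac{P(G_1, \lambda) \cdot P(G_2, \lambda)}{\lambda^{(n)}}.$$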
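EXAMPLE 11.39 Consider the graph in Example 11.37. Let $G_1$ be the subgraph induced by the vertices
$w, x, y, z$. Let $G_2$ be the complete graph $K_3$ with vertices $v$, $w$, and $x$. Then $G_1 \cap G_2$ is
the edge $\{w, x\}$, so $G_1 \cap G_2 = K_2$.
Therefore
$$\begin{aligned}
P(G, \lambda) &= \frac{P(G_1, \lambda) \cdot P(G_2, \lambda)}{\lambda^{(2)}} = \frac{\lambda^{(4)} \lambda^{(3)}}{\lambda^{(2)}} \\
&= \frac{\lambda^2(\lambda - 1)^2(\lambda - 2)^2(\lambda - 3)}{\lambda(\lambda - 1)} \\
&= \lambda(\lambda - 1)(\lambda - 2)^2(\lambda - 3),
\end{aligned}$$
agreeing with the answer obtained in Example 11.37.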
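The agreement can also be checked numerically. The sketch below rebuilds the graph as described in this example ($G_1 = K_4$ on $w, x, y, z$ and $G_2 = K_3$ on $v, w, x$), counts proper colorings by enumeration, and compares the count with the quotient from Theorem 11.14. The helper names are illustrative, and the comparison is made only for $\lambda \ge 2$, where $\lambda^{(2)} \ne 0$.

```python
from itertools import combinations, product


def falling_factorial(lam, n):
    """lam^(n) = lam (lam - 1) ... (lam - n + 1)."""
    result = 1
    for k in range(n):
        result *= lam - k
    return result


def count_proper_colorings(vertices, edges, lam):
    """Brute-force count of proper colorings with at most lam colors."""
    vertices = list(vertices)
    count = 0
    for colors in product(range(lam), repeat=len(vertices)):
        f = dict(zip(vertices, colors))
        if all(f[a] != f[b] for a, b in edges):
            count += 1
    return count


g1_edges = list(combinations("wxyz", 2))         # K_4 on w, x, y, z
g2_edges = list(combinations("vwx", 2))          # K_3 on v, w, x
g_edges = sorted(set(g1_edges) | set(g2_edges))  # union; {w, x} is shared

for lam in range(2, 7):
    by_theorem = (falling_factorial(lam, 4) * falling_factorial(lam, 3)
                  // falling_factorial(lam, 2))
    assert count_proper_colorings("vwxyz", g_edges, lam) == by_theorem
```

The loop completes without an assertion error for $\lambda = 2, \ldots, 6$, matching $\lambda(\lambda - 1)(\lambda - 2)^2(\lambda - 3)$ at each of those values.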
Much more can be said about chromatic polynomials; in particular, there are many
unanswered questions. For example, no one has found a set of conditions that indicate
whether a given polynomial in $\lambda$ is the chromatic polynomial for some graph. More about
this topic is introduced in the article by R. C. Read [25].
EXERCISES 11.6
1. A pet-shop owner receives a shipment of tropical fish. Among the different species in the shipment are certain pairs where one species feeds on the other. These pairs must consequently be kept in different aquaria. Model this problem as a graph-coloring problem, and tell how to determine the smallest number of aquaria needed to preserve all the fish in the shipment.
2. As the chair for church committees, Mrs. Blasi is faced with scheduling the meeting times for 15 committees. Each committee meets for one hour each week. Two committees having a common member must be scheduled at different times. Model this problem as a graph-coloring problem, and tell how to determine the least number of meeting times Mrs. Blasi has to consider for scheduling the 15 committee meetings.
3. a) At the J. & J. Chemical Company, Jeannette has received three shipments that contain a total of seven different chemicals. Furthermore, the nature of these chemicals is such that for all $1 \le i \le 5$, chemical $i$ cannot be stored in the same storage compartment as chemical $i + 1$ or chemical $i + 2$. Determine the smallest number of separate storage compartments that Jeannette will need to safely store these seven chemicals.
b) Suppose that in addition to the conditions in part (a), the following four pairs of these same seven chemicals also require separate storage compartments: 1 and 4, 2 and 5, 2 and 6, and 3 and 6. What is the smallest number of storage compartments that Jeannette now needs to safely store the seven chemicals?
4. Give an example of an undirected graph $G = (V, E)$, where $\chi(G) = 3$ but no subgraph of $G$ is isomorphic to $K_3$.
5. a) Determine $P(G, \lambda)$ for $G = K_{1,3}$.
b) For $n \in \mathbf{Z}^+$, what is the chromatic polynomial for $K_{1,n}$? What is its chromatic number?
6. a) Consider the graph $K_{2,3}$ shown in Fig. 11.91, and let $\lambda \in \mathbf{Z}^+$ denote the number of colors available to properly color the vertices of $K_{2,3}$. (i) How many proper colorings of $K_{2,3}$ have vertices $a, b$ colored the same? (ii) How many proper colorings of $K_{2,3}$ have vertices $a, b$ colored with different colors?
b) What is the chromatic polynomial for $K_{2,3}$? What is $\chi(K_{2,3})$?
c) For $n \in \mathbf{Z}^+$, what is the chromatic polynomial for $K_{2,n}$? What is $\chi(K_{2,n})$?
Figure 11.91
7. Find the chromatic number of the following graphs.
a) The complete bipartite graphs $K_{m,n}$.
b) A cycle on $n$ vertices, $n \ge 3$.
c) The graphs in Figs. 11.59(d), 11.62(a), and 11.85.
d) The $n$-cube $Q_n$, $n \ge 1$.
8. If $G$ is a loop-free undirected graph with at least one edge, prove that $G$ is bipartite if and only if $\chi(G) = 2$.
9. a) Determine the chromatic polynomials for the graphs in Fig. 11.92.
b) Find $\chi(G)$ for each graph.
c) If five colors are available, in how many ways can the vertices of each graph be properly colored?
Figure 11.92
10. a) Determine whether the graphs in Fig. 11.93 are isomorphic.
b) Find $P(G, \lambda)$ for each graph.
c) Comment on the results found in parts (a) and (b).
Figure 11.93
11. For $n \ge 3$, let $G_n = (V, E)$ be the undirected graph obtained from the complete graph $K_n$ upon deletion of one edge. Determine $P(G_n, \lambda)$ and $\chi(G_n)$.
12. Consider the complete graph $K_n$ for $n \ge 3$. Color $r$ of the vertices in $K_n$ red and the remaining $n - r$ ($= g$) vertices green. For any two vertices $v, w$ in $K_n$ color the edge $\{v, w\}$ (1) red if $v, w$ are both red; (2) green if $v, w$ are both green; or (3) blue if $v, w$ have different colors. Assume that $r \ge g$.
a) Show that for $r = 6$ and $g = 3$ (and $n = 9$) the total number of red and green edges in $K_9$ equals the number of blue edges in $K_9$.
b) Show that the total number of red and green edges in $K_n$ equals the number of blue edges in $K_n$ if and only if $n = r + g$, where $g, r$ are consecutive triangular numbers. [The triangular numbers are defined recursively by $t_1 = 1$, $t_{n+1} = t_n + (n + 1)$, $n \ge 1$; so $t_n = n(n + 1)/2$. Hence $t_1 = 1$, $t_2 = 3$, $t_3 = 6, \ldots$.]
13. Let $G = (V, E)$ be the undirected connected "ladder graph" shown in Fig. 11.94.
a) Determine $|V|$ and $|E|$.
b) Prove that $P(G, \lambda) = \lambda(\lambda - 1)(\lambda^2 - 3\lambda + 3)^{n-1}$.
Figure 11.94 (the ladder graph on the vertices $x_1, x_2, x_3, \ldots, x_{n-1}, x_n$ and $y_1, y_2, y_3, \ldots, y_{n-1}, y_n$)
14. Let $G$ be a loop-free undirected graph, where $\Delta = \max_{v \in V} \{\deg(v)\}$. (a) Prove that $\chi(G) \le \Delta + 1$. (b) Find two types of graphs $G$, where $\chi(G) = \Delta + 1$.
15. For $n \ge 3$, let $C_n$ denote the cycle of length $n$.
a) What is $P(C_3, \lambda)$?
b) If $n \ge 4$, show that
$$P(C_n, \lambda) = P(P_{n-1}, \lambda) - P(C_{n-1}, \lambda),$$
where $P_{n-1}$ denotes the path of length $n - 1$.
c) Verify that $P(P_{n-1}, \lambda) = \lambda(\lambda - 1)^{n-1}$, for all $n \ge 2$.
d) Establish the relations
$$P(C_n, \lambda) - (\lambda - 1)^n = (\lambda - 1)^{n-1} - P(C_{n-1}, \lambda), \quad n \ge 4,$$
$$P(C_n, \lambda) - (\lambda - 1)^n = P(C_{n-2}, \lambda) - (\lambda - 1)^{n-2}, \quad n \ge 5.$$
e) Prove that for all $n \ge 3$,
$$P(C_n, \lambda) = (\lambda - 1)^n + (-1)^n(\lambda - 1).$$
16. For $n \ge 3$, recall that the wheel graph, $W_n$, is obtained from a cycle of length $n$ by placing a new vertex within the cycle and adding edges (spokes) from this new vertex to each vertex of the cycle.
a) What relationship is there between $\chi(C_n)$ and $\chi(W_n)$?
b) Use part (e) of Exercise 15 to show that
$$P(W_n, \lambda) = \lambda(\lambda - 2)^n + (-1)^n \lambda(\lambda - 2).$$
c) i) If we have $k$ different colors available, in how many ways can we paint the walls and ceiling of a pentagonal room if adjacent walls, and any wall and the ceiling, are to be painted with different colors?
ii) What is the smallest value of $k$ for which such a coloring is possible?
17. Let $G = (V, E)$ be a loop-free undirected graph with chromatic polynomial $P(G, \lambda)$ and $|V| = n$. Use Theorem 11.13 to prove that $P(G, \lambda)$ has degree $n$ and leading coefficient 1 (that is, the coefficient of $\lambda^n$ is 1).
18. Let $G = (V, E)$ be a loop-free undirected graph.
a) For each such graph, where $|V| \le 3$, find $P(G, \lambda)$ and show that in it the terms contain consecutive powers of $\lambda$. Also show that the coefficients of these consecutive powers alternate in sign.
b) Now consider $G = (V, E)$, where $|V| = n \ge 4$ and $|E| = k$. Prove by mathematical induction that the terms in $P(G, \lambda)$ contain consecutive powers of $\lambda$ and that the coefficients of these consecutive powers alternate in sign. [For the induction hypothesis, assume that the result is true for all loop-free undirected graphs $G = (V, E)$, where either (i) $|V| = n - 1$ or (ii) $|V| = n$, but $|E| = k - 1$.]
c) Prove that if $|V| = n$, then the coefficient of $\lambda^{n-1}$ in $P(G, \lambda)$ is the negative of $|E|$.
19. Let $G = (V, E)$ be a loop-free undirected graph. We call $G$ color-critical if $\chi(G) > \chi(G - v)$ for all $v \in V$.
a) Explain why cycles with an odd number of vertices are color-critical while cycles with an even number of vertices are not color-critical.
b) For $n \in \mathbf{Z}^+$, $n \ge 2$, which of the complete graphs $K_n$ are color-critical?
c) Prove that a color-critical graph must be connected.
d) Prove that if $G$ is color-critical with $\chi(G) = k$, then $\deg(v) \ge k - 1$ for all $v \in V$.
11.7
Summary and Historical Review
Unlike other areas in mathematics, graph theory traces its beginnings to a definite time
and place: the problem of the seven bridges of Königsberg, which was solved in 1736 by
Leonhard Euler (1707-1783). And in 1752 we find Euler’s Theorem for planar graphs. (This
result was originally presented in terms of polyhedra.) However, after these developments,
little was accomplished in this area for almost a century.
Then, in 1847, Gustav Kirchhoff (1824-1887) examined a special type of graph called
a tree. (A tree is a loop-free undirected graph that is connected but contains no cycles.)
Kirchhoff used this concept in applications dealing with electrical networks in his extension
of Ohm’s laws for electrical flow. Ten years later Arthur Cayley (1821-1895) developed
this same type of graph in order to count the distinct isomers of the saturated hydrocarbons
$C_nH_{2n+2}$, $n \in \mathbf{Z}^+$.
This period also saw two other major ideas come to light. The four-color conjecture was
first investigated by Francis Guthrie (1831-1899) in about 1850. In Section 11.6 we related
some of the history of this problem, which was solved via an intricate computer analysis in
1976 by Kenneth Appel and Wolfgang Haken.
The second major idea was the Hamilton cycle. This cycle is named for Sir William
Rowan Hamilton (1805-1865), who used the idea in 1859 for an intriguing puzzle that used
the edges on a regular dodecahedron. A solution to this puzzle is not very difficult to find,
but mathematicians still search for necessary and sufficient conditions to characterize those
undirected graphs that possess a Hamilton path or cycle.
Following these developments, we find little activity until after 1920. The problem of
characterizing planar graphs was solved by the Polish mathematician Kasimir Kuratowski
(1896-1980) in 1930. In 1936 we find the publication of the first book on graph theory, written
by the Hungarian mathematician Dénes König (1884-1944), a prominent researcher in the
field. Since then there has been a great deal of activity in the area, the computer providing
assistance in the last five decades. Among the many contemporary researchers (not men-
tioned in the chapter references) in this and related fields one finds the names Claude Berge,
V. Chvátal, Paul Erdős, László Lovász, W. T. Tutte, and Hassler Whitney.
Comparable coverage of the material presented in this chapter is contained in Chapters
6, 8, and 9 of C. L. Liu [23]. More advanced work is found in the works by J. A. Bondy
and U. S. R. Murty [10], N. Hartsfield and G. Ringel [20], and D. B. West [32]. The book
by F. Buckley and F. Harary [11] revises the classic work of F. Harary [18] and brings the
reader up to date on the topics covered in the original 1969 work. The text by G. Chartrand
and L. Lesniak [12] provides a more algorithmic approach in its presentation. A proof of
[Photographs: William Rowan Hamilton (1805-1865), reproduced courtesy of The Granger Collection, New York; Paul Erdős (1913-1996), reproduced courtesy of Christopher Barker.]
Kuratowski’s Theorem appears in Chapter 8 of C. L. Liu [23] and Chapter 6 of D. B. West
[32]. The article by G. Chartrand and R. J. Wilson [13] develops many important concepts in
graph theory by focusing on one particular graph — the Petersen graph. This graph (which
we mentioned in Section 11.4) is named for the Danish mathematician Julius Peter Christian
Petersen (1839-1910), who discussed the graph in a paper in 1898.
Applications of graph theory in electrical networks can be found in S. Seshu and M. B.
Reed [30]. In the text by N. Deo [14], applications in coding theory, electrical networks, op-
erations research, computer programming, and chemistry occupy Chapters 12—15. The text
by F. S. Roberts [26] applies the methods of graph theory to the social sciences. Applications
of graph theory in chemistry are given in the article by D. H. Rouvray [29].
More on chromatic polynomials can be found in the survey article by R. C. Read [25].
The role of Pólya’s theory† in graphical enumeration is examined in Chapter 10 of N. Deo
[14]. A thorough coverage of this topic is found in the text by F. Harary and E. M. Palmer
[19].
Additional coverage on the historical development of graph theory is given in N. Biggs,
E. K. Lloyd, and R. J. Wilson [9].
Many applications in graph theory involve large graphs that require the computationally
intensive talents of a computer in conjunction with the ingenuity of mathematical methods.
Chapter 11 of N. Deo [14] presents computer algorithms dealing with several of the graph-
theoretic properties we have studied here. Along the same line, the text by A. V. Aho, J. E.
Hopcroft, and J. D. Ullman [1] provides even more for the reader interested in computer
science.
As mentioned at the end of Section 11.5, the traveling salesman problem is closely related
to the search for a Hamilton cycle in a graph. This is a graph-theoretic problem of interest
in both operations research and computer science. The article by M. Bellmore and G. L.
†We shall introduce the basic ideas behind this method of enumeration in Chapter 16.
Nemhauser [8] provides a good introductory survey of results on this problem. The text
by R. Bellman, K. L. Cooke, and J. A. Lockett [7] includes an algorithmic treatment of
this problem along with other graph problems. A number of heuristics for obtaining an
approximate solution to the problem are given in Chapter 4 of the text by L. R. Foulds [17].
The text edited by E. L. Lawler, J. K. Lenstra, A. H. G. Rinnooy Kan, and D. B. Shmoys
[22] contains 12 papers dealing with various aspects of this problem, including historical
considerations as well as some results on computational complexity. Applications, where a
robot visits different locations in an automated warehouse in order to fill a given order, are
examined in the articles by E. A. Elsayed [15] and by E. A. Elsayed and R. G. Stern [16].
The solution of the four-color problem can be examined further by starting with the
paper by K. Appel and W. Haken [3]. The problem, together with its history and solution, is
examined in the text by D. Barnette [6] and in the Scientific American article by K. Appel
and W. Haken [4]. The proof uses a computer analysis to handle a large number of cases; the
article by T. Tymoczko [31] examines the role of such techniques in pure mathematics. In
[5] K. Appel and W. Haken further examine their proof in the light of the computer analysis
that was used. The articles by N. Robertson, D. P. Sanders, P. D. Seymour, and R. Thomas
[27, 28] provide a simplified proof. In 1997 their computer code was made available on the
Internet. With this code the four-color theorem could be verified on a desktop workstation
in roughly three hours.
Finally, the article by A. Ralston [24] demonstrates some of the connections among
coding theory, combinatorics, graph theory, and computer science.
REFERENCES
1. Aho, Alfred V., Hopcroft, John E., and Ullman, Jeffrey D. Data Structures and Algorithms.
Reading, Mass.: Addison-Wesley, 1983.
2. Ahuja, Ravindra K., Magnanti, Thomas L., Orlin, James B., and Reddy, M. R. “Applications
of Network Optimization.” In M. O. Ball, Thomas L. Magnanti, C. L. Monma, and G. L.
Nemhauser, eds., Handbooks in Operations Research and Management Science, Vol. 7, Net-
work Models. Amsterdam, Holland: Elsevier, 1995, pp. 1-83.
3. Appel, Kenneth, and Haken, Wolfgang. “Every Planar Map Is Four Colorable.” Bulletin of the
American Mathematical Society 82 (1976): pp. 711-712.
4. Appel, Kenneth, and Haken, Wolfgang. “The Solution of the Four-Color-Map Problem.”
Scientific American 237 (October 1977): pp. 108-121.
5. Appel, Kenneth, and Haken, Wolfgang. “The Four Color Proof Suffices.” Mathematical
Intelligencer 8, no. 1 (1986): pp. 10-20.
6. Barnette, David. Map Coloring, Polyhedra, and the Four-Color Problem. Washington, D.C.:
The Mathematical Association of America, 1983.
7. Bellman, R., Cooke, K. L., and Lockett, J. A. Algorithms, Graphs, and Computers. New York:
Academic Press, 1970.
8. Bellmore, M., and Nemhauser, G. L. “The Traveling Salesman Problem: A Survey.” Operations
Research 16 (1968): pp. 538-558.
9. Biggs, N., Lloyd, E. K., and Wilson, R. J. Graph Theory (1736-1936). Oxford, England:
Clarendon Press, 1976.
10. Bondy, J. A., and Murty, U.S. R. Graph Theory with Applications. New York: Elsevier North-
Holland, 1976.
11. Buckley, Fred, and Harary, Frank. Distance in Graphs. Reading, Mass.: Addison-Wesley, 1990.
12. Chartrand, Gary, and Lesniak, Linda. Graphs and Digraphs, 3rd ed. Boca Raton, Fla.: CRC
Press, 1996.
13. Chartrand, Gary, and Wilson, Robin J. “The Petersen Graph.” In Frank Harary and John S.
Maybee, eds., Graphs and Applications. New York: Wiley, 1985.
14. Deo, Narsingh. Graph Theory with Applications to Engineering and Computer Science. En-
glewood Cliffs, N. J.: Prentice-Hall, 1974.
15. Elsayed, E. A. “Algorithms for Optimal Material Handling in Automatic Warehousing
Systems.” Int. J. Prod. Res. 19 (1981): pp. 525-535.
16. Elsayed, E. A., and Stern, R. G. “Computerized Algorithms for Order Processing in Automated
Warehousing Systems.” Int. J. Prod. Res. 21 (1983): pp. 579-586.
17. Foulds, L. R. Combinatorial Optimization for Undergraduates. New York: Springer-Verlag,
1984.
18. Harary, Frank. Graph Theory. Reading, Mass.: Addison-Wesley, 1969.
19. Harary, Frank, and Palmer, Edgar M. Graphical Enumeration. New York: Academic Press,
1973.
20. Hartsfield, Nora, and Ringel, Gerhard. Pearls in Graph Theory: A Comprehensive Introduction.
Boston, Mass.: Harcourt/Academic Press, 1994.
21. Jünger, M., Reinelt, G., and Rinaldi, G. “The Traveling Salesman Problem.” In M. O. Ball,
Thomas L. Magnanti, C. L. Monma, and G. L. Nemhauser, eds., Handbooks in Operations
Research and Management Science, Vol. 7, Network Models. Amsterdam, Holland: Elsevier,
1995, pp. 225-330.
22. Lawler, E. L., Lenstra, J. K., Rinnooy Kan, A. H. G., and Shmoys, D. B., eds. The Traveling
Salesman Problem. New York: Wiley, 1986.
23. Liu, C. L. Introduction to Combinatorial Mathematics. New York: McGraw-Hill, 1968.
24. Ralston, Anthony. “De Bruijn Sequences —A Model Example of the Interaction of Discrete
Mathematics and Computer Science.” Mathematics Magazine 55, no. 3 (May 1982): pp. 131-
143.
25. Read, R. C. “An Introduction to Chromatic Polynomials.” Journal of Combinatorial Theory 4
(1968): pp. 52-71.
26. Roberts, Fred S. Discrete Mathematical Models. Englewood Cliffs, N. J.: Prentice-Hall, 1976.
27. Robertson, N., Sanders, D. P., Seymour, P. D., and Thomas, R. “Efficiently Four-coloring
Planar Graphs.” Proceedings of the 28th ACM Symposium on the Theory of Computation.
ACM Press (1996): pp. 571-575.
28. Robertson, N., Sanders, D. P., Seymour, P. D., and Thomas, R. “The Four-color Theorem.”
Journal of Combinatorial Theory Series B 70 (1997): pp. 166-183.
29. Rouvray, Dennis H. “Predicting Chemistry from Topology.” Scientific American 255, no. 3
(September 1986): pp. 40-47.
30. Seshu, S., and Reed, M. B. Linear Graphs and Electrical Networks. Reading, Mass.: Addison-
Wesley, 1961.
31. Tymoczko, Thomas. “Computers, Proofs and Mathematicians: A Philosophical Investigation
of the Four-Color Proof.” Mathematics Magazine 53, no. 3 (May 1980): pp. 131-138.
32. West, Douglas B. Introduction to Graph Theory, 2nd ed. Upper Saddle River, N.J.: Prentice-
Hall, 2001.
SUPPLEMENTARY EXERCISES
1. Let G be a loop-free undirected graph on n vertices. If G has 56 edges and $\overline{G}$ has 80 edges, what is n?
2. Determine the number of cycles of length 4 in the hypercube $Q_n$.
3. a) If the edges of $K_6$ are painted either red or blue, prove that there is a red triangle or a blue triangle that is a subgraph.
b) Prove that in any group of six people there must be three who are total strangers to one another or three who are mutual friends.
4. a) Let G = (V, E) be a loop-free undirected graph. Recall that G is called self-complementary if G and $\overline{G}$ are isomorphic. If G is self-complementary (i) determine |E| if |V| = n; (ii) prove that G is connected.
b) Let $n \in \mathbf{Z}^+$, where n = 4k ($k \in \mathbf{Z}^+$) or n = 4k + 1 ($k \in \mathbf{N}$). Prove that there exists a self-complementary graph G = (V, E), where |V| = n.
5. a) Show that the graphs $G_1$ and $G_2$, in Fig. 11.95, are isomorphic.
b) How many different isomorphisms $f: G_1 \to G_2$ are possible here?
[Figure 11.95: the graph $G_1$ on the vertices u, v, w, x, y, z and the graph $G_2$ on the vertices 1, 2, 3, 4, 5, 6]
6. Are any of the planar graphs for the five Platonic solids bipartite?
7. a) How many paths of length 5 are there in the complete bipartite graph $K_{3,7}$? (Remember that a path such as $v_1 \to v_2 \to v_3 \to v_4 \to v_5 \to v_6$ is considered to be the same as the path $v_6 \to v_5 \to v_4 \to v_3 \to v_2 \to v_1$.)
b) How many paths of length 4 are there in $K_{3,7}$?
c) Let $m, n, p \in \mathbf{Z}^+$ with $2m \le n$ and $1 \le p \le 2m$. How many paths of length p are there in the complete bipartite graph $K_{m,n}$?
8. Let X = {1, 2, 3, ..., n}, where $n \ge 2$. Construct the loop-free undirected graph G = (V, E) as follows:
• (V): Each two-element subset of X determines a vertex of G.
• (E): If $v_1, v_2 \in V$ correspond to subsets {a, b} and {c, d}, respectively, of X, draw the edge $\{v_1, v_2\}$ in G when $\{a, b\} \cap \{c, d\} = \emptyset$.
a) Show that G is an isolated vertex when n = 2 and that G is disconnected for n = 3, 4.
b) Show that for $n \ge 5$, G is connected. (In fact, for all $v_1, v_2 \in V$, either $\{v_1, v_2\} \in E$ or there is a path of length 2 connecting $v_1$ and $v_2$.)
c) Prove that G is nonplanar for $n \ge 5$.
d) Prove that for $n \ge 8$, G has a Hamilton cycle.
9. If G = (V, E) is an undirected graph, a subset K of V is called a covering of G if for every edge {a, b} of G either a or b is in K. The set K is a minimal covering if K − {x} fails to cover G for each $x \in K$. The number of vertices in a smallest covering is called the covering number of G.
a) Prove that if $I \subseteq V$, then I is an independent set in G if and only if V − I is a covering of G.
b) Verify that |V| is the sum of the independence number of G (as defined in Exercise 25 for Section 11.5) and its covering number.
10. If G = (V, E) is an undirected graph, a subset D of V is called a dominating set if for all $v \in V$, either $v \in D$ or v is adjacent to a vertex in D. If D is a dominating set and no proper subset of D has this property, then D is called minimal. The size of any smallest dominating set in G is denoted by $\gamma(G)$ and is called the domination number of G.
a) If G has no isolated vertices, prove that if D is a minimal dominating set, then V − D is a dominating set.
b) If $I \subseteq V$ is independent, prove that I is a dominating set if and only if I is maximal independent.
c) Show that $\gamma(G) \le \beta(G)$, and that $|V| \le \beta(G)\chi(G)$. [Here $\beta(G)$ is the independence number of G, first given in Exercise 25 of Section 11.5.]
11. Let G = (V, E) be the undirected connected “ladder graph” shown in Fig. 11.94. For $n \ge 0$, let $a_n$ denote the number of ways one can select n of the edges in G so that no two edges share a common vertex. Find and solve a recurrence relation for $a_n$.
12. Consider the four comb graphs in parts (i), (ii), (iii), and (iv) of Fig. 11.96. These graphs have 1 tooth, 2 teeth, 3 teeth, and n teeth, respectively. For $n \ge 1$, let $a_n$ count the number of independent subsets in $\{x_1, x_2, \ldots, x_n, y_1, y_2, \ldots, y_n\}$. Find and solve a recurrence relation for $a_n$.
[Figure 11.96: comb graphs (i), (ii), (iii), and (iv) on the vertices $x_1, \ldots, x_n$ and $y_1, \ldots, y_n$]
13. Consider the four graphs in parts (i), (ii), (iii), and (iv) of Fig. 11.97. If $a_n$ counts the number of independent subsets of $\{x_1, x_2, \ldots, x_n, y_1, y_2, \ldots, y_n\}$, where $n \ge 1$, find and solve a recurrence relation for $a_n$.
[Figure 11.97: graphs (i), (ii), (iii), and (iv) on the vertices $x_1, \ldots, x_n$ and $y_1, \ldots, y_n$]
14. For $n \ge 1$, let $a_n = \binom{n}{2}$, the number of edges in $K_n$, and let $a_0 = 0$. Find the generating function $f(x) = \sum_{n=0}^{\infty} a_n x^n$.
15. For the graph G in Fig. 11.98, answer the following questions.
a) What are $\gamma(G)$, $\beta(G)$, and $\chi(G)$?
b) Does G have an Euler circuit or a Hamilton cycle?
c) Is G bipartite? Is it planar?
[Figure 11.98]
16. a) Suppose that the complete bipartite graph $K_{m,n}$ contains 16 edges and satisfies $m \le n$. Determine m, n so that $K_{m,n}$ possesses (i) an Euler circuit but not a Hamilton cycle; (ii) both a Hamilton cycle and an Euler circuit.
b) Generalize the results of part (a).
17. If G = (V, E) is an undirected graph, any subgraph of G that is a complete graph is called a clique in G. The number of vertices in a largest clique in G is called the clique number for G and is denoted by $\omega(G)$.
a) How are $\chi(G)$ and $\omega(G)$ related?
b) Is there any relationship between $\omega(G)$ and $\beta(G)$?
18. If G = (V, E) is an undirected loop-free graph, the line graph of G, denoted L(G), is a graph with the set E as vertices, where we join two vertices $e_1$, $e_2$ in L(G) if and only if $e_1$, $e_2$ are adjacent edges in G.
a) Find L(G) for each of the graphs in Fig. 11.99.
b) Assuming that |V| = n and |E| = e, show that L(G) has e vertices and
$$\frac{1}{2} \sum_{v \in V} \deg(v)[\deg(v) - 1] = \left[\frac{1}{2} \sum_{v \in V} [\deg(v)]^2\right] - e = \sum_{v \in V} \binom{\deg(v)}{2}$$
edges.
[Figure 11.99: graphs (a) and (b)]
c) Prove that if G has an Euler circuit, then L(G) has both an Euler circuit and a Hamilton cycle.
d) If $G = K_4$, examine L(G) to show that the converse of part (c) is false.
e) Prove that if G has a Hamilton cycle, then so does L(G).
f) Examine L(G) for the graph in Fig. 11.99(b) to show that the converse of part (e) is false.
g) Verify that L(G) is nonplanar for $G = K_5$ and $G = K_{3,3}$.
h) Give an example of a graph G, where G is planar but L(G) is not.
19. Explain why each of the following polynomials in $\lambda$ cannot be a chromatic polynomial.
a) $\lambda^4 - 5\lambda^3 + 7\lambda^2 - 6\lambda + 3$
b) $3\lambda^3 - 4\lambda^2 + \lambda$
c) $\lambda^4 - 3\lambda^3 + 5\lambda^2 - 4\lambda$
20. a) For all $x, y \in \mathbf{Z}^+$, prove that $x^3y - xy^3$ is even.
b) Let V = {1, 2, 3, ..., 8, 9}. Construct the loop-free undirected graph G = (V, E) as follows: For $m, n \in V$, $m \ne n$, draw the edge {m, n} in G if 5 divides m + n or m − n.
c) Given any three distinct positive integers, prove that there are two of these, say x and y, where 10 divides $x^3y - xy^3$.
21. a) For $n \ge 1$, let $P_{n-1}$ denote the path made up of n vertices and n − 1 edges. Let $a_n$ be the number of independent subsets of vertices in $P_{n-1}$. (The empty subset is considered one of these independent subsets.) Find and solve a recurrence relation for $a_n$.
b) Determine the number of independent subsets (of vertices) in each of the graphs $G_1$, $G_2$, and $G_3$, of Fig. 11.100.
c) For each of the graphs $H_1$, $H_2$, and $H_3$, of Fig. 11.101, find the number of independent subsets of vertices.
d) Let G = (V, E) be a loop-free undirected graph with $V = \{v_1, v_2, \ldots, v_r\}$ and where there are m independent subsets of vertices. The graph G′ = (V′, E′) is constructed from G as follows: $V' = V \cup \{x_1, x_2, \ldots, x_s\}$, with no $x_i$ in V, for all $1 \le i \le s$; and $E' = E \cup \{\{x_i, v_j\} \mid 1 \le i \le s,\ 1 \le j \le r\}$. How many subsets of V′ are independent?
[Figure 11.100: the graphs $G_1$, $G_2$, and $G_3$]
[Figure 11.101: the graphs $H_1$, $H_2$, and $H_3$]
22. Suppose that G = (V, E) is a loop-free undirected graph. If G is 5-regular and |V| = 10, prove that G is nonplanar.
12
Trees
Continuing our study of graph theory, we shall now focus on a special type of graph called
a tree. First used in 1847 by Gustav Kirchhoff (1824-1887) in his work on electrical
networks, trees were later redeveloped and named by Arthur Cayley (1821-1895). In 1857
Cayley used these special graphs in order to enumerate the different isomers of the saturated
hydrocarbons $C_nH_{2n+2}$, $n \in \mathbf{Z}^+$.
With the advent of digital computers, many new applications were found for trees. Special
types of trees are prominent in the study of data structures, sorting, and coding theory, and
in the solution of certain optimization problems.
12.1
Definitions, Properties, and Examples
Definition 12.1 Let G = (V, E) be a loop-free undirected graph. The graph G is called a tree† if G is
connected and contains no cycles.
In Fig. 12.1 the graph $G_1$ is a tree, but the graph $G_2$ is not a tree because it contains the
cycle {a, b}, {b, c}, {c, a}. The graph $G_3$ is not connected, so it cannot be a tree. However,
each component of $G_3$ is a tree, and in this case we call $G_3$ a forest.
[Figure 12.1: the graphs $G_1$, $G_2$, and $G_3$, each drawn on the vertices a, b, c, d, e, f]
†As in the case of graphs, the terminology in the study of trees is not standard and the reader may find some
differences from one textbook to another.
When a graph is a tree we write T instead of G to emphasize this structure.
In Fig. 12.1 we see that $G_1$ is a subgraph of $G_2$ where $G_1$ contains all the vertices of $G_2$
and $G_1$ is a tree. In this situation $G_1$ is a spanning tree for $G_2$. Hence a spanning tree for
a connected graph is a spanning subgraph that is also a tree. We may think of a spanning
tree as providing minimal connectivity for the graph and as a minimal skeletal framework
holding the vertices together. The graph $G_3$ provides a spanning forest for the graph $G_2$.
We now examine some properties of trees.
THEOREM 12.1 If a, b are distinct vertices in a tree T = (V, E), then there is a unique path that connects
these vertices.
Proof: Since T is connected, there is at least one path in T that connects a and b. If there
were more, then from two such paths some of the edges would form a cycle. But T has no
cycles.
THEOREM 12.2 If G = (V, E) is an undirected graph, then G is connected if and only if G has a spanning
tree.
Proof: If G has a spanning tree T, then for every pair a, b of distinct vertices in V a subset of
the edges in T provides a (unique) path between a and b, and so G is connected. Conversely,
if G is connected and G is not a tree, remove all loops from G. If the resulting subgraph $G_1$
is not a tree, then $G_1$ must contain a cycle $C_1$. Remove an edge $e_1$ from $C_1$ and let $G_2 =
G_1 - e_1$. Removing an edge of a cycle cannot disconnect a graph, because the rest of the
cycle still joins the endpoints of that edge. Hence if $G_2$ contains no cycles, then $G_2$ is a
spanning tree for G because $G_2$ contains all the vertices in G, is loop-free, and is connected.
If $G_2$ does contain a cycle, say $C_2$, then remove an edge $e_2$ from $C_2$ and consider the
subgraph $G_3 = G_2 - e_2 = G_1 - \{e_1, e_2\}$.
Once again, if $G_3$ contains no cycles, then we have a spanning tree for G. Otherwise we
continue this procedure a finite number of additional times until we arrive at a spanning
subgraph of G that is loop-free and connected and contains no cycles (and, consequently,
is a spanning tree for G).
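The deletion procedure in this proof can be tried out on small graphs. The following Python sketch is only an illustration under stated assumptions: the function names `is_connected` and `spanning_tree_by_deletion` and the sample graph are hypothetical and do not come from the text. It uses the fact that an edge lies on a cycle of a connected graph exactly when its removal leaves the graph connected, so it deletes every such edge in turn.

```python
from collections import deque

def is_connected(vertices, edges):
    """Breadth-first search: can every vertex be reached from the first one?"""
    vertices = list(vertices)
    if not vertices:
        return True
    adjacent = {v: set() for v in vertices}
    for edge in edges:
        a, b = tuple(edge)
        adjacent[a].add(b)
        adjacent[b].add(a)
    seen = {vertices[0]}
    queue = deque([vertices[0]])
    while queue:
        u = queue.popleft()
        for w in adjacent[u] - seen:
            seen.add(w)
            queue.append(w)
    return len(seen) == len(vertices)

def spanning_tree_by_deletion(vertices, edges):
    """Delete cycle edges one at a time, as in the proof of Theorem 12.2."""
    # Remove all loops; repeated edges collapse because a set is used.
    current = {frozenset(e) for e in edges if e[0] != e[1]}
    if not is_connected(vertices, current):
        raise ValueError("G is not connected, so it has no spanning tree")
    for e in sorted(current, key=sorted):
        # e lies on a cycle exactly when the graph stays connected without it.
        if is_connected(vertices, current - {e}):
            current = current - {e}
    return current

# Hypothetical example: a 5-cycle with one chord and one loop.
vertices = [1, 2, 3, 4, 5]
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1), (1, 3), (2, 2)]
tree = spanning_tree_by_deletion(vertices, edges)
print(sorted(sorted(e) for e in tree))
```

Every edge that survives the loop was a bridge when it was examined and remains one afterward, so the result is connected and has no cycles. Which spanning tree is produced depends on the order in which the edges are examined.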
Figure 12.2 shows three nonisomorphic trees that exist for five vertices. Although they
are not isomorphic, they all have the same number of edges, namely, four. This leads us to
the following general result.
[Figure 12.2: the three nonisomorphic trees $T_1$, $T_2$, and $T_3$ on five vertices]
THEOREM 12.3 In every tree T = (V, E), |V| = |E| + 1.
Proof: The proof is obtained by applying the alternative form of the Principle of Mathematical
Induction to |E|. If |E| = 0, then the tree consists of a single isolated vertex, as in
Fig. 12.3(a). Here |V| = 1 = |E| + 1. Parts (b) and (c) of the figure verify the result for the
cases where |E| = 1 or 2.
[Figure 12.3, parts (a), (b), and (c)] [Figure 12.4]
Assume the theorem is true for every tree that contains at most k edges, where $k \ge 0$.
Now consider a tree T = (V, E), as in Fig. 12.4, where |E| = k + 1. [The dotted edge(s)
indicates that some of the tree doesn’t appear in the figure.] If, for instance, the edge with
endpoints y, z is removed from T, we obtain two subtrees, $T_1 = (V_1, E_1)$ and $T_2 = (V_2, E_2)$,
where $|V| = |V_1| + |V_2|$ and $|E_1| + |E_2| + 1 = |E|$. (One of these subtrees could consist
of just a single vertex if, for example, the edge with endpoints w, x were removed.)
Since $0 \le |E_1| \le k$ and $0 \le |E_2| \le k$, it follows, by the induction hypothesis, that
$|E_i| + 1 = |V_i|$, for i = 1, 2. Consequently, $|V| = |V_1| + |V_2| = (|E_1| + 1) + (|E_2| + 1) =
(|E_1| + |E_2| + 1) + 1 = |E| + 1$, and the theorem follows by the alternative form of the
Principle of Mathematical Induction.
As we examine the trees in Fig. 12.2 we also see that each tree has at least two pendant
vertices, that is, vertices of degree 1. This is also true in general.
THEOREM 12.4 For every tree T = (V, E), if $|V| \ge 2$, then T has at least two pendant vertices.
Proof: Let $|V| = n \ge 2$. From Theorem 12.3 we know that |E| = n − 1, so by Theorem 11.2
it follows that $2(n - 1) = 2|E| = \sum_{v \in V} \deg(v)$. Since T is connected, we have $\deg(v) \ge 1$
for all $v \in V$. If there are k pendant vertices in T, then each of the other n − k vertices has
degree at least 2 and
$$2(n - 1) = 2|E| = \sum_{v \in V} \deg(v) \ge k + 2(n - k).$$
From this we see that $[2(n - 1) \ge k + 2(n - k)] \Rightarrow [(2n - 2) \ge (k + 2n - 2k)] \Rightarrow
[-2 \ge -k] \Rightarrow [k \ge 2]$, and the result is consequently established.
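Theorems 12.3 and 12.4 can be spot-checked on two familiar families of trees. The sketch below is illustrative only: the helper names (`degree_table`, `pendant_vertices`, `path_tree`, `star_tree`) are hypothetical, and checking a few small cases is of course no substitute for the proofs above.

```python
def degree_table(vertices, edges):
    """Return a dict that maps each vertex to its degree."""
    degree = {v: 0 for v in vertices}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree

def pendant_vertices(vertices, edges):
    """List the vertices of degree 1."""
    degree = degree_table(vertices, edges)
    return [v for v in vertices if degree[v] == 1]

def path_tree(n):
    """The path on the vertices 1, 2, ..., n."""
    return list(range(1, n + 1)), [(i, i + 1) for i in range(1, n)]

def star_tree(n):
    """The star with center 1 and the other n - 1 vertices as leaves."""
    return list(range(1, n + 1)), [(1, i) for i in range(2, n + 1)]

for n in range(2, 8):
    for vertices, edges in (path_tree(n), star_tree(n)):
        k = len(pendant_vertices(vertices, edges))
        assert len(vertices) == len(edges) + 1                              # Theorem 12.3
        assert sum(degree_table(vertices, edges).values()) == 2 * (n - 1)  # Theorem 11.2
        assert k >= 2                                                       # Theorem 12.4
```

The path attains the lower bound of exactly two pendant vertices, while the star on $n \ge 3$ vertices has n − 1 of them.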
EXAMPLE 12.1 In Fig. 12.5 we have two trees, each with 14 vertices (labeled with C’s and H’s) and 13
edges. Each vertex has degree 4 (C, carbon atom) or degree 1 (H, hydrogen atom). Part (b) of
the figure has a carbon atom (C) at the center of the tree. This carbon atom is adjacent to four
vertices, three of which have degree 4. There is no vertex (C atom) in part (a) that possesses
this property, so the two trees are not isomorphic. They serve as models for the two chemical
isomers that correspond with the saturated† hydrocarbon $C_4H_{10}$. Part (a) represents n-butane
(formerly called butane); part (b) represents 2-methyl propane (formerly called isobutane).
H
|
H—C-—-—H H
| |
H—-C—H H H—-C—H H
| | | |
H—-C—H H—C C C—H
| | | |
H—C—H H H H
|
H
(a) (b)
Figure 12.5
A second result from chemistry is given in the following example.
EXAMPLE 12.2 If a saturated hydrocarbon [in particular, an acyclic (no cycles), single-bond
hydrocarbon, called an alkane] has n carbon atoms, show that it has 2n + 2 hydrogen atoms.
Considering the saturated hydrocarbon as a tree T = (V, E), let k equal the number of
pendant vertices, or hydrogen atoms, in the tree. Then with a total of n + k vertices, where
each of the n carbon atoms has degree 4, we find that
$$4n + k = \sum_{v \in V} \deg(v) = 2|E| = 2(|V| - 1) = 2(n + k - 1),$$
and
$$4n + k = 2(n + k - 1) \Rightarrow k = 2n + 2.$$
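The count k = 2n + 2 can also be seen atom by atom: a carbon atom that is bonded to d other carbon atoms needs 4 − d hydrogen atoms to reach degree 4. The short sketch below applies this to the two carbon skeletons of Example 12.1. The function name `hydrogen_count` and the labeling of the carbon atoms by 0, 1, ..., n − 1 are assumptions made for the illustration, and the function assumes (without checking connectivity) that the bonds given form a tree.

```python
def hydrogen_count(n_carbons, carbon_bonds):
    """Hydrogen atoms needed to saturate an acyclic carbon skeleton.

    n_carbons    -- number of carbon atoms, labeled 0 .. n_carbons - 1
    carbon_bonds -- list of pairs (i, j), one per carbon-carbon single bond
    """
    if len(carbon_bonds) != n_carbons - 1:
        raise ValueError("a tree on n carbon atoms has exactly n - 1 carbon-carbon bonds")
    degree = [0] * n_carbons
    for i, j in carbon_bonds:
        degree[i] += 1
        degree[j] += 1
    if max(degree) > 4:
        raise ValueError("a carbon atom cannot have degree greater than 4")
    # Each carbon atom gets enough hydrogen atoms to bring its degree up to 4.
    return sum(4 - d for d in degree)

n_butane = [(0, 1), (1, 2), (2, 3)]    # carbon atoms in a path, as in Fig. 12.5(a)
isobutane = [(0, 1), (0, 2), (0, 3)]   # one central carbon atom, as in Fig. 12.5(b)
assert hydrogen_count(4, n_butane) == hydrogen_count(4, isobutane) == 2 * 4 + 2
```

Summing 4 − d over the n carbon atoms gives 4n − 2(n − 1) = 2n + 2 for every skeleton that is a tree, which agrees with the result of the example.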
We close this section with a theorem that provides several different ways to characterize
trees.
THEOREM 12.5 The following statements are equivalent for a loop-free undirected graph G = (V, E).
a) G is a tree.
b) G is connected, but the removal of any edge from G disconnects G into two subgraphs
that are trees.
c) G contains no cycles, and |V| = |E| + 1.
d) G is connected, and |V| = |E| + 1.
†The adjective saturated is used here to indicate that, for the number of carbon atoms present
in the molecule, we have the maximum number of hydrogen atoms.
e) G contains no cycles, and if $a, b \in V$ with $\{a, b\} \notin E$, then the graph obtained by
adding edge {a, b} to G has precisely one cycle.
Proof: We shall prove that (a) ⇒ (b), (b) ⇒ (c), and (c) ⇒ (d), leaving to the reader the
proofs for (d) ⇒ (e) and (e) ⇒ (a).
[(a) ⇒ (b)]: If G is a tree, then G is connected. So let e = {a, b} be any edge of G.
Then if G − e is connected, there are at least two paths in G from a to b. But this
contradicts Theorem 12.1. Hence G − e is disconnected and so the vertices in G − e
may be partitioned into two subsets: (1) vertex a and those vertices that can be reached
from a by a path in G − e; and (2) vertex b and those vertices that can be reached from
b by a path in G − e. These two connected components are trees because a loop or cycle
in either component would also be in G.
[(b) ⇒ (c)]: If G contains a cycle, then let e = {a, b} be an edge of the cycle. But
then G − e is connected, contradicting the hypothesis in part (b). So G contains no
cycles, and since G is a loop-free connected undirected graph, we know that G is a tree.
Consequently, it follows from Theorem 12.3 that |V| = |E| + 1.
[(c) ⇒ (d)]: Let $\kappa(G) = r$ and let $G_1, G_2, \ldots, G_r$ be the components of G. For
$1 \le i \le r$, select a vertex $v_i \in G_i$ and add the r − 1 edges $\{v_1, v_2\}, \{v_2, v_3\}, \ldots, \{v_{r-1}, v_r\}$
to G to form the graph G′ = (V, E′), which is a tree. Since G′ is a tree, we know that
|V| = |E′| + 1 because of Theorem 12.3. But from part (c), |V| = |E| + 1, so |E| = |E′|
and r − 1 = 0. With r = 1, it follows that G is connected.
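For small graphs the equivalence of statements (a), (c), and (d) can be confirmed by exhaustive search. The following Python sketch is an illustration under stated assumptions: the function names `analyze` and `count_labeled_trees` are hypothetical, the vertices are taken to be 0, 1, ..., n − 1, and a union-find structure (a technique not used in the text) is swapped in to detect cycles and count components.

```python
from itertools import combinations

def analyze(n, edges):
    """Return (number of components, whether a cycle exists) for a loop-free graph."""
    parent = list(range(n))

    def find(x):
        # Follow parent links to the representative of x's component.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components, cycle = n, False
    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra == rb:
            cycle = True          # a and b were already joined: this edge closes a cycle
        else:
            parent[ra] = rb       # merge the two components
            components -= 1
    return components, cycle

def count_labeled_trees(n):
    """Check (a), (c), (d) of Theorem 12.5 on every graph with vertex set {0, ..., n-1}."""
    possible_edges = list(combinations(range(n), 2))
    trees = 0
    for m in range(len(possible_edges) + 1):
        for edges in combinations(possible_edges, m):
            components, cycle = analyze(n, edges)
            a = components == 1 and not cycle      # (a): connected with no cycles
            c = not cycle and n == m + 1           # (c): no cycles and |V| = |E| + 1
            d = components == 1 and n == m + 1     # (d): connected and |V| = |E| + 1
            assert a == c == d
            trees += a
    return trees

for n in range(2, 6):
    print(n, count_labeled_trees(n))
```

For n = 2, 3, 4, 5 the counts agree with $n^{n-2}$, the number of labeled trees on n vertices, which is the number of sequences that appears in Exercise 21 of this section.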
EXERCISES 12.1
1. a) Draw the graphs of all nonisomorphic trees on six vertices.
b) How many isomers does hexane ($C_6H_{14}$) have?
2. Let $T_1 = (V_1, E_1)$, $T_2 = (V_2, E_2)$ be two trees where $|E_1| = 17$ and $|V_2| = 2|V_1|$. Determine $|V_1|$, $|V_2|$, and $|E_2|$.
3. a) Let $F_1 = (V_1, E_1)$ be a forest of seven trees where $|E_1| = 40$. What is $|V_1|$?
b) If $F_2 = (V_2, E_2)$ is a forest with $|V_2| = 62$ and $|E_2| = 51$, how many trees determine $F_2$?
4. If G = (V, E) is a forest with |V| = v, |E| = e, and $\kappa$ components (trees), what relationship exists among v, e, and $\kappa$?
5. What kind of trees have exactly two pendant vertices?
6. a) Verify that all trees are planar.
b) Derive Theorem 12.3 from part (a) and Euler’s Theorem for planar graphs.
7. Give an example of an undirected graph G = (V, E) where |V| = |E| + 1 but G is not a tree.
8. a) If a tree has four vertices of degree 2, one vertex of degree 3, two of degree 4, and one of degree 5, how many pendant vertices does it have?
b) If a tree T = (V, E) has $v_2$ vertices of degree 2, $v_3$ vertices of degree 3, ..., and $v_m$ vertices of degree m, what are |V| and |E|?
9. If G = (V, E) is a loop-free undirected graph, prove that G is a tree if there is a unique path between any two vertices of G.
10. The connected undirected graph G = (V, E) has 30 edges. What is the maximum value that |V| can have?
11. Let T = (V, E) be a tree with $|V| = n \ge 2$. How many distinct paths are there (as subgraphs) in T?
12. Let G = (V, E) be a loop-free connected undirected graph where $V = \{v_1, v_2, v_3, \ldots, v_n\}$, $n \ge 2$, $\deg(v_1) = 1$, and $\deg(v_i) \ge 2$ for $2 \le i \le n$. Prove that G must have a cycle.
13. Find two nonisomorphic spanning trees for the complete bipartite graph $K_{2,3}$. How many nonisomorphic spanning trees are there for $K_{2,3}$?
14. For $n \in \mathbf{Z}^+$, how many nonisomorphic spanning trees are there for $K_{2,n}$?
15. Determine the number of nonidentical (though some may be isomorphic) spanning trees that exist for each of the graphs shown in Fig. 12.6.
16. For each graph in Fig. 12.7, determine how many nonidentical (though some may be isomorphic) spanning trees exist.
[Figure 12.6: graphs (1), (2), and (3)]
[Figure 12.7]
17. Let T = (V, E) be a tree where |V| = n. Suppose that for each $v \in V$, $\deg(v) = 1$ or $\deg(v) \ge m$, where m is a fixed positive integer and $m \ge 2$.
a) What is the smallest value possible for n?
b) Prove that T has at least m pendant vertices.
18. Suppose that T = (V, E) is a tree with |V| = 1000. What is the sum of the degrees of all the vertices in T?
19. Let G = (V, E) be a loop-free connected undirected graph. Let H be a subgraph of G. The complement of H in G is the subgraph of G made up of those edges in G that are not in H (along with the vertices incident to these edges).
a) If T is a spanning tree of G, prove that the complement of T in G does not contain a cut-set of G.
b) If C is a cut-set of G, prove that the complement of C in G does not contain a spanning tree of G.
20. Complete the proof of Theorem 12.5.
21. A labeled tree is one wherein the vertices are labeled. If the tree has n vertices, then {1, 2, 3, ..., n} is used as the set of labels. We find that two trees that are isomorphic without labels may become nonisomorphic when labeled. In Fig. 12.8, the first two trees are isomorphic as labeled trees. The third tree is isomorphic to the other two if we ignore the labels; as a labeled tree, however, it is not isomorphic to either of the other two.
[Figure 12.8: labeled trees (i), (ii), and (iii)]
The number of nonisomorphic trees with n labeled vertices can be counted by setting up a one-to-one correspondence between these trees and the $n^{n-2}$ sequences (with repetitions allowed) $x_1, x_2, \ldots, x_{n-2}$ whose entries are taken from {1, 2, 3, ..., n}. If T is one such labeled tree, we use the following algorithm to find its corresponding sequence, called the Prüfer code for the tree. (Here T has at least one edge.)
12.2 Rooted Trees 587
Step 1: Set the counter i to 1.
Step 2: Set T(1) = T.
Step 3: Since a tree has at least two pendant vertices, select
the pendant vertex in T(i) with the smallest label y_i. Now
remove the edge {x_i, y_i} from T(i) and use x_i for the ith
component of the sequence.
Step 4: If i = n − 2, we have the sequence corresponding
to the given labeled tree T(1). If i ≠ n − 2, increase i by 1,
set T(i) equal to the resulting subtree obtained in step (3),
and return to step (3).
a) Find the six-digit sequence (Prüfer code) for trees (i)
and (iii) in Fig. 12.8.
b) If v is a vertex in T, show that the number of times the
label on v appears in the Prüfer code x_1, x_2, ..., x_{n−2} is
deg(v) − 1.
c) Reconstruct the labeled tree on eight vertices that is associated
with the Prüfer code 2, 6, 5, 5, 5, 5.
d) Develop an algorithm for reconstructing a tree from a
given Prüfer code x_1, x_2, ..., x_{n−2}.
22. Let n ∈ Z^+, n ≥ 3. If v is a vertex in K_n, how many of the
n^{n−2} spanning trees of K_n have v as a pendant vertex?
23. Characterize the trees whose Prüfer codes
a) contain only one integer, or
b) have distinct integers in all positions.
24. Show that the number of labeled trees with n vertices, k
of which are pendant vertices, is $\binom{n}{k}(n-k)!\,S(n-2,\,n-k) = (n!/k!)\,S(n-2,\,n-k)$,
where S(n − 2, n − k) is a Stirling
number of the second kind. (This result was first established
in 1959 by A. Rényi.)
25. Let G = (V, E) be the undirected graph in Fig. 12.9. Show
that the edge set E can be partitioned as E_1 ∪ E_2 so that the subgraphs
G_1 = (V, E_1), G_2 = (V, E_2) are isomorphic spanning
trees of G.
Figure 12.9
12.2
Rooted Trees
We turn now to directed trees. We find a variety of applications for a special type of directed
tree called a rooted tree.
Definition 12.2 If G is a directed graph, then G is called a directed tree if the undirected graph associated
with G is a tree. When G is a directed tree, G is called a rooted tree if there is a unique
vertex r, called the root, in G with the in degree of r = id(r) = 0, and for all other vertices
v, the in degree of v = id(v) = 1.
The tree in part (a) of Fig. 12.10 is directed but not rooted; the tree in part (b) is rooted
with root r.
Figure 12.10: (a) a directed tree that is not rooted; (b) a rooted tree with root r
We draw rooted trees as in Fig. 12.10(b) but with the directions understood as going
from the upper level to the lower level, so that the arrows aren’t needed. In a rooted tree,
a vertex with out degree 0 is called a leaf (or terminal vertex.) Vertices u, v, x, y, z are
leaves in Fig. 12.10(b). All other vertices are called branch nodes (or internal vertices).
Consider the vertex s in this rooted tree [Fig. 12.10(b)]. The path from the root, r, to s is
of length 2, so we say that s is at level 2 in the tree, or that s has level number 2. Similarly, x
is at level 3, whereas y has level number 4. We call s a child of n, and we call n the parent
of s. Vertices w, y, and z are considered descendants of s, n, and r, while s, n, and r are
called ancestors of w, y, and z. In general, if v_1 and v_2 are vertices in a rooted tree and v_1
has the smaller level number, then v_1 is an ancestor of v_2 (or v_2 is a descendant of v_1) if
there is a (directed) path from v_1 to v_2. Two vertices with a common parent are referred to
as siblings. Such is the case for vertices g and s, whose common parent is vertex n. Finally,
if v_1 is any vertex of the tree, the subtree at v_1 is the subgraph induced by the root v_1 and
all of its descendants (there may be none).
EXAMPLE 12.3
In Fig. 12.11(a) a rooted tree is used to represent the table of contents of a three-chapter
(C1, C2, C3) book. Vertices with level number 2 are for sections within a chapter; those
at level 3 represent subsections within a section. Part (b) of the figure displays the natural
order for the table of contents of this book.
Figure 12.11
(a) Rooted tree: Book has children C1, C2, C3; C1 has children S1.1, S1.2; C3 has children
S3.1, S3.2, S3.3; S3.2 has children S3.2.1, S3.2.2.
(b) Table of contents:
Book
  C1
    S1.1
    S1.2
  C2
  C3
    S3.1
    S3.2
      S3.2.1
      S3.2.2
    S3.3
The tree in Fig. 12.11(a) suggests an order for the vertices if we examine the subtrees
at C1, C2, and C3 from left to right. (This order will recur in this section, in a more
general context.) We now consider a second example that provides such an order.
EXAMPLE 12.4
In the tree T shown in Fig. 12.12, the edges (or branches, as they are often called) leaving
each internal vertex are ordered from left to right. Hence T is called an ordered rooted tree.
Figure 12.12
We label the vertices for this tree by the following algorithm.
Step 1: First assign the root the label (or address) 0.
Step 2: Next assign the positive integers 1, 2, 3, ... to the vertices at level 1, going
from left to right.
Step 3: Now let v be an internal vertex at level n ≥ 1, and let v_1, v_2, ..., v_k denote
the children of v (going from left to right). If a is the label assigned to vertex v,
assign the labels a.1, a.2, ..., a.k to the children v_1, v_2, ..., v_k, respectively.
Consequently, each vertex in T, other than the root, has a label of the form
a_1.a_2.a_3. ... .a_n if and only if that vertex has level number n. This is known as the universal
address system.
This system provides a way to order all vertices in T. If u and v are two vertices
in T with addresses b and c, respectively, we define b < c if
(a) b = a_1.a_2. ... .a_m and c = a_1.a_2. ... .a_m.a_{m+1}. ... .a_n, with m < n; or
(b) b = a_1.a_2. ... .a_m.x_1. ... .y and c = a_1.a_2. ... .a_m.x_2. ... .z,
where x_1, x_2 ∈ Z^+ and x_1 < x_2.
For the tree under consideration, this ordering yields
0 < 1 < 1.1 < 1.2 < 1.2.1 < 1.2.2 < 1.2.3 < 1.2.3.1 < 1.2.3.2 < 1.3 < 1.4 < 2 < 2.1 < 2.2 < 2.2.1 < 3 < 3.1 < 3.2.
Since this resembles the alphabetical ordering in a dictionary, the order is called the
lexicographic, or dictionary, order.
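The ordering rule for universal addresses can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `address_key` is hypothetical, and the sample addresses are a handful chosen in the style of the example rather than a transcription of Fig. 12.12. Each address is split at the dots into a tuple of integers; Python's built-in tuple comparison then agrees with conditions (a) and (b), since a proper prefix compares as smaller and otherwise the first differing component decides.

```python
def address_key(label):
    """Convert a universal address such as '1.2.3' into a tuple of integers.

    Tuples compare component by component, and a proper prefix is smaller,
    which matches conditions (a) and (b) for b < c.
    """
    return tuple(int(part) for part in label.split("."))


# Hypothetical sample of addresses, deliberately listed out of order.
addresses = ["2.2.1", "1.2.3", "0", "3", "1.2", "1.10", "1.2.3.1", "2", "1"]

ordered = sorted(addresses, key=address_key)
print(" < ".join(ordered))
```

Comparing integer tuples rather than raw strings matters once a component has two digits: as strings, "1.10" would sort before "1.2", whereas the rule above places 1.2 first.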
We now consider an application of a rooted tree in the study of computer science.
EXAMPLE 12.5
a) A rooted tree is a binary rooted tree if for each vertex v, od(v) = 0, 1, or 2, that is,
if v has at most two children. If od(v) = 0 or 2 for all v ∈ V, then the rooted tree is
called a complete binary tree. Such a tree can represent a binary operation, as in parts
(a) and (b) of Fig. 12.13. To avoid confusion when dealing with a noncommutative
operation ∘, we label the root as ∘ and require the result to be a ∘ b, where a is the left
child, and b the right child, of the root.
Figure 12.13: (a) root + with left child a and right child b, for a + b; (b) root − with left child a and right child b, for a − b
b) In Fig. 12.14 we extend the ideas presented in Fig. 12.13 in order to construct the
binary rooted tree for the algebraic expression
((7 − a)/5) ∗ ((a + b) ↑ 3),
where ∗ denotes multiplication and ↑ denotes exponentiation. Here we construct this
tree, as shown in part (e) of the figure, from the bottom up. First, a subtree for the
expression 7 − a is constructed in part (a) of Fig. 12.14. This is then incorporated (as
the left subtree for /) in the binary rooted tree for the expression (7 − a)/5 in Fig.
12.14(b). Then, in a similar way, the binary rooted trees in parts (c) and (d) of the figure
are constructed for the expressions a + b and (a + b) ↑ 3, respectively. Finally, the
two subtrees in parts (b) and (d) are used as the left and right subtrees, respectively, for
∗ and give us the binary rooted tree [in Fig. 12.14(e)] for ((7 − a)/5) ∗ ((a + b) ↑ 3).
Figure 12.14
The same ideas are used in Fig. 12.15, where we find the binary rooted trees for the
algebraic expressions
(a − (3/b)) + 5 [in part (a)] and a − (3/(b + 5)) [in part (b)].
Figure 12.15
c) In evaluating t + (uv)/(w + x − y^z) in certain procedural languages, we write the
expression in the form t + (u ∗ v)/(w + x − y ↑ z). When the computer evaluates
this expression, it performs the binary operations (within each parenthesized part)
according to a hierarchy of operations whereby exponentiation precedes multiplication
and division, which in turn precede addition and subtraction. In Fig. 12.16 we number
the operations in the order in which they are performed by the computer. For the
computer to evaluate this expression, it must somehow scan the expression in order
to perform the operations in the order specified.
Figure 12.16
Instead of scanning back and forth continuously, however, the machine converts
the expression into a notation that is independent of parentheses. This is known as
Polish notation, in honor of the Polish (actually Ukrainian) logician Jan Łukasiewicz
(1878-1956). Here the infix notation a ∘ b for a binary operation ∘ becomes ∘ a b, the
prefix (or Polish) notation. The advantage is that the expression in Fig. 12.16 can be
rewritten without parentheses as
+ t / ∗ u v + w − x ↑ y z,
where the evaluation proceeds from right to left. When a binary operation is encoun-
tered, it is performed on the two operands to its right. The result is then treated as one
of the operands for the next binary operation encountered as we continue to the left.
For instance, given the assignments t = 4, u = 2, v = 3, w = 1, x = 9, y = 2, z = 3,
the following steps take place in the evaluation of the expression
+ t / ∗ u v + w − x ↑ y z.
1) + 4 / ∗ 2 3 + 1 − 9 ↑ 2 3      (2 ↑ 3 = 8)
2) + 4 / ∗ 2 3 + 1 − 9 8          (9 − 8 = 1)
3) + 4 / ∗ 2 3 + 1 1              (1 + 1 = 2)
4) + 4 / ∗ 2 3 2                  (2 ∗ 3 = 6)
5) + 4 / 6 2                      (6 / 2 = 3)
6) + 4 3                          (4 + 3 = 7)
So the value of the given expression for the preceding assignments is 7.
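The right-to-left evaluation just traced can be sketched with a stack. The following Python is a minimal illustration, not the scheme any particular compiler uses; the function name `evaluate_prefix` is an assumption, and the ASCII character `^` stands in for the exponentiation arrow.

```python
import operator

# Binary operations keyed by token; "^" is an ASCII stand-in for the arrow.
OPERATIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "^": operator.pow,
}


def evaluate_prefix(tokens, values):
    """Evaluate a prefix (Polish) expression by scanning from right to left.

    Operands are pushed on a stack. When an operation is met, the two
    operands to its right are the top two stack entries: the top one is
    the left operand, the one beneath it is the right operand.
    """
    stack = []
    for token in reversed(tokens):
        if token in OPERATIONS:
            left = stack.pop()
            right = stack.pop()
            stack.append(OPERATIONS[token](left, right))
        else:
            stack.append(values[token])
    if len(stack) != 1:
        raise ValueError("malformed prefix expression")
    return stack[0]


tokens = "+ t / * u v + w - x ^ y z".split()
values = {"t": 4, "u": 2, "v": 3, "w": 1, "x": 9, "y": 2, "z": 3}
result = evaluate_prefix(tokens, values)
print(result)
```

With the assignments used in the text the sketch reproduces the value 7 (printed as a float because `/` is true division here).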
The use of Polish notation is important for the compilation of computer programs and
can be obtained by representing a given expression by a rooted tree, as shown in Fig. 12.17.
Here each variable (or constant) is used to label a leaf of the tree. Each internal vertex is
labeled by a binary operation whose left and right operands are the left and right subtrees
it determines. Starting at the root, as we traverse the tree from top to bottom and left to
right, as shown in Fig. 12.17, we find the Polish notation by writing down the labels of the
vertices in the order in which they are visited.
Figure 12.17
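As a rough sketch of how the Polish notation is read off an expression tree, the Python below stores each internal vertex as a tuple (operation, left subtree, right subtree) and each leaf as a string. The tuple encoding and the name `polish` are assumptions made for this illustration; the tree itself encodes t + (u ∗ v)/(w + x − y ↑ z) as grouped by the prefix form given in the text, with `^` standing in for the arrow.

```python
def polish(node):
    """Return the prefix (Polish) token list for an expression tree.

    A leaf is a string; an internal vertex is (operation, left, right).
    The root is written first, then the left subtree, then the right one.
    """
    if isinstance(node, str):
        return [node]
    operation, left, right = node
    return [operation] + polish(left) + polish(right)


# Expression tree for t + (u * v) / (w + (x - y ^ z)).
tree = ("+", "t",
        ("/", ("*", "u", "v"),
              ("+", "w", ("-", "x", ("^", "y", "z")))))

prefix = " ".join(polish(tree))
print(prefix)
```

Visiting the root before its subtrees is exactly the top-to-bottom, left-to-right walk described above.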
The last two examples illustrate the importance of order. Several methods exist for
systematically ordering the vertices in a tree. Two of the most prevalent in the study of data
structures are the preorder and postorder. These are defined recursively in the following
definition.
Definition 12.3 Let T = (V, E) be a rooted tree with root r. If T has no other vertices, then the root by itself
constitutes the preorder and postorder traversals of T. If |V| > 1, let T_1, T_2, T_3, ..., T_k
denote the subtrees of T as we go from left to right (as in Fig. 12.18).
Figure 12.18
a) The preorder traversal of T first visits r and then traverses the vertices of T_1 in
preorder, then the vertices of T_2 in preorder, and so on until the vertices of T_k are
traversed in preorder.
b) The postorder traversal of T traverses in postorder the vertices of the subtrees T_1,
T_2, ..., T_k and then visits the root.
We demonstrate these ideas in the following example.
EXAMPLE 12.6
Consider the rooted tree shown in Fig. 12.19.
Figure 12.19
a) Preorder: After visiting vertex 1 we visit the subtree T_1 rooted at vertex 2. After
visiting vertex 2 we proceed to the subtree rooted at vertex 5, and after visiting vertex
5 we go to the subtree rooted at vertex 11. This subtree has no other vertices, so we
visit vertex 11 and then return to vertex 5 from which we visit, in succession, vertices
12, 13, and 14. Following this we backtrack (14 to 5 to 2 to 1) to the root and then
visit the vertices in the subtree T_2 in the preorder 3, 6, 7. Finally, after returning to the
root for the last time, we traverse the subtree T_3 in the preorder 4, 8, 9, 10, 15, 16, 17.
Hence the preorder listing of the vertices in this tree is 1, 2, 5, 11, 12, 13, 14, 3, 6, 7,
4, 8, 9, 10, 15, 16, 17.
In this ordering we start at the root and build a path as far as we can. At each level
we go to the leftmost vertex (not previously visited) at the next level, until we reach a
leaf ℓ. Then we backtrack to the parent p of this leaf ℓ and visit ℓ’s sibling s (and the
subtree that s determines) directly on its right. If no such sibling s exists, we backtrack
to the grandparent g of the leaf ℓ and visit, if it exists, a vertex u that is the sibling of
p directly to its right in the tree. Continuing in this manner, we eventually visit (the
first time each one is encountered) all of the vertices in the tree.
The vertices in Figs. 12.11(a), 12.12, and 12.17 are visited in preorder. The preorder
traversal for the tree in Fig. 12.11(a) provides the ordering in Fig. 12.11(b). The
lexicographic order in Example 12.4 arises from the preorder traversal of the tree in
Fig. 12.12.
b) Postorder: For the postorder traversal of a tree, we start at the root r and build the
longest path, going to the leftmost child of each internal vertex whenever we can. When
we arrive at a leaf ℓ we visit this vertex and then backtrack to its parent p. However,
we do not visit p until after all of its descendants are visited. The next vertex we visit is
found by applying the same procedure at p that was originally applied at r in obtaining
ℓ, except that now we first go from p to the sibling of ℓ directly to the right (of ℓ).
And at no time is any vertex visited more than once or before any of its descendants.
For the tree given in Fig. 12.19, the postorder traversal starts with a postorder
traversal of the subtree T_1 rooted at vertex 2. This yields the listing 11, 12, 13, 14, 5,
2. We proceed to the subtree T_2, and the postorder listing continues with 6, 7, 3. Then
for T_3 we find 8, 9, 15, 16, 17, 10, 4 as the postorder listing. Finally, vertex 1 is visited.
Consequently, for this tree, the postorder traversal visits the vertices in the order 11,
12, 13, 14, 5, 2, 6, 7, 3, 8, 9, 15, 16, 17, 10, 4, 1.
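The recursive definitions of preorder and postorder translate almost directly into code. In the Python sketch below, the dictionary `CHILDREN` is an assumption: it is a reconstruction of the tree in Fig. 12.19 inferred from the two listings in this example (children are given left to right), and the function names are hypothetical.

```python
# Assumed shape of the tree in Fig. 12.19, inferred from the listings above.
# Each key maps a vertex to its children in left-to-right order.
CHILDREN = {
    1: [2, 3, 4],
    2: [5],
    3: [6, 7],
    4: [8, 9, 10],
    5: [11, 12, 13, 14],
    10: [15, 16, 17],
}


def preorder(vertex, children):
    """Visit the root first, then each subtree in preorder, left to right."""
    listing = [vertex]
    for child in children.get(vertex, []):
        listing.extend(preorder(child, children))
    return listing


def postorder(vertex, children):
    """Traverse each subtree in postorder, left to right, then visit the root."""
    listing = []
    for child in children.get(vertex, []):
        listing.extend(postorder(child, children))
    listing.append(vertex)
    return listing


print(preorder(1, CHILDREN))
print(postorder(1, CHILDREN))
```

Under that assumed tree shape, the two calls reproduce the preorder and postorder listings stated in parts (a) and (b).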
In the case of binary rooted trees, a third type of tree traversal called the inorder traversal
may be used. Here we do not consider subtrees as first and second, but rather in terms of
left and right. The formal definition is recursive, as were the definitions of preorder and
postorder traversals.
Definition 12.4 Let T = (V, E) be a binary rooted tree with vertex r the root.
1) If |V| = 1, then the vertex r constitutes the inorder traversal of T.
2) When |V| > 1, let T_L and T_R denote the left and right subtrees of T. The inorder
traversal of T first traverses the vertices of T_L in inorder, then it visits the root r, and
then it traverses, in inorder, the vertices of T_R.
We realize that here a left or right subtree may be empty. Also, if v is a vertex in such a
tree and od(v) = 1, then if w is the child of v, we must distinguish between w’s being the
left child and its being the right child.
EXAMPLE 12.7
As a result of the previous comments, the two binary rooted trees shown in Fig. 12.20
are not considered the same, when viewed as ordered trees. As rooted binary trees they
are the same. (Each tree has the same set of vertices and the same set of directed edges.)
However, when we consider the additional concept of left and right children, we see that
in part (a) of the figure vertex v has right child a, whereas in part (b) vertex a is the left
child of v. Consequently, when the difference between left and right children is taken into
consideration, these trees are no longer viewed as the same tree.
(a) (b)
Figure 12.20
In visiting the vertices for the tree in part (a) of Fig. 12.20, we first visit in inorder the
left subtree of the root r. This subtree consists of the root v and its right child a. (Here the
left child is null, or nonexistent.) Since v has no left subtree, we visit in inorder vertex v
and then its right subtree, namely, a. Having traversed the left subtree of r, we now visit
vertex r and then traverse, in inorder, the vertices in the right subtree of r. This results in our
visiting first vertex b (because b has no left subtree) and then vertex c. Hence the inorder
listing for the tree shown in Fig. 12.20(a) is v, a, r, b, c.
When we consider the tree in part (b) of the figure, once again we start by visiting, in
inorder, the vertices in the left subtree of the root r. Here, however, this left subtree consists
of vertex v (the root of the subtree) and its left child a. (In this case, the right child of v is
null, or nonexistent.) Therefore this inorder traversal first visits vertex a (the left subtree
of v), and then vertex v. Since v has no right subtree, we are now finished visiting the left
subtree of r, in inorder. So next the root r is visited, and then the vertices of the right subtree
of r are traversed, in inorder. This results in the inorder listing a, v, r, b, c for the tree shown
in Fig. 12.20(b).
We should note, however, that for the preorder traversal in this particular example,* the
same result is obtained for both trees:
Preorder listing: r, v, a, b, c.
Likewise, this particular example is such that the postorder traversal for either tree gives us
the following:
Postorder listing: a, v, c, b, r.
It is only for the inorder traversal, with its distinctions between left and right children and
between left and right subtrees, that a difference occurs. For the trees in parts (a) and (b) of
Fig. 12.20 we found the respective inorder listings to be
(a) v, a, r, b, c and (b) a, v, r, b, c.
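A short Python sketch can make the role of left and right children concrete. The encoding is an assumption chosen for this illustration: each vertex maps to a pair (left child, right child), with `None` for a missing child, and the two dictionaries describe the trees of Fig. 12.20 as the text characterizes them (a is the right child of v in part (a) and the left child of v in part (b); c is the right child of b in both).

```python
def inorder(vertex, tree):
    """Left subtree in inorder, then the root, then the right subtree."""
    if vertex is None:
        return []
    left, right = tree.get(vertex, (None, None))
    return inorder(left, tree) + [vertex] + inorder(right, tree)


def preorder(vertex, tree):
    """Root first, then the left subtree, then the right subtree."""
    if vertex is None:
        return []
    left, right = tree.get(vertex, (None, None))
    return [vertex] + preorder(left, tree) + preorder(right, tree)


# vertex -> (left child, right child); None marks a null child.
TREE_A = {"r": ("v", "b"), "v": (None, "a"), "b": (None, "c")}
TREE_B = {"r": ("v", "b"), "v": ("a", None), "b": (None, "c")}

print(inorder("r", TREE_A), inorder("r", TREE_B))
print(preorder("r", TREE_A), preorder("r", TREE_B))
```

The inorder listings differ between the two trees while the preorder listings agree, which is the distinction the example draws.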
EXAMPLE 12.8
If we apply the inorder traversal to the binary rooted tree shown in Fig. 12.21, we find that
the inorder listing for the vertices is p, j, q, f, c, k, g, a, d, r, b, h, s, m, e, i, t, n, u.
Figure 12.21
Our next example shows how the preorder traversal can be used in a counting problem
dealing with binary trees.
EXAMPLE 12.9†
For n ≥ 0, consider the complete binary trees on 2n + 1 vertices. The cases for 0 ≤ n ≤ 3
are shown in Fig. 12.22. Here we distinguish left from right. So, for example, the two
*A note of caution! If we interchange the order of the two existing children (of a certain parent) in a binary
rooted tree, then a change results in the preorder, postorder, and inorder traversals. If one child is “null,” however,
then only the inorder traversal changes.
† This example uses material developed in the optional Sections 1.5 and 10.5. It may be omitted with no loss
of continuity.
complete binary trees for n = 2 are considered distinct. [If we do not distinguish left from
right, these trees are (isomorphic and) no longer counted as two different trees.]
Figure 12.22 (tree drawings omitted; for each tree, its preorder listing and its list of L's and R's)
(n = 0)  r
(n = 1)  r, a, b                 L, R
(n = 2)  r, a, b, c, d           L, R, L, R
         r, a, c, d, b           L, L, R, R
(n = 3)  r, a, c, d, b, e, f     L, L, R, R, L, R
         r, a, c, e, f, d, b     L, L, L, R, R, R
         r, a, c, d, e, f, b     L, L, R, L, R, R
         r, a, b, c, d, e, f     L, R, L, R, L, R
         r, a, b, c, e, f, d     L, R, L, L, R, R
Below each tree in the figure we list the vertices for a preorder traversal. In addition, for
1 ≤ n ≤ 3, we find a list of n L’s and n R’s under each preorder traversal. These lists are
determined as follows. The first tree for n = 2, for instance, has the list L, R, L, R because,
after visiting the root r, we go to the left (L) subtree rooted at a and visit vertex a. Then we
backtrack to r and go to the right (R) subtree rooted at b. After visiting vertex b we go to
the left (L) subtree of b rooted at c and visit vertex c. Then, lastly, we backtrack to b and
go to its right (R) subtree to visit vertex d. This generates the list L, R, L, R and the other
seven lists of L’s and R’s are obtained in the same way.
Since we are traversing these trees in preorder, each list starts with an L. There is an
equal number of L’s and R’s in each list because the trees are complete binary trees. Finally,
the number of R’s never exceeds the number of L’s as a given list is read from left to right —
again, because we have a preorder traversal. Should we replace each L by a 1 and each R
by a —1, for the five trees for n = 3, we find ourselves back in part (a) of Example 1.43,
where we have one of our early examples of the Catalan numbers. Hence, for n ≥ 0, we
see that the number of complete binary trees on 2n + 1 vertices is $\frac{1}{n+1}\binom{2n}{n}$, the nth Catalan
number. [Note that if we prune the five trees for n = 3 by removing the four leaves for each
tree, we obtain the five rooted ordered binary trees in Fig. 10.18.]
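The count can be checked by brute force for small n. The Python sketch below is an illustration under assumptions of its own: a leaf is encoded as `None`, an internal vertex as a pair (left subtree, right subtree), and the names `complete_binary_trees` and `lr_list` are hypothetical. It generates every complete binary tree with n internal vertices, builds the list of L's and R's produced by a preorder traversal, and compares the number of trees with the Catalan formula.

```python
from math import comb


def complete_binary_trees(n):
    """Yield every complete binary tree with n internal vertices (2n + 1 vertices).

    A leaf is None; an internal vertex is a pair (left subtree, right subtree).
    """
    if n == 0:
        yield None
        return
    for k in range(n):  # k internal vertices go into the left subtree
        for left in complete_binary_trees(k):
            for right in complete_binary_trees(n - 1 - k):
                yield (left, right)


def lr_list(tree):
    """List of L's and R's recorded while traversing the tree in preorder."""
    if tree is None:
        return []
    left, right = tree
    return ["L"] + lr_list(left) + ["R"] + lr_list(right)


def catalan(n):
    """The nth Catalan number, (1/(n+1)) * C(2n, n)."""
    return comb(2 * n, n) // (n + 1)


for n in range(6):
    count = sum(1 for _ in complete_binary_trees(n))
    print(n, count, catalan(n))

lists_for_3 = {"".join(lr_list(t)) for t in complete_binary_trees(3)}
print(sorted(lists_for_3))
```

For n = 3 the five generated lists coincide with the five L/R lists attached to the trees in Fig. 12.22.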
The notion of preorder now arises in the following procedure for finding a spanning tree
for a connected graph.
Let G = (V, E) be a loop-free connected undirected graph with r € V. Starting from r,
we construct a path in G that is as long as possible. If this path includes every vertex in V,
then the path is a spanning tree 7 for G and we are finished. If not, let x and y be the last
two vertices visited along this path, with y the last vertex. We then return, or backtrack, to
the vertex x and construct a second path in G that is as long as possible, starts at x, and
doesn’t include any vertex already visited. If no such path exists, backtrack to the parent
p of x and see how far it is possible to branch off from p, building a path (that is as long
as possible and has no previously visited vertices) to a new vertex y_1 (which will be a
new leaf for 7). Should all edges from the vertex p lead to vertices already encountered,
backtrack one level higher and continue the process. Since the graph is finite and connected,
this technique, which is called backtracking, or depth-first search, eventually determines a
spanning tree T for G, where r is regarded as the root of 7. Using 7, we then order the
vertices of G in a preorder listing.
The depth-first search serves as a framework around which many algorithms can be
designed to test for certain graph properties. One such algorithm will be examined in detail
in Section 12.5.
One way to help implement the depth-first search in a computer program is to assign a
fixed order to the vertices of the given graph G = (V, E). Then if there are two or more
vertices adjacent to a vertex v and none of these vertices has already been visited, we shall
know exactly which vertex to visit first. This order now helps us to develop the foregoing
description of the depth-first search as an algorithm.
Let G = (V, E) be a loop-free connected undirected graph where |V| = n and the vertices
are ordered as v_1, v_2, v_3, ..., v_n. To find the rooted ordered depth-first spanning tree
for the prescribed order, we apply the following algorithm, wherein the variable v is used
to store the vertex presently being examined.
Depth-First Search Algorithm
Step 1: Assign v_1 to the variable v and initialize T as the tree consisting of just
this one vertex. (The vertex v_1 will be the root of the spanning tree that develops.)
Visit v_1.
Step 2: Select the smallest subscript i, for 2 ≤ i ≤ n, such that {v, v_i} ∈ E and v_i
has not already been visited.
If no such subscript is found, then go to step (3). Otherwise, perform the following:
(1) Attach the edge {v, v_i} to the tree T and visit v_i; (2) Assign v_i to v; and
(3) Return to step (2).
Step 3: If v = v_1, the tree T is the (rooted ordered) spanning tree for the order
specified.
Step 4: For v ≠ v_1, backtrack from v to its parent u in T. Then assign u to v and
return to step (2).
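The four steps can be followed quite literally in code. The Python below is a minimal sketch, not a tuned implementation: the names `adjacency` and `dfs_spanning_tree` are hypothetical, and the edge list `EDGES` is an assumed graph on the vertices a through j, chosen only so that there is something concrete to run (it is not a transcription of any figure).

```python
def adjacency(edges):
    """Build a dictionary mapping each vertex to the set of its neighbors."""
    adj = {}
    for u, w in edges:
        adj.setdefault(u, set()).add(w)
        adj.setdefault(w, set()).add(u)
    return adj


def dfs_spanning_tree(adj, order):
    """Return the tree edges found by the depth-first search steps.

    `order` lists the vertices in their prescribed order; order[0] is the root.
    """
    root = order[0]
    visited = {root}             # Step 1: visit the root
    parent = {root: None}
    tree_edges = []
    v = root
    while True:
        # Step 2: first vertex in the prescribed order that is adjacent
        # to v and has not been visited yet.
        nxt = next((w for w in order if w in adj[v] and w not in visited), None)
        if nxt is not None:
            tree_edges.append((v, nxt))
            visited.add(nxt)
            parent[nxt] = v
            v = nxt
        elif v == root:          # Step 3: back at the root with nothing new
            return tree_edges
        else:                    # Step 4: backtrack to the parent
            v = parent[v]


# Assumed example graph (hypothetical edge set).
EDGES = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "d"), ("b", "e"),
         ("e", "f"), ("e", "h"), ("c", "g"), ("g", "i"), ("g", "j")]

tree = dfs_spanning_tree(adjacency(EDGES), list("abcdefghij"))
print(tree)
```

Storing the parent of each vertex as it is attached is what makes the backtracking in step (4) a single lookup.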
EXAMPLE 12.10
We now apply this algorithm to the graph G = (V, E) shown in Fig. 12.23(a). Here the
order for the vertices is alphabetic: a, b, c, d, e, f, g, h, i, j.
First we assign the vertex a to the variable v and initialize T as just the vertex a (the
root). We visit vertex a. Then, going to step (2), we find that the vertex b is the first vertex
w such that {a, w} ∈ E and w has not been visited earlier. So we attach edge {a, b} to T
and visit b, assign b to v, and then return to step (2).
At v = b we find that the first vertex (not visited earlier) that provides an edge for the
spanning tree is d. Consequently, the edge {b, d} is attached to T and d is visited, then d is
assigned to v, and we again return to step (2).
Figure 12.23 (panel (a): G = (V, E))
This time, however, there is no new vertex that we can obtain from d, because vertices
a and b have already been visited. So we go to step (3). But here the value of v is d, not a,
and we go to step (4). Now we backtrack from d, assigning the vertex b to v, and then we
return to step (2). At this time we add the edge {b, e} to T and visit e.
Continuing the process, we attach the edge {e, f} (and visit f) and then the edge {e, h}
(and visit h). But now the vertex h has been assigned to v, and we must backtrack from
h to e to b to a. When v is assigned the vertex a this (second) time, the new edge {a, c}
is obtained and vertex c is visited. Then we proceed to attach the edges {c, g}, {g, i}, and
{g, j} (visiting the vertices g, i, and j, respectively). At this point all of the vertices in G
have been visited, and we backtrack from j to g to c to a. With v = a once again we return
to step (2) and from there to step (3), where the process terminates.
The resulting tree T = (V, E_1) is shown in part (b) of Fig. 12.23. Part (c) of the figure
shows the tree T′ that results for the vertex ordering: j, i, h, g, f, e, d, c, b, a.
A second method for searching the vertices of a loop-free connected undirected graph is
the breadth-first search. Here we designate one vertex as the root and fan out to all vertices
adjacent to the root. From each child of the root we then fan out to those vertices (not
previously visited) that are adjacent to one of these children. As we continue this process,
we never list a vertex twice, so no cycle is constructed, and with G finite the process
eventually terminates.
We actually used this technique earlier in Example 11.28 of Section 11.5.
A certain data structure proves useful in developing an algorithm for this second searching
method. A queue is an ordered list wherein items are inserted at one end (called the rear) of
the list and deleted at the other end (called the front). The first item inserted in the queue is
the first item that can be taken out of it. Consequently, a queue is referred to as a “first-in,
first-out,” or FIFO, structure.
As in the depth-first search, we again assign an order to the vertices of our graph.
We start with a loop-free connected undirected graph G = (V, E), where |V| = n and
the vertices are ordered as v_1, v_2, v_3, ..., v_n. The following algorithm generates the (rooted
ordered) breadth-first spanning tree T of G for the given order.
Breadth-First Search Algorithm
Step 1: Insert vertex v_1 at the rear of the (initially empty) queue Q and initialize T
as the tree made up of this one vertex v_1 (the root of the final version of T). Visit v_1.
Step 2: While the queue Q is not empty, delete the vertex v from the front of Q.
Now examine the vertices v_i (for 2 ≤ i ≤ n) that are adjacent to v, in the specified
order. If v_i has not been visited, perform the following: (1) Insert v_i at the rear of
Q; (2) Attach the edge {v, v_i} to T; and (3) Visit vertex v_i. [If we examine all of
the vertices previously in the queue Q and obtain no new edges, then the tree T
(generated to this point) is the (rooted ordered) spanning tree for the given order.]
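The queue-based procedure can be sketched in Python with `collections.deque` serving as the FIFO queue Q. As before, this is an illustration under assumptions: the function names are hypothetical, and `EDGES` is an assumed graph on the vertices a through j rather than a transcription of a figure.

```python
from collections import deque


def adjacency(edges):
    """Build a dictionary mapping each vertex to the set of its neighbors."""
    adj = {}
    for u, w in edges:
        adj.setdefault(u, set()).add(w)
        adj.setdefault(w, set()).add(u)
    return adj


def bfs_spanning_tree(adj, order):
    """Return the tree edges found by the breadth-first search steps.

    `order` lists the vertices in their prescribed order; order[0] is the root.
    """
    root = order[0]
    visited = {root}            # Step 1: visit the root and enqueue it
    queue = deque([root])
    tree_edges = []
    while queue:                # Step 2: repeat while Q is not empty
        v = queue.popleft()     # delete from the front of Q
        for w in order:         # examine neighbors in the prescribed order
            if w in adj[v] and w not in visited:
                queue.append(w)             # insert at the rear of Q
                tree_edges.append((v, w))   # attach the edge {v, w}
                visited.add(w)              # visit w
    return tree_edges


# Assumed example graph (hypothetical edge set).
EDGES = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "d"), ("b", "e"),
         ("e", "f"), ("e", "h"), ("c", "g"), ("g", "i"), ("g", "j")]

tree = bfs_spanning_tree(adjacency(EDGES), list("abcdefghij"))
print(tree)
```

A `deque` is used because removing from the front of a plain Python list takes time proportional to its length, whereas `popleft` is constant time.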
EXAMPLE 12.11
We shall employ the graph of Fig. 12.23(a) with the prescribed order a, b, c, d, e, f, g, h,
i, j to illustrate the use of the algorithm for the breadth-first search.
Start with vertex a. Insert a at the rear of (the presently empty) queue Q, initialize T as
this one vertex (the root of the resulting tree), and visit vertex a.
In step (2) we now delete a from (the front of) Q and examine the vertices adjacent to
a—namely, the vertices b, c, d. (These vertices have not been previously visited.) This
results in our (i) inserting vertex b at the rear of Q, attaching the edge {a, b} to T, and
visiting vertex b; (ii) inserting vertex c at the rear of Q (after b), attaching the edge {a, c}
to T, and visiting vertex c; and (iii) inserting vertex d at the rear of Q (after c), attaching
the edge {a, d} to T, and visiting vertex d.
Since the queue Q is not empty, we execute step (2) again. Upon deleting vertex b from
the front of Q, we now find that the only vertex adjacent to b (that has not been previously
visited) is e. So we insert vertex e at the rear of Q (after d), attach the edge {b, e} to T,
and visit vertex e. Continuing with vertex c we obtain the new (unvisited) vertex g. So we
insert vertex g at the rear of Q (after e), attach the edge {c, g} to T, and visit vertex g.
And now we delete vertex d from the front of Q. But at this point there are no unvisited
vertices adjacent to d, so we then delete vertex e from the front of Q. This vertex leads
to the following: inserting vertex f at the rear of Q (after g), attaching the edge {e, f} to
T, and visiting vertex f. This is followed by: inserting vertex h at the rear of Q (after f),
attaching edge {e, h} to T, and visiting vertex h. Continuing with vertex g, we insert vertex
i at the rear of Q (after h), attach edge {g, i} to T, and visit vertex i, and then we insert
vertex j at the rear of Q (after i), attach edge {g, j} to T, and visit vertex j.
Once again we return to the beginning of step (2). But now when we delete (from the
front of Q) and examine each of the vertices f, h, i, and j (in this order), we find no
unvisited vertices for any of these four vertices. Consequently, the queue Q now remains
empty and the tree T in Fig. 12.24(a) is the breadth-first spanning tree for G, for the order
prescribed. (The tree T_1, shown in part (b) of the figure, arises for the order j, i, h, g, f, e,
d, c, b, a.)
Figure 12.24
Let us apply these ideas on graph searching to one more example.
EXAMPLE 12.12
Let G = (V, E) be an undirected graph (with loops) where the vertices are ordered as
v_1, v_2, ..., v_7. If Fig. 12.25(a) is the adjacency matrix A(G) for G, how can we use this
representation of G to determine whether G is connected, without drawing the graph?
Figure 12.25
(a) The adjacency matrix A(G):
        v_1  v_2  v_3  v_4  v_5  v_6  v_7
  v_1    0    1    0    0    0    0    1
  v_2    1    1    1    1    0    0    0
  v_3    0    1    1    0    0    0    0
  v_4    0    1    0    0    1    0    1
  v_5    0    0    0    1    0    1    0
  v_6    0    0    0    0    1    0    0
  v_7    1    0    0    1    0    0    0
(b) Breadth-first search tree (drawing omitted)
(c) Depth-first search tree (drawing omitted)
Using v_1 as the root, in part (b) of the figure we search the graph by means of its adjacency
matrix, using a breadth-first search. [Here we ignore the loops by ignoring any 1’s on the
main diagonal (extending from the upper left to the lower right).] First we visit the vertices
adjacent to v_1, listing them in ascending order according to the subscripts on the v’s in A(G).
The search continues, and as all vertices in G are reached, G is shown to be connected.
The same conclusion follows from the depth-first search in part (c). The tree here also
has v_1 as its root. As the tree branches out to search the graph, it does so by listing the first
vertex found adjacent to v_1 according to the row in A(G) for v_1. Likewise, from v_2 the
first new vertex in this search is found from A(G) to be v_3. The vertex v_3 is a leaf in this
tree because no new vertex can be visited from v_3. As we backtrack to v_2, row 2 of A(G)
indicates that v_4 can now be visited from v_2. As this process continues, the connectedness
of G follows from part (c) of the figure.
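Both searches can be run directly on the matrix. The sketch below assumes the entries of A(G) as read from Fig. 12.25(a); the function names are illustrative, and index k stands for vertex $v_{k+1}$. Entries on the main diagonal are skipped, as in the text.

```python
from collections import deque

# Adjacency matrix A(G), as read from Fig. 12.25(a); row k is vertex v_(k+1).
A = [
    [0, 1, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0, 1, 0],
    [0, 0, 0, 0, 1, 0, 0],
    [1, 0, 0, 1, 0, 0, 0],
]

def bfs_order(matrix, root=0):
    """Vertices in the order a breadth-first search from root reaches them."""
    n = len(matrix)
    visited = [False] * n
    visited[root] = True
    queue = deque([root])
    order = []
    while queue:
        i = queue.popleft()
        order.append(i)
        for j in range(n):  # ascending subscripts; j != i ignores loops
            if j != i and matrix[i][j] and not visited[j]:
                visited[j] = True
                queue.append(j)
    return order

def dfs_order(matrix, root=0):
    """Vertices in the order a depth-first search from root reaches them."""
    n = len(matrix)
    visited = [False] * n
    order = []

    def visit(i):
        visited[i] = True
        order.append(i)
        for j in range(n):
            if j != i and matrix[i][j] and not visited[j]:
                visit(j)

    visit(root)
    return order

def is_connected(matrix):
    """G is connected exactly when the search reaches every vertex."""
    return len(bfs_order(matrix)) == len(matrix)

print(["v%d" % (i + 1) for i in bfs_order(A)])
print(["v%d" % (i + 1) for i in dfs_order(A)])
print(is_connected(A))
```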
It is time now to return to our main discussion on rooted trees. The following definition
generalizes the ideas that were introduced for Example 12.5.
Definition 12.5 Let T = (V, E) be a rooted tree, and let $m \in \mathbf{Z}^+$.
We call T an m-ary tree if $\mathrm{od}(v) \le m$ for all $v \in V$. When m = 2, the tree is called a
binary tree.
If $\mathrm{od}(v) = 0$ or $m$, for all $v \in V$, then T is called a complete m-ary tree. The special case
of m = 2 results in a complete binary tree.
In a complete m-ary tree, each internal vertex has exactly m children. (Each leaf of this
tree still has no children.)
Some properties of these trees are considered in the following theorem.
THEOREM 12.6 Let T = (V, E) be a complete m-ary tree with |V| = n. If T has $\ell$ leaves and i internal
vertices, then (a) $n = mi + 1$; (b) $\ell = (m - 1)i + 1$; and (c) $i = (\ell - 1)/(m - 1) = (n - 1)/m$.
Proof: This proof is left for the Section Exercises.
EXAMPLE 12.13 The Wimbledon tennis championship is a single-elimination tournament wherein a player
(or doubles team) is eliminated after a single loss. If 27 women compete in the singles
championship, how many matches must be played to determine the number-one female
player?
Consider the tree shown in Fig. 12.26. With 27 women competing, there are 27 leaves in
this complete binary tree, so from Theorem 12.6(c) the number of internal vertices (which
is the number of matches) is $i = (\ell - 1)/(m - 1) = (27 - 1)/(2 - 1) = 26$.
Figure 12.26 (tournament tree; level labels: the champion, the semifinals, the quarterfinals)
EXAMPLE 12.14 A classroom contains 25 microcomputers that must be connected to a wall socket that has
four outlets. Connections are made by using extension cords that have four outlets each.
What is the least number of cords needed to get these computers set up for class use?
The wall socket is considered the root of a complete m-ary tree for m = 4. The microcomputers
are the leaves of this tree, so $\ell = 25$. Each internal vertex, except the root, corresponds
with an extension cord. So by part (c) of Theorem 12.6, there are
$(\ell - 1)/(m - 1) = (25 - 1)/(4 - 1) = 8$ internal vertices. Hence we need 8 - 1 = 7 extension
cords (where the 1 is subtracted for the root).
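The counts in Examples 12.13 and 12.14 follow mechanically from Theorem 12.6, which a short Python sketch can confirm. The function names are illustrative assumptions, and m is assumed to be at least 2.

```python
def internal_from_leaves(leaves, m):
    """Internal vertices i = (l - 1)/(m - 1) of a complete m-ary tree with l leaves."""
    quotient, remainder = divmod(leaves - 1, m - 1)
    if remainder != 0:
        raise ValueError("no complete m-ary tree has this many leaves")
    return quotient

def total_vertices(internal, m):
    """Part (a) of Theorem 12.6: n = m*i + 1."""
    return m * internal + 1

matches = internal_from_leaves(27, 2)      # Example 12.13
cords = internal_from_leaves(25, 4) - 1    # Example 12.14: the root is the wall socket
print(matches, cords)
print(total_vertices(matches, 2) == 27 + matches)   # n = leaves + internal vertices
```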
Definition 12.6 If T = (V, E) is a rooted tree and h is the largest level number achieved by a leaf of T,
then T is said to have height h. A rooted tree T of height h is said to be balanced if the
level number of every leaf in T is h - 1 or h.
The rooted tree shown in Fig. 12.19 is a balanced tree of height 3. Tree $T'$ in Fig. 12.23(c)
has height 7 but is not balanced. (Why?)
The tree for the tournament in Example 12.13 must be balanced so that the tournament
will be as fair as possible. If it is not balanced, some competitor will receive more than one
bye (an opportunity to advance without playing a match).
Before stating our next theorem, let us recall that for all $x \in \mathbf{R}$, $\lfloor x \rfloor$ denotes the greatest
integer in x, or floor of x, whereas $\lceil x \rceil$ designates the ceiling of x.
THEOREM 12.7 Let T = (V, E) be a complete m-ary tree of height h with $\ell$ leaves. Then $\ell \le m^h$ and
$h \ge \lceil \log_m \ell \rceil$.
Proof: The proof that $\ell \le m^h$ will be established by induction on h. When h = 1, T is a tree
with a root and m children. In this case $\ell = m = m^h$, and the result is true. Assume the result
true for all trees of height < h, and consider a tree T with height h and $\ell$ leaves. (The level
numbers that are possible for these leaves are 1, 2, ..., h, with at least m of the leaves at
level h.) The $\ell$ leaves of T are also the $\ell$ leaves (total) for the m subtrees $T_i$, $1 \le i \le m$, of
T rooted at each of the children of the root. For $1 \le i \le m$, let $\ell_i$ be the number of leaves in
subtree $T_i$. (In the case where leaf and root coincide, $\ell_i = 1$. But since $m \ge 1$ and $h - 1 \ge 0$,
we have $m^{h-1} \ge 1 = \ell_i$.) By the induction hypothesis, $\ell_i \le m^{h(T_i)} \le m^{h-1}$, where $h(T_i)$
denotes the height of the subtree $T_i$, and so $\ell = \ell_1 + \ell_2 + \cdots + \ell_m \le m(m^{h-1}) = m^h$.
With $\ell \le m^h$, we find that $\log_m \ell \le \log_m(m^h) = h$, and since $h \in \mathbf{Z}^+$, it follows that
$h \ge \lceil \log_m \ell \rceil$.
COROLLARY 12.1 Let T be a balanced complete m-ary tree with $\ell$ leaves. Then the height of T is $\lceil \log_m \ell \rceil$.
Proof: This proof is left as an exercise.
We close this section with an application that uses a complete ternary (m = 3) tree.
EXAMPLE 12.15 Decision Trees. There are eight coins (identical in appearance) and a pan balance. If exactly
one of these coins is counterfeit and heavier than the other seven, find the counterfeit coin.
Let the coins be labeled 1, 2, 3, ..., 8. In using the pan balance to compare sets of coins
there are three outcomes to consider: (a) the two sides balance to indicate that the coins in
the two pans are not counterfeit; (b) the left pan of the balance goes down, indicating that
the counterfeit coin is in the left pan; or (c) the right pan goes down, indicating that it holds
the counterfeit coin.
In Fig. 12.27(a), we search for the counterfeit coin by first balancing coins 1, 2, 3, 4
against 5, 6, 7, 8. If the balance tips to the right, we follow the right branch from the root to
then analyze coins 5, 6 against 7, 8. If the balance tips to the left, we test coins 1, 2 against
3, 4. At each successive level, we have half as many coins to test, so at level 3 (after three
weighings) the heavier counterfeit coin has been identified.
The tree in part (b) of the figure finds the heavier coin in two weighings. The first weighing
balances coins 1, 2, 3 against 6, 7, 8. Three possible outcomes can occur: (i) the balance tips
to the right, indicating that the heavier coin is 6, 7, or 8, and we follow the right branch from
the root; (ii) the balance tips to the left and we follow the left branch to find which of 1, 2,
3 is the heavier; or (iii) the pans balance and we follow the center branch to find which of
4, 5 is heavier. At each internal vertex the label indicates which coins are being compared.
Unlike part (a), a conclusion may be deduced in part (b) when a coin is not included in a
weighing. Finally, when comparing coins 4 and 5, because equality cannot take place we
label the center leaf with $\emptyset$.
Figure 12.27: (a) binary decision tree (height = 3); (b) ternary decision tree (height = 2)
In this particular problem, we claim that the height of the complete ternary tree used must
be at least 2. With eight coins involved, the tree will have at least eight leaves. Consequently,
with $\ell \ge 8$, it follows from Theorem 12.7 that $h \ge \lceil \log_3 \ell \rceil \ge \lceil \log_3 8 \rceil = 2$, so at least two
weighings are needed. If n coins are involved, the complete ternary tree will have $\ell$ leaves
where $\ell \ge n$, and its height h satisfies $h \ge \lceil \log_3 n \rceil$.
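The lower bound $\lceil \log_3 n \rceil$ and a weighing strategy that attains it can be sketched in Python. Everything named here is an assumption made for illustration: coins are list indices, a weighing is simulated by comparing sums of weights, and the pans hold the first and second thirds of the remaining candidates, which is a different grouping from the one drawn in Fig. 12.27(b) but follows the same ternary idea.

```python
def min_weighings(n):
    """Smallest h with 3**h >= n, that is, ceil(log_3 n), using integers only."""
    h, capacity = 0, 1
    while capacity < n:
        capacity *= 3
        h += 1
    return h

def find_heavy(weights):
    """Locate the single heavier coin; return (index, number of weighings used)."""
    candidates = list(range(len(weights)))
    weighings = 0
    while len(candidates) > 1:
        k = (len(candidates) + 2) // 3           # coins per pan: ceil(size / 3)
        left = candidates[:k]
        right = candidates[k:2 * k]
        aside = candidates[2 * k:]               # coins left off the balance
        weighings += 1
        left_weight = sum(weights[c] for c in left)
        right_weight = sum(weights[c] for c in right)
        if left_weight > right_weight:
            candidates = left                    # left pan goes down
        elif right_weight > left_weight:
            candidates = right                   # right pan goes down
        else:
            candidates = aside                   # pans balance
    return candidates[0], weighings

coins = [1, 1, 1, 1, 1, 2, 1, 1]                 # coin at index 5 is heavier
print(min_weighings(8), find_heavy(coins))
```

Each weighing has three outcomes, so it cuts the candidates to at most a third (rounded up), which is why the number of weighings matches the height bound.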
1. Answer the following questions for the tree shown in Fig. 12.28.
a) Which vertices are the leaves?
b) Which vertex is the root?
c) Which vertex is the parent of g?
d) Which vertices are the descendants of c?
e) Which vertices are the siblings of s?
f) What is the level number of vertex f?
g) Which vertices have level number 4?
Figure 12.28
2. Let T = (V, E) be a binary tree. In Fig. 12.29 we find the subtree of T rooted at vertex p.
(The dashed line coming into vertex p indicates that there is more to the tree T than what
appears in the figure.) If the level number for vertex u is 37, (a) what are the level numbers
for vertices p, s, t, v, w, x, y, and z? (b) how many ancestors does vertex u have? (c) how
many ancestors does vertex y have?
Figure 12.29
3. a) Write the expression (w + x - y)/(a * z^3) in Polish notation, using a rooted tree.
b) What is the value of the expression (in Polish notation) / ↑ a - b c + d * e f,
if a = c = d = e = 2, b = f = 4?
4. Let T = (V, E) be a rooted tree ordered by a universal address system. (a) If vertex v in T
has address 2.1.3.6, what is the smallest number of siblings that v must have? (b) For the
vertex v in part (a), find the address of its parent. (c) How many ancestors does the vertex v
in part (a) have? (d) With the presence of v in T, what other addresses must there be in the
system?
5. For the tree shown in Fig. 12.30, list the vertices according to a preorder traversal, an
inorder traversal, and a postorder traversal.
Figure 12.30
6. List the vertices in the tree shown in Fig. 12.31 when they are visited in a preorder
traversal and in a postorder traversal.
Figure 12.31
7. a) Find the depth-first spanning tree for the graph shown in Fig. 11.72(a) if the order of
the vertices is given as (i) a, b, c, d, e, f, g, h; (ii) h, g, f, e, d, c, b, a;
(iii) a, b, c, d, h, g, f, e.
b) Repeat part (a) for the graph shown in Fig. 11.85(i).
8. Find the breadth-first spanning trees for the graphs and prescribed orders given in
Exercise 7.
9. Let G = (V, E) be an undirected graph with adjacency matrix A(G) as shown here.
              v1 v2 v3 v4 v5 v6 v7 v8
         v1 [  0  1  0  0  0  0  1  0 ]
         v2 [  1  1  0  1  1  0  1  0 ]
         v3 [  0  0  0  1  0  1  0  1 ]
A(G) =   v4 [  0  1  1  0  0  0  0  0 ]
         v5 [  0  1  0  0  0  0  1  0 ]
         v6 [  0  0  1  0  0  1  0  0 ]
         v7 [  1  1  0  0  1  0  0  0 ]
         v8 [  0  0  1  0  0  0  0  0 ]
Use a breadth-first search based on A(G) to determine whether G is connected.
10. a) Let T = (V, E) be a binary tree. If |V| = n, what is the maximum height that T can
attain?
b) If T = (V, E) is a complete binary tree and |V| = n, what is the maximum height that T can
reach in this case?
11. Prove Theorem 12.6 and Corollary 12.1.
12. With m, n, i, $\ell$ as in Theorem 12.6, prove that
a) $n = (m\ell - 1)/(m - 1)$. b) $\ell = [(m - 1)n + 1]/m$.
13. a) A complete ternary (or 3-ary) tree T = (V, E) has 34 internal vertices. How many edges
does T have? How many leaves?
b) How many internal vertices does a complete 5-ary tree with 817 leaves have?
14. The complete binary tree T = (V, E) has V = {a, b, c, ..., i, j, k}. The postorder listing
of V yields d, e, b, h, i, f, j, k, g, c, a. From this information draw T if (a) the height
of T is 3; (b) the height of the left subtree of T is 3.
15. For $m \ge 3$, a complete m-ary tree can be transformed into a complete binary tree by
applying the idea shown in Fig. 12.32.
a) Use this technique to transform the complete ternary decision tree shown in Fig. 12.27(b).
b) If T is a complete quaternary tree of height 3, what is the maximum height that T can have
after it is transformed into a complete binary tree? What is the minimum height?
c) Answer part (b) if T is a complete m-ary tree of height h.
Figure 12.32
16. a) At a men's singles tennis tournament, each of 25 players brings a can of tennis balls.
When a match is played, one can of balls is opened and used, then kept by the loser. The
winner takes the unopened can on to his next match. How many cans of tennis balls will be
opened during this tournament? How many matches are played in the tournament?
b) In how many matches did the tournament champion play?
17. What is the maximum number of internal vertices that a complete quaternary tree of
height 8 can have? What is the number for a complete m-ary tree of height h?
18. On the first Sunday of 2003 Rizzo and Frenchie start a chain letter, each of them sending
five letters (to ten different friends between them). Each person receiving the letter is to
send five copies to five new people on the Sunday following the letter's arrival. After the
first seven Sundays have passed, what is the total number of chain letters that have been
mailed? How many were mailed on the last three Sundays?
19. Use a complete ternary decision tree to repeat Example 12.15 for a set of 12 coins,
exactly one of which is heavier (and counterfeit).
20. Let T = (V, E) be a balanced complete m-ary tree of height $h \ge 2$. If T has $\ell$ leaves
and $b_{h-1}$ internal vertices at level h - 1, explain why $\ell = m^{h-1} + (m - 1)b_{h-1}$.
21. Consider the complete binary trees on 31 vertices. (Here we distinguish left from right
as in Example 12.9.) How many of these trees have 11 vertices in the left subtree of the root?
How many have 21 vertices in the right subtree of the root?
22. For $n \ge 0$, let $a_n$ count the number of complete binary trees on 2n + 1 vertices. (Here we
distinguish left from right as in Example 12.9.) How is $a_{n+1}$ related to
$a_0, a_1, a_2, \ldots, a_{n-1}, a_n$?
23. Consider the following algorithm where the input is a rooted tree with root r.
Step 1: Push r onto the (empty) stack
Step 2: While the stack is not empty
    Pop the vertex at the top of the stack and record its label
    Push the children (going from right to left) of this vertex onto the stack
(The stack data structure was explained in Example 10.43.)
What is the output when this algorithm is applied to (a) the tree in Fig. 12.19? (b) any
rooted tree?
24. Consider the following algorithm where the input is a rooted tree with root r.
Step 1: Push r onto the (empty) stack
Step 2: While the stack is not empty
    If the entry at the top of the stack is not marked
        Then mark it and push its children (right to left) onto the stack
    Else
        Pop the vertex at the top of the stack and record its label
What is the output when the algorithm is applied to (a) the tree in Fig. 12.19? (b) any
rooted tree?
12.3
Trees and Sorting
In Example 10.5, the bubble sort was introduced. There we found that the number of
comparisons needed to sort a list of n items is $n(n - 1)/2$. Consequently, this algorithm
determines a function $h\colon \mathbf{Z}^+ \to \mathbf{R}$ defined by $h(n) = n(n - 1)/2$. This is the (worst-case)
time-complexity function for the algorithm, and we often express this by writing $h \in O(n^2)$.
Consequently, the bubble sort is said to require $O(n^2)$ comparisons. We interpret this to
mean that for large n, the number of comparisons is bounded above by $cn^2$, where c is a
constant that is generally not specified because it depends on such factors as the compiler
and the computer that are used.
In this section we shall study a second method for sorting a given list of n items into
ascending order. The method is called the merge sort, and we shall find that the order of
its worst-case time-complexity function is $O(n \log_2 n)$. This will be accomplished in the
following manner:
1) First we shall measure the number of comparisons needed when n is a power of 2.
Our method will employ a pair of balanced complete binary trees.
2) Then we shall cover the case for general n by using the optional material on
divide-and-conquer algorithms in Section 10.6.
For the case where n is an arbitrary positive integer, we start by considering the following
procedure.
Given a list of n items to sort into ascending order, the merge sort recursively splits the
given list and all subsequent sublists in half (or as close as possible to half) until each sublist
contains a single element. Then the procedure merges these sublists in ascending order until
the original items have been so sorted. The splitting and merging processes can best be
described by a pair of balanced complete binary trees, as in the next example.
EXAMPLE 12.16 Merge Sort. Using the merge sort, Fig. 12.33 sorts the list 6, 2, 7, 3, 4, 9, 5, 1, 8. The tree
at the top of the figure shows how the process first splits the given list into sublists of size
1. The merging process is then outlined by the tree at the bottom of the figure.
Splitting tree (one level per line, sublists separated by |):
    6, 2, 7, 3, 4, 9, 5, 1, 8
    6, 2, 7, 3, 4 | 9, 5, 1, 8
    6, 2, 7 | 3, 4 | 9, 5 | 1, 8
    6, 2 | 7 | 3 | 4 | 9 | 5 | 1 | 8
    6 | 2
Merging tree (one level per line, sublists separated by |):
    6 | 2
    2, 6 | 7 | 3 | 4 | 9 | 5 | 1 | 8
    2, 6, 7 | 3, 4 | 5, 9 | 1, 8
    2, 3, 4, 6, 7 | 1, 5, 8, 9
    1, 2, 3, 4, 5, 6, 7, 8, 9
Figure 12.33
To compare the merge sort to the bubble sort, we want to determine its (worst-case)
time-complexity function. The following lemma will be needed for this task.
LEMMA 12.1 Let $L_1$ and $L_2$ be two sorted lists of ascending numbers, where $L_i$ contains $n_i$ elements, for
i = 1, 2. Then $L_1$ and $L_2$ can be merged into one ascending list L using at most $n_1 + n_2 - 1$
comparisons.
Proof: To merge $L_1$, $L_2$ into list L, we perform the following algorithm.
Step 1: Set L equal to the empty list $\emptyset$.
Step 2: Compare the first elements in $L_1$, $L_2$. Remove the smaller of the two from
the list it is in and place it at the end of L.
Step 3: For the present lists $L_1$, $L_2$ [one change is made in one of these lists each
time step (2) is executed], there are two considerations.
a) If either of $L_1$, $L_2$ is empty, then the other list is concatenated to the end
of L. This completes the merging process.
b) If not, return to step (2).
Each comparison of a number from $L_1$ with one from $L_2$ results in the placement of an
element at the end of list L, so there cannot be more than $n_1 + n_2$ comparisons. When one
of the lists $L_1$ or $L_2$ becomes empty no further comparisons are needed, so the maximum
number of comparisons needed is $n_1 + n_2 - 1$.
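The merge procedure of Lemma 12.1 can be sketched in Python with a counter for the comparisons. The function name and the use of index pointers (instead of physically removing elements from the lists) are implementation choices assumed here; the sequence of comparisons is the same as in steps (1) through (3). The sample inputs are sublists taken from Fig. 12.33.

```python
def merge(list1, list2):
    """Merge two ascending lists; return (merged list, comparisons used)."""
    merged = []
    comparisons = 0
    i = j = 0
    # Step 2, repeated while both lists are nonempty
    while i < len(list1) and j < len(list2):
        comparisons += 1
        if list1[i] <= list2[j]:
            merged.append(list1[i])
            i += 1
        else:
            merged.append(list2[j])
            j += 1
    # Step 3(a): one list is exhausted, so concatenate the rest of the other
    merged.extend(list1[i:])
    merged.extend(list2[j:])
    return merged, comparisons

print(merge([2, 6, 7], [3, 4]))               # bound: 3 + 2 - 1 = 4 comparisons
print(merge([2, 3, 4, 6, 7], [1, 5, 8, 9]))   # bound: 5 + 4 - 1 = 8 comparisons
```

In both calls the count stays below the bound $n_1 + n_2 - 1$, because one list runs out while the other still holds more than one element.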
To determine the (worst-case) time-complexity function of the merge sort, consider a
list of n elements. For the moment, we do not treat the general problem, assuming here
that $n = 2^h$.* In the splitting process, the list of $2^h$ elements is first split into two sublists of
size $2^{h-1}$. (These are the level 1 vertices in the tree representing the splitting process.) As
the process continues, each successive list of size $2^{h-k}$, $h > k$, is at level k and splits into
two sublists of size $(1/2)(2^{h-k}) = 2^{h-k-1}$. At level h the sublists each contain $2^{h-h} = 1$
element.
Reversing the process, we first merge the $n = 2^h$ leaves into $2^{h-1}$ ordered sublists of
size 2. These sublists are at level h - 1 and require $(1/2)(2^h) = 2^{h-1}$ comparisons (one per
pair). As this merging process continues, at each of the $2^k$ vertices at level k, $1 \le k < h$,
there is a sublist of size $2^{h-k}$, obtained from merging the two sublists of size $2^{h-k-1}$ at
its children (on level k + 1). From Lemma 12.1, this merging requires at most
$2^{h-k-1} + 2^{h-k-1} - 1 = 2^{h-k} - 1$ comparisons. When the children of the root are reached, there are
two sublists of size $2^{h-1}$ (at level 1). To merge these sublists into the final list requires at
most $2^{h-1} + 2^{h-1} - 1 = 2^h - 1$ comparisons.
Consequently, for $1 \le k \le h$, at level k there are $2^{k-1}$ pairs of vertices. At each of these
vertices is a sublist of size $2^{h-k}$, so it takes at most $2^{h-k+1} - 1$ comparisons to merge each
pair of sublists. With $2^{k-1}$ pairs of vertices at level k, the total number of comparisons at
level k is at most $2^{k-1}(2^{h-k+1} - 1)$. When we sum over all levels k, where $1 \le k \le h$, we
find that the total number of comparisons is at most
$$\sum_{k=1}^{h} 2^{k-1}\left(2^{h-k+1} - 1\right) = \sum_{k=0}^{h-1} 2^{k}\left(2^{h-k} - 1\right) = \sum_{k=0}^{h-1} 2^{h} - \sum_{k=0}^{h-1} 2^{k} = h \cdot 2^{h} - (2^{h} - 1).$$
With $n = 2^h$, we have $h = \log_2 n$ and
$$h \cdot 2^{h} - (2^{h} - 1) = n \log_2 n - (n - 1) = n \log_2 n - n + 1,$$
where $n \log_2 n$ is the dominating term for large n. Thus the (worst-case) time-complexity
function for this sorting procedure is $g(n) = n \log_2 n - n + 1$ and $g \in O(n \log_2 n)$, for
$n = 2^h$, $h \in \mathbf{Z}^+$. Hence the number of comparisons needed to merge sort a list of n items
is bounded above by $dn \log_2 n$ for some constant d, and for all $n \ge n_0$, where $n_0$ is some
particular (large) positive integer.
*The result obtained here for $n = 2^h$, $h \in \mathbf{N}$, is actually true for all $n \in \mathbf{Z}^+$. However, the derivation for general
n requires the optional material in Section 10.6. That is why this counting argument is included here,
for the benefit of those readers who did not cover Section 10.6.
To show that the order of the merge sort is $O(n \log_2 n)$ for all $n \in \mathbf{Z}^+$, our second
approach will use the result of Exercise 9 from Section 10.6. We state that now:
Let $a, b, c \in \mathbf{Z}^+$, with $b \ge 2$. If $g\colon \mathbf{Z}^+ \to \mathbf{R}^+ \cup \{0\}$ is a monotone increasing function,
where
$g(1) \le c$,
$g(n) \le a\,g(n/b) + cn$, for $n = b^h$, $h \in \mathbf{Z}^+$,
then for the case where a = b, we have $g \in O(n \log n)$, for all $n \in \mathbf{Z}^+$. (The base for the
log function may be any real number greater than 1. Here we shall use the base 2.)
Before we can apply this result to the merge sort, we wish to formulate this sorting
process (illustrated in Fig. 12.33) as a precise algorithm. To do so, we call the procedure
outlined in Lemma 12.1 the "merge" algorithm. Then we shall write "merge ($L_1$, $L_2$)" in
order to represent the application of that procedure to the lists $L_1$, $L_2$, which are in ascending
order.
The algorithm for merge sort is a recursive procedure because it may invoke itself. Here
the input is an array (called List) of n items, such as real numbers.
The MergeSort Algorithm
Step 1: If n = 1, then List is already sorted and the process terminates. If n > 1, then
go to step (2).
Step 2: (Divide the array and sort the subarrays.) Perform the following:
1) Assign m the value $\lceil n/2 \rceil$.
2) Assign to List 1 the subarray
List[1], List[2], ..., List[m].
3) Assign to List 2 the subarray
List[m + 1], List[m + 2], ..., List[n].
4) Apply MergeSort to List 1 (of size m) and to List 2 (of size n - m).
Step 3: Merge (List 1, List 2).
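A recursive Python sketch of the three steps follows. It is an illustration under stated assumptions, not the text's own program: the names `merge` and `merge_sort` are chosen here, Python lists are indexed from 0 rather than 1, and the split point is taken as $m = \lceil n/2 \rceil$ so that the nine-element list of Fig. 12.33 divides into sublists of sizes 5 and 4, as the figure shows.

```python
def merge(list1, list2):
    """Merge two ascending lists into one ascending list (Lemma 12.1)."""
    merged = []
    i = j = 0
    while i < len(list1) and j < len(list2):
        if list1[i] <= list2[j]:
            merged.append(list1[i])
            i += 1
        else:
            merged.append(list2[j])
            j += 1
    merged.extend(list1[i:])
    merged.extend(list2[j:])
    return merged

def merge_sort(items):
    """Return a new list holding the items in ascending order."""
    n = len(items)
    if n <= 1:                         # Step 1: nothing left to sort
        return list(items)
    m = (n + 1) // 2                   # Step 2.1: m = ceil(n / 2)
    list1 = merge_sort(items[:m])      # Steps 2.2 and 2.4: first m items
    list2 = merge_sort(items[m:])      # Steps 2.3 and 2.4: remaining n - m items
    return merge(list1, list2)         # Step 3

print(merge_sort([6, 2, 7, 3, 4, 9, 5, 1, 8]))
```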
The function $g\colon \mathbf{Z}^+ \to \mathbf{R}^+ \cup \{0\}$ will measure the (worst-case) time-complexity for this
algorithm by counting the maximum number of comparisons needed to merge sort an array
of n items. For $n = 2^h$, $h \in \mathbf{Z}^+$, we have
$g(n) = 2g(n/2) + [(n/2) + (n/2) - 1]$.
The term $2g(n/2)$ results from step (2) of the MergeSort algorithm, and the summand
$[(n/2) + (n/2) - 1]$ follows from step (3) of the algorithm and Lemma 12.1.
With g(1) = 0, the preceding equation provides the inequalities
$g(1) = 0 \le 1$,
$g(n) = 2g(n/2) + (n - 1) \le 2g(n/2) + n$, for $n = 2^h$, $h \in \mathbf{Z}^+$.
We also observe that g(1) = 0, g(2) = 1, g(3) = 3, and g(4) = 5, so $g(1) \le g(2) \le
g(3) \le g(4)$. Consequently, it appears that g may be a monotone increasing function. The
proof that it is monotone increasing is similar to that given for the time-complexity function
of binary search. This follows Example 10.49 in Section 10.6, so we leave the details
showing that g is monotone increasing to the Section Exercises.
Now with a = b = 2 and c = 1, the result stated earlier implies that $g \in O(n \log_2 n)$ for
all $n \in \mathbf{Z}^+$.
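The values of g quoted above can be checked numerically. The sketch below assumes that, for general n, the array is split into parts of sizes $\lceil n/2 \rceil$ and $\lfloor n/2 \rfloor$ and that the final merge costs at most n - 1 comparisons (Lemma 12.1); for $n = 2^h$ this reduces to the recurrence $g(n) = 2g(n/2) + (n - 1)$. The memoized function `g` is an illustration, and the checks are evidence, not a proof.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def g(n):
    """Worst-case comparison count: g(1) = 0, g(n) = g(ceil(n/2)) + g(floor(n/2)) + (n - 1)."""
    if n == 1:
        return 0
    return g((n + 1) // 2) + g(n // 2) + (n - 1)

print([g(n) for n in range(1, 5)])

# For n = 2**h the recurrence agrees with the closed form n*log2(n) - n + 1.
for h in range(1, 11):
    n = 2 ** h
    assert g(n) == n * h - n + 1

# Numerical evidence that g does not decrease on 1..200.
assert all(g(n) <= g(n + 1) for n in range(1, 200))
```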
Although $n \log_2 n \le n^2$ for all $n \in \mathbf{Z}^+$, it does not follow that because the bubble sort is
$O(n^2)$ and the merge sort is $O(n \log_2 n)$, the merge sort is more efficient than the bubble sort
for all $n \in \mathbf{Z}^+$. The bubble sort requires less programming effort and generally takes less
time than the merge sort for small values of n (depending on factors such as the programming
language, the compiler, and the computer). However, as n increases, the ratio of the
worst-case running times, as measured by $(cn^2)/(dn \log_2 n) = (c/d)(n/\log_2 n)$, gets arbitrarily
large. Consequently, as the input list increases in size, the $O(n^2)$ algorithm (bubble sort)
takes significantly more time than the $O(n \log_2 n)$ algorithm (merge sort).
For more on sorting algorithms and their time-complexity functions, the reader should
examine [1], [3], [4], [7], and [8] in the chapter references.
1. a) Give an example of two lists $L_1$, $L_2$, each of which is in ascending order and contains
five elements, and where nine comparisons are needed to merge $L_1$, $L_2$ by the algorithm
given in Lemma 12.1.
b) Let $m, n \in \mathbf{Z}^+$ with m < n. Give an example of two lists $L_1$, $L_2$, each of which is in
ascending order, where $L_1$ has m elements, $L_2$ has n elements, and m + n - 1 comparisons
are needed to merge $L_1$, $L_2$ by the algorithm given in Lemma 12.1.
2. Apply the merge sort to each of the following lists. Draw the splitting and merging trees
for each application of the procedure.
a) -1, 0, 2, -2, 3, 6, -3, 5, 1, 4
b) -1, 7, 4, 11, 5, -8, 15, -3, -2, 6, 10, 3
3. Related to the merge sort is a somewhat more efficient procedure called the quick sort.
Here we start with a list L: $a_1, a_2, \ldots, a_n$, and use $a_1$ as a pivot to develop two
sublists $L_1$ and $L_2$ as follows. For i > 1, if $a_i < a_1$, place $a_i$ at the end of the first list
being developed (this is $L_1$ at the end of the process); otherwise, place $a_i$ at the end of the
second list $L_2$.
After all $a_i$, i > 1, have been processed, place $a_1$ at the end of the first list. Now apply
quick sort recursively to each of the lists $L_1$ and $L_2$ to obtain sublists $L_{11}$, $L_{12}$, $L_{21}$,
and $L_{22}$. Continue the process until each of the resulting sublists contains one element. The
sublists are then ordered, and their concatenation gives the ordering sought for the original
list L.
Apply quick sort to each list in Exercise 2.
4. Prove that the function g used in the second method to analyze the (worst-case)
time-complexity of the merge sort is monotone increasing.
12.4
Weighted Trees and Prefix Codes
Among the topics to which discrete mathematics is applied, coding theory is one wherein
different finite structures play a major role. These structures enable us to represent and
transmit information that is coded in terms of the symbols in a given alphabet. For instance,
the way we most often code, or represent, characters internally in a computer is by means
of strings of fixed length, using the symbols 0 and 1.
The codes developed in this section, however, will use strings of different lengths. Why a
person should want to develop such a coding scheme and how the scheme can be constructed
will be our major concerns in this section.
Suppose we wish to develop a way to represent the letters of the alphabet using strings
of 0’s and 1’s. Since there are 26 letters, we should be able to encode these symbols in terms
of sequences of five bits, given that $2^4 < 26 < 2^5$. However, in the English (or any other)
language, not all letters occur with the same frequency. Consequently, it would be more
efficient to use binary sequences of different lengths, with the most frequently occurring
letters (such as e, i, t) represented by the shortest possible sequences. For example, consider
S = {a, e, n, r, t}, a subset of the alphabet. Represent the elements of S by the binary
sequences
a: 01 e: 0 n: 101 r: 10 t: 1.
If the message “ata” is to be transmitted, the binary sequence 01101 is sent. Unfortunately,
this sequence is also transmitted for the messages “etn”, “atet”, and “an”.
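The ambiguity can be made concrete by listing every way a bit string splits into code words. The recursive helper `decodings` below is an assumption introduced for illustration; it tries each code word as a prefix and recurses on the remainder. Applied to 01101 under this scheme, its output includes ata, etn, atet, and an among the possible readings.

```python
def decodings(bits, code):
    """Return every message whose encoding under `code` equals `bits`."""
    if not bits:
        return [""]
    results = []
    for symbol, word in code.items():
        if bits.startswith(word):
            # fix this symbol, then decode whatever remains
            for rest in decodings(bits[len(word):], code):
                results.append(symbol + rest)
    return results

first_scheme = {"a": "01", "e": "0", "n": "101", "r": "10", "t": "1"}
print(sorted(decodings("01101", first_scheme)))
```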
Consider a second encoding scheme, one given by
a: 111 e: 0 n: 1100 r: 1101 t: 10.
Here the message “ata” is represented by the sequence 11110111 and there are no other
possibilities to confuse the situation. What’s more, the labeled complete binary tree shown
in Fig. 12.34 can be used to decode the sequence 11110111. Starting at the root, traverse the
edge labeled 1 to the right child (of the root). Continuing along the next two edges labeled
with 1, we arrive at the leaf labeled a. Hence the unique path from the root to the vertex
at a is unambiguously determined by the first three 1’s in the sequence 11110111. After
we return to the root, the next two symbols in the sequence — namely, 10 — determine the
unique path along the edge from the root to its right child, followed by the edge from that
child to its left child. This terminates at the vertex labeled t. Again returning to the root,
the final three bits of the sequence determine the letter a for a second time. Hence the tree
“decodes” 11110111 as ata.
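The root-to-leaf walk just described can be sketched in Python. The representation is an assumption made for illustration: the labeled tree is stored as nested dictionaries keyed by the edge labels "0" and "1", with each leaf holding its letter, and the code is assumed to be one in which no code word begins another.

```python
def build_tree(code):
    """Build the labeled binary tree as nested dicts; each leaf holds a symbol."""
    root = {}
    for symbol, word in code.items():
        node = root
        for bit in word[:-1]:
            node = node.setdefault(bit, {})   # follow or create an internal vertex
        node[word[-1]] = symbol               # the last edge ends at a leaf
    return root

def decode(bits, code):
    """Decode by walking from the root and restarting at each leaf."""
    root = build_tree(code)
    node = root
    message = []
    for bit in bits:
        node = node[bit]
        if isinstance(node, str):             # reached a leaf
            message.append(node)
            node = root                       # return to the root
    if node is not root:
        raise ValueError("bit string ends in the middle of a code word")
    return "".join(message)

def encode(message, code):
    return "".join(code[symbol] for symbol in message)

second_scheme = {"a": "111", "e": "0", "n": "1100", "r": "1101", "t": "10"}
print(decode("11110111", second_scheme))
print(decode(encode("eaten", second_scheme), second_scheme))
```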
Figure 12.34 (labeled complete binary tree for the second encoding scheme)
Why did the second encoding scheme work out so readily when the first led to ambiguities?
In the first scheme, r is represented as 10 and n as 101. If we encounter the symbols
10, how can we determine whether the symbols represent r or the first two symbols of 101,
which represent n? The problem is that the sequence for r is a prefix of the sequence for
n. This ambiguity does not occur in the second encoding scheme, suggesting the following
definition.
Definition 12.7 A set P of binary sequences (representing a set of symbols) is called a prefix code if no
sequence in P is the prefix of any other sequence in P.
Consequently, the binary sequences 111, 0, 1100, 1101, 10 constitute a prefix code for
the letters a, e, n, r, t, respectively. But how did the complete binary tree of Fig. 12.34
come about? To deal with this problem, we need the following concept.
Definition 12.8 If T is a complete binary tree of height h, then T is called a full binary tree if all the leaves
in T are at level h.
EXAMPLE 12.17 For the prefix code P = {111, 0, 1100, 1101, 10}, the longest binary sequence has length
4. Draw the labeled full binary tree of height 4, as shown in Fig. 12.35. The elements of P
are assigned to the vertices of this tree as follows. For example, the sequence 10 traces the
path from the root r to its right child cr. Then it continues to the left child of cr, where the
box (marked with the asterisk) indicates completion of the sequence. Returning to the root,
the other four sequences are traced out in similar fashion, resulting in the other four boxed
vertices. For each boxed vertex remove the subtree (except for the root) that it determines.
The resulting pruned tree is the complete binary tree of Fig. 12.34, where no “box” is an
ancestor of another “box.”
Figure 12.35
We turn now to a method for determining a labeled tree that models a prefix code, where
the frequency of occurrence of each symbol in the average text is taken into account — in
other words, a prefix code wherein the shorter sequences are used for the more frequently
occurring symbols. If there are many symbols, such as all 26 letters of the alphabet, a
trial-and-error method for constructing such a tree is not efficient. An elegant construction
developed by David A. Huffman (1925-1999) provides a technique for constructing such
trees.
The general problem of constructing an efficient tree can be described as follows.
Let $w_1, w_2, \ldots, w_n$ be a set of positive numbers called weights, where $w_1 \le w_2 \le
\cdots \le w_n$. If T = (V, E) is a complete binary tree with n leaves, assign these weights (in
any one-to-one manner) to the n leaves. The result is called a complete binary tree for the
weights $w_1, w_2, \ldots, w_n$. The weight of the tree, denoted W(T), is defined as $\sum_{i=1}^{n} w_i \ell(w_i)$
where, for each $1 \le i \le n$, $\ell(w_i)$ is the level number of the leaf assigned the weight $w_i$.
The objective is to assign the weights so that W(T) is as small as possible. A complete
binary tree $T'$ for these weights is said to be an optimal tree if $W(T') \le W(T)$ for any other
complete binary tree T for the weights.
Figure 12.36 shows two complete binary trees for the weights 3, 5, 6, and 9. For tree $T_1$,
$W(T_1) = \sum_{i=1}^{4} w_i \ell(w_i) = (3 + 9 + 5 + 6) \cdot 2 = 46$ because each leaf has level number 2.
In the case of $T_2$, $W(T_2) = 3 \cdot 3 + 5 \cdot 3 + 6 \cdot 2 + 9 \cdot 1 = 45$, which we shall find is optimal.
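The two weights can be recomputed directly from the definition of W(T). In this sketch a tree is summarized as a list of (weight, level number) pairs for its leaves; that representation and the function name are assumptions made for illustration.

```python
def tree_weight(leaves):
    """W(T): the sum of w * l(w) over the leaves, given (weight, level) pairs."""
    return sum(weight * level for weight, level in leaves)

t1 = [(3, 2), (5, 2), (6, 2), (9, 2)]   # every leaf at level 2
t2 = [(3, 3), (5, 3), (6, 2), (9, 1)]   # heavier weights nearer the root

print(tree_weight(t1), tree_weight(t2))
```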
Figure 12.36: complete binary trees $T_1$ and $T_2$ for the weights 3, 5, 6, 9
Figure 12.37
The major idea behind Huffman's construction is that in order to obtain an optimal tree
T for the n weights $w_1, w_2, w_3, \ldots, w_n$, one considers an optimal tree $T'$ for the n - 1
weights $w_1 + w_2, w_3, \ldots, w_n$. (It cannot be assumed that $w_1 + w_2 \le w_3$.) In particular,
the tree $T'$ is transformed into T by replacing the leaf v having weight $w_1 + w_2$ by a tree
rooted at v of height 1 with left child of weight $w_1$ and right child of weight $w_2$. To illustrate,
if the tree $T_2$ in Fig. 12.36 is optimal for the four weights 1 + 2, 5, 6, 9, then the tree in
Fig. 12.37 will be optimal for the five weights 1, 2, 5, 6, 9.
We need the following lemma to establish these claims.
LEMMA 12.2 If $T$ is an optimal tree for the $n$ weights $w_1 \le w_2 \le \cdots \le w_n$, then there exists an optimal
tree $T'$ in which the leaves of weights $w_1$ and $w_2$ are siblings at the maximal level (in $T'$).
Proof: Let $v$ be an internal vertex of $T$ where the level number of $v$ is maximal for all
internal vertices. Let $w_x$ and $w_y$ be the weights assigned to the children $x$, $y$ of vertex
$v$, with $w_x \le w_y$. By the choice of vertex $v$, $\ell(w_x) = \ell(w_y) \ge \ell(w_1), \ell(w_2)$. Consider the
case of $w_1 < w_x$. (If $w_1 = w_x$, then $w_1$ and $w_x$ can be interchanged and we would consider
the case of $w_2 < w_y$. Applying the following proof to this case, we would find that $w_y$ and
$w_2$ can be interchanged.)
If $\ell(w_x) > \ell(w_1)$, let $\ell(w_x) = \ell(w_1) + j$, for some $j \in \mathbf{Z}^+$. Then
$w_1 \ell(w_1) + w_x \ell(w_x) = w_1 \ell(w_1) + w_x[\ell(w_1) + j] = w_1 \ell(w_1) + w_x j + w_x \ell(w_1) > w_1 \ell(w_1) + w_1 j + w_x \ell(w_1) = w_1 \ell(w_x) + w_x \ell(w_1)$.
So $W(T) = w_1 \ell(w_1) + w_x \ell(w_x) + \sum_{i \ne 1, x} w_i \ell(w_i) > w_1 \ell(w_x) + w_x \ell(w_1) + \sum_{i \ne 1, x} w_i \ell(w_i)$.
Consequently, by interchanging the locations of
the weights $w_1$ and $w_x$, we obtain a tree of smaller weight. But this contradicts the choice
of $T$ as an optimal tree. Therefore $\ell(w_x) = \ell(w_1) = \ell(w_y)$. In a similar manner, it can be
shown that $\ell(w_y) = \ell(w_2)$, so $\ell(w_x) = \ell(w_y) = \ell(w_1) = \ell(w_2)$. Interchanging the locations
of the pair $w_1$, $w_x$, and the pair $w_2$, $w_y$, we obtain an optimal tree $T'$, where $w_1$, $w_2$
are siblings.
From this lemma we see that smaller weights will appear at the higher levels (and thus
have higher level numbers) in an optimal tree.
THEOREM 12.8 Let $T$ be an optimal tree for the weights $w_1 + w_2, w_3, \ldots, w_n$, where $w_1 \le w_2 \le w_3 \le \cdots \le w_n$.
At the leaf with weight $w_1 + w_2$ place a (complete) binary tree of height 1 and
assign the weights $w_1$, $w_2$ to the children (leaves) of this former leaf. The new binary tree
$T_1$ so constructed is then optimal for the weights $w_1, w_2, w_3, \ldots, w_n$.
Proof: Let $T_2$ be an optimal tree for the weights $w_1, w_2, \ldots, w_n$, where the leaves for
weights $w_1$, $w_2$ are siblings. Remove the leaves of weights $w_1$, $w_2$ and assign the weight
$w_1 + w_2$ to their parent (now a leaf). This complete binary tree is denoted $T_3$ and $W(T_2) = W(T_3) + w_1 + w_2$.
Also, $W(T_1) = W(T) + w_1 + w_2$. Since $T$ is optimal, $W(T) \le W(T_3)$.
If $W(T) < W(T_3)$, then $W(T_1) < W(T_2)$, contradicting the choice of $T_2$ as optimal. Hence
$W(T) = W(T_3)$ and, consequently, $W(T_1) = W(T_2)$. So $T_1$ is optimal for the weights
$w_1, w_2, \ldots, w_n$.
Remark. The preceding proof started with an optimal tree $T_2$, whose existence rests on the
fact that there is only a finite number of ways in which we can assign n weights to a complete
binary tree with n leaves. Consequently, with a finite number of assignments there is at least
one where W(T) is minimal. But finite numbers can be large. This proof establishes the
existence of an optimal tree for a set of weights and develops a way for constructing such
a tree. To construct such a (Huffman) tree we consider the following algorithm.
Given the $n$ ($\ge 2$) weights $w_1, w_2, \ldots, w_n$, proceed as follows:
Step 1: Assign the given weights, one each to a set $S$ of $n$ isolated vertices. [Each
vertex is the root of a complete binary tree (of height 0) with a weight assigned
to it.]
Step 2: While $|S| > 1$ perform the following:
a) Find two trees $T$, $T'$ in $S$ with the smallest two root weights $w$, $w'$,
respectively.
b) Create the new (complete binary) tree $T^*$ with root weight $w^* = w + w'$
and having $T$, $T'$ as its left and right subtrees, respectively.
c) Place $T^*$ in $S$ and delete $T$ and $T'$. [Where $|S| = 1$, the one complete
binary tree in $S$ is a Huffman tree.]
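The two steps above can be sketched in Python. This is only an illustration under stated assumptions: the set S is kept in a binary heap (`heapq`), a tree is represented by a hypothetical tuple `(weight, left, right)` with a leaf written as `(weight, symbol, None)`, and the names `huffman_tree` and `code_lengths` are not from the text. Ties between equal root weights are broken by insertion order, which is one of several valid choices.

```python
import heapq
from itertools import count


def huffman_tree(weights):
    """Build a Huffman tree for a dict {symbol: weight}."""
    tiebreak = count()  # keeps heap entries comparable when weights tie
    # Step 1: one isolated root (a tree of height 0) per weight.
    heap = [(w, next(tiebreak), (w, s, None)) for s, w in weights.items()]
    heapq.heapify(heap)
    # Step 2: repeatedly join the two trees with the smallest root weights.
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (w1 + w2, t1, t2)))
    return heap[0][2]


def code_lengths(tree, depth=0):
    """Map each symbol to the level number of its leaf."""
    _, left, right = tree
    if right is None:  # a leaf: `left` holds the symbol
        return {left: depth}
    lengths = code_lengths(left, depth + 1)
    lengths.update(code_lengths(right, depth + 1))
    return lengths


# Frequencies taken from Example 12.18 of this section.
freqs = {"a": 20, "o": 28, "q": 4, "u": 17, "y": 12, "z": 7}
lengths = code_lengths(huffman_tree(freqs))
total = sum(freqs[s] * lengths[s] for s in freqs)
print(lengths, total)
```

A heap is used here only so that finding the two smallest root weights is cheap; a plain list searched with `min` would follow the stated steps equally well.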
We now use this algorithm in the following example.
EXAMPLE 12.18
Construct an optimal prefix code for the symbols a, o, q, u, y, z that occur (in a given
sample) with frequencies 20, 28, 4, 17, 12, 7, respectively.
Figure 12.38 shows the construction that follows Huffman's procedure. In part (b)
weights 4 and 7 are combined so that we then consider the construction for the weights 11,
12, 17, 20, 28. At each step [in parts (c)-(f) of Fig. 12.38] we create a tree with subtrees
rooted at the two smallest weights. These two smallest weights belong to vertices each of
which is originally either isolated (a tree with just a root) or the root of a tree obtained
earlier in the construction. From the last result, a prefix code is determined as
a: 01   o: 11   q: 1000   u: 00   y: 101   z: 1001.
[Figure 12.38: parts (a)-(f), the Huffman construction for the weights 4, 7, 12, 17, 20, 28]
Different prefix codes may result from the way the trees $T$, $T'$ are selected and assigned as
the left or right subtree in steps 2(a) and 2(b) in our algorithm and from the assignment of
0 or 1 to the branches (edges) of our final (Huffman) tree.
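Whatever choices are made, the result must still be a prefix code. The sketch below checks that property for the code found in Example 12.18 and decodes a bit string with it. The helper names `is_prefix_code` and `decode` and the sample word are illustrative assumptions, not part of the text.

```python
def is_prefix_code(code):
    """True if no code word is a prefix of (or equal to) another."""
    words = sorted(code.values())
    # In sorted order, a word that prefixes any other word also
    # prefixes its immediate successor, so adjacent pairs suffice.
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


def decode(bits, code):
    """Decode a bit string by reading code words from left to right."""
    inverse = {word: symbol for symbol, word in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a code word")
    return "".join(out)


# The prefix code obtained in Example 12.18.
code = {"a": "01", "o": "11", "q": "1000", "u": "00", "y": "101", "z": "1001"}
encoded = "".join(code[s] for s in "quay")
print(is_prefix_code(code), encoded, decode(encoded, code))
```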
EXERCISES 12.4
1. For the prefix code given in Fig. 12.34, decode the sequences
(a) 1001111101; (b) 10111100110001101; (c) 1101111110010.
2. A code for {a, b, c, d, e} is given by a: 00  b: 01  c: 101
d: x10  e: yz1, where $x, y, z \in \{0, 1\}$. Determine x, y, and z
so that the given code is a prefix code.
3. Construct an optimal prefix code for the symbols
a, b, c, ..., i, j that occur (in a given sample) with respective
frequencies 78, 16, 30, 35, 125, 31, 20, 50, 80, 3.
4. How many leaves does a full binary tree have if its height is
(a) 3? (b) 7? (c) 12? (d) h?
5. Let $T = (V, E)$ be a complete m-ary tree of height h. This
tree is called a full m-ary tree if all of its leaves are at level h.
If T is a full m-ary tree with height 7 and 279,936 leaves, how
many internal vertices are there in T?
6. Let T be a full m-ary tree with height h and v vertices. Determine
h in terms of m and v.
7. Using the weights 2, 3, 5, 10, 10, show that the height of
a Huffman tree for a given set of weights is not unique. How
would you modify the algorithm so as to always produce a Huffman
tree of minimal height for the given weights?
8. Let $L_i$, for $1 \le i \le 4$, be four lists of numbers, each sorted
in ascending order. The numbers of entries in these lists are 75,
40, 110, and 50, respectively.
a) How many comparisons are needed to merge these four
lists by merging $L_1$ and $L_2$, merging $L_3$ and $L_4$, and then
merging the two resulting lists?
b) How many comparisons are needed if we first merge $L_1$
and $L_2$, then merge the result with $L_3$, and finally merge this
result with $L_4$?
c) In order to minimize the total number of comparisons in
this merging of the four lists, what order should the merging
follow?
d) Extend the result in part (c) to n sorted lists $L_1, L_2, \ldots, L_n$.
12.5
Biconnected Components and Articulation Points
Let G = (V, E) be the loop-free connected undirected graph shown in Fig. 12.39(a), where
each vertex represents a communication center. Here an edge {x, y} indicates the existence
of a communication link between the centers at x and y.
[Figure 12.39: parts (a) and (b)]
By splitting the vertices at c and f, in the suggested fashion, we obtain the collection of
subgraphs in part (b) of the figure. These vertices are examples of the following.
Definition 12.9 A vertex $v$ in a loop-free undirected graph $G = (V, E)$ is called an articulation point
if $\kappa(G - v) > \kappa(G)$; that is, the subgraph $G - v$ has more components than the given
graph $G$.
A loop-free connected undirected graph with no articulation points is called biconnected.
A biconnected component of a graph is a maximal biconnected subgraph — a bicon-
nected subgraph that is not properly contained in a larger biconnected subgraph.
The graph shown in Fig. 12.39 has the two articulation points, c and f, and its four
biconnected components are shown in part (b) of the figure.
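Definition 12.9 can be applied directly, one vertex at a time, by counting components before and after deleting the vertex. The sketch below does exactly that. It is a brute-force illustration, not the algorithm developed later in this section, and the small graph (two triangles sharing the vertex c) is a hypothetical example rather than the graph of Fig. 12.39.

```python
def count_components(adj, removed=None):
    """Count the components of the graph, optionally with one vertex deleted."""
    seen = set() if removed is None else {removed}
    components = 0
    for start in adj:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:  # iterative depth-first traversal of one component
            v = stack.pop()
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
    return components


def articulation_points_by_definition(adj):
    """Vertices v with kappa(G - v) > kappa(G)."""
    base = count_components(adj)
    return {v for v in adj if count_components(adj, removed=v) > base}


# Hypothetical graph: triangles a-b-c and c-d-e sharing the vertex c.
two_triangles = {
    "a": {"b", "c"}, "b": {"a", "c"},
    "c": {"a", "b", "d", "e"},
    "d": {"c", "e"}, "e": {"c", "d"},
}
print(articulation_points_by_definition(two_triangles))
```

Each vertex costs one full traversal here, so the total work grows with the product of the number of vertices and the size of the graph.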
In terms of communication centers and links, the articulation points of the graph in-
dicate where the system is most vulnerable. Without articulation points, such a system is
more likely to survive disruptions at a communication center, regardless of whether these
disruptions are caused by the breakdown of a technical device or by external forces.
The problem of finding the articulation points in a connected graph provides an applica-
tion for the depth-first spanning tree. The objective here is the development of an algorithm
that determines the articulation points of a loop-free connected undirected graph. If no
such points exist, then the graph is biconnected. Should such vertices exist, the resulting
biconnected components can be used to provide information about such properties as the
planarity and chromatic number of the given graph.
The following preliminaries are needed for developing this algorithm.
Returning to Fig. 12.39(a), we see that there are four paths from a to e, namely,
(1) a → c → e; (2) a → c → d → e; (3) a → b → c → e; and (4) a → b → c → d → e.
Now what do these four paths have in common? They all pass through the vertex c, one of
the articulation points of G. This observation now motivates our first preliminary result.
LEMMA 12.3 Let $G = (V, E)$ be a loop-free connected undirected graph with $z \in V$. The vertex $z$ is an
articulation point of $G$ if and only if there exist distinct $x, y \in V$ with $x \ne z$, $y \ne z$, and
such that every path in $G$ connecting $x$ and $y$ contains the vertex $z$.
Proof: This result follows from Definition 12.9. A proof is requested of the reader in the
Section Exercises.
Our next lemma provides an important and useful property of the depth-first spanning
tree.
LEMMA 12.4 Let $G = (V, E)$ be a loop-free connected undirected graph with $T = (V, E')$ a depth-first
spanning tree of $G$. If $\{a, b\} \in E$ but $\{a, b\} \notin E'$, then $a$ is either an ancestor or a descendant
of $b$ in the tree $T$.
Proof: From the depth-first spanning tree $T$, we obtain a preorder listing for the vertices in
$V$. For all $v \in V$, let dfi($v$) denote the depth-first index of vertex $v$, that is, the position
of $v$ in the preorder listing. Assume that dfi($a$) < dfi($b$). Consequently, $a$ is encountered
before $b$ in the preorder traversal of $T$, so $a$ cannot be a descendant of $b$. If, in addition,
vertex $a$ is not an ancestor of $b$, then $b$ is not in the subtree $T_a$ of $T$ rooted at $a$. But when we
backtrack (through $T_a$) to $a$, we find that because $\{a, b\} \in E$, it should have been possible
for the depth-first search to go from $a$ to $b$ and to use the edge $\{a, b\}$ in $T$. This contradiction
shows that $b$ is in $T_a$, so $a$ is an ancestor of $b$.
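Lemma 12.4 can be checked experimentally on a small graph. The sketch below builds a depth-first spanning tree with the vertices taken in alphabetical order and then confirms that every edge of G outside the tree joins a vertex to one of its ancestors. The graph and the helper names (`dfs_parents`, `is_ancestor`, `non_tree_edges_are_back_edges`) are hypothetical and chosen only for illustration.

```python
def dfs_parents(adj, root):
    """Return the parent map of a depth-first spanning tree rooted at root."""
    parent = {root: None}

    def visit(v):
        for w in sorted(adj[v]):  # prescribed (alphabetical) order
            if w not in parent:
                parent[w] = v
                visit(w)

    visit(root)
    return parent


def is_ancestor(parent, a, b):
    """True if a is a proper ancestor of b in the tree."""
    while parent[b] is not None:
        b = parent[b]
        if b == a:
            return True
    return False


def non_tree_edges_are_back_edges(adj, root):
    """Check the conclusion of Lemma 12.4 for one choice of root."""
    parent = dfs_parents(adj, root)
    for a in adj:
        for b in adj[a]:
            if parent[a] == b or parent[b] == a:
                continue  # {a, b} is an edge of the tree
            if not (is_ancestor(parent, a, b) or is_ancestor(parent, b, a)):
                return False
    return True


# Hypothetical connected graph used only for this check.
sample = {
    "a": {"b", "c"}, "b": {"a", "c"},
    "c": {"a", "b", "d", "e"},
    "d": {"c", "e"}, "e": {"c", "d"},
}
print(all(non_tree_edges_are_back_edges(sample, r) for r in sample))
```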
If $G = (V, E)$ is a loop-free connected undirected graph, let $T = (V, E')$ be a depth-first
spanning tree for $G$, as shown in Fig. 12.40. By Lemma 12.4, the dotted edge $\{a, b\}$, which
is not part of $T$, indicates an edge that could exist in $G$. Such an edge is called a back edge
(relative to $T$), and here $a$ is an ancestor of $b$. [Here dfi($a$) = 3, whereas dfi($b$) = 6.] The
dotted edge $\{b, d\}$ in the figure cannot exist in $G$, also because of Lemma 12.4. Thus all
edges of $G$ are either edges in $T$ or back edges (relative to $T$).
[Figure 12.40: a depth-first spanning tree with its root marked]
Our next example provides further insight into the relationship between the articulation
points of a graph G and a depth-first spanning tree of G.
EXAMPLE 12.19
In part (1) of Fig. 12.41 we have a loop-free connected undirected graph $G = (V, E)$.
Applying Lemma 12.3 to vertex a, for example, we find that the only path in G from b
to c passes through a. In the case of vertex d, we apply the same lemma and consider the
vertices a and h. Now we find that although there are four paths from a to h, all four pass
through vertex d. Consequently, vertices a and d are two of the articulation points in G.
The vertex h is the only other articulation point. Can you find two vertices in G for which
all connecting paths (for these vertices) in G pass through h?
[Figure 12.41: (1) $G = (V, E)$; (2) $T' = (V, E')$; (3) $G = (V, E)$; (4) $T'' = (V, E'')$; (5) $G = (V, E)$]
Applying the depth-first search algorithm, with the vertices of G ordered alphabetically,
in part (2) of Fig. 12.41, we find the depth-first spanning tree $T' = (V, E')$ for G, where
a has been chosen as the root. The parenthesized integer next to each vertex indicates the
order in which that vertex is visited during the prescribed depth-first search. Part (3) of the
figure incorporates the three back edges (relative to $T'$, in G) that are missing from part (2).
For the tree $T'$, the root a, which is an articulation point in G, has more than one child.
The articulation point d has a child, namely g, with no back edge from g or any of its
descendants (h and j) to an ancestor of d [as we see in part (3) of Fig. 12.41]. The same is
true for the articulation point h. Its child j has (no children and) no back edge to an ancestor
of h.
In part (4) of the figure, $T'' = (V, E'')$ is the depth-first spanning tree for the vertices
ordered alphabetically once again, but this time vertex g has been chosen as the root. As
in part (2) of the figure, the parenthesized integer next to each vertex indicates the order in
which that vertex is visited during this depth-first search. The three back edges (relative to
$T''$, in G) that are missing from $T''$ are shown in part (5) of the figure.
The root g of $T''$ has only one child and g is not an articulation point in G. Further, for
each of the articulation points there is at least one child with no back edge from that child
or one of its descendants to an ancestor of the articulation point. To be more specific, from
part (5) of Fig. 12.41 we find that for the articulation point a we may use any of the children
b, c, or i, but not f; for d that child is a; and for h the child is j.
The observations made in Example 12.19 now lead us to the following.
LEMMA 12.5 Let $G = (V, E)$ be a loop-free connected undirected graph with $T = (V, E')$ a depth-first
spanning tree of $G$. If $r$ is the root of $T$, then $r$ is an articulation point of $G$ if and only if $r$
has at least two children in $T$.
Proof: If $r$ has only one child, say $c$, then all the other vertices of $G$ are descendants of
$c$ (and $r$) in $T$. So if $x$, $y$ are two distinct vertices of $T$, neither of which is $r$, then in the
subtree $T_c$, rooted at $c$, there is a path from $x$ to $y$. Since $r$ is not a vertex in $T_c$, $r$ is not
on this path. Consequently, $r$ is not an articulation point in $G$, by virtue of Lemma 12.3.
Conversely, let $r$ be the root of the depth-first spanning tree $T$ and let $c_1$, $c_2$ be children of
$r$. Let $x$ be a vertex in $T_{c_1}$, the subtree of $T$ rooted at $c_1$. Similarly, let $y$ be a vertex in $T_{c_2}$,
the subtree of $T$ rooted at $c_2$. Could there be a path from $x$ to $y$ in $G$ that avoids $r$? If so,
there is an edge $\{v_1, v_2\}$ in $G$ with $v_1$ in $T_{c_1}$ and $v_2$ in $T_{c_2}$. But this contradicts Lemma 12.4.
Our final preliminary result settles the issue of when a vertex, that is not the root of a
depth-first spanning tree, is an articulation point of a graph.
LEMMA 12.6 Let $G = (V, E)$ be a loop-free connected undirected graph with $T = (V, E')$ a depth-first
spanning tree for $G$. Let $r$ be the root of $T$ and let $v \in V$, $v \ne r$. Then $v$ is an articulation
point of $G$ if and only if there exists a child $c$ of $v$ with no back edge (relative to $T$, in $G$)
from a vertex in $T_c$, the subtree rooted at $c$, to an ancestor of $v$.
Proof: Suppose that vertex $v$ has a child $c$ such that there is no back edge (relative to $T$,
in $G$) from a vertex in $T_c$ to an ancestor of $v$. Then every path (in $G$) from $r$ to $c$ passes
through $v$. From Lemma 12.3 it then follows that $v$ is an articulation point of $G$.
To establish the converse, let the nonroot vertex $v$ of $T$ satisfy the following: For each
child $c$ of $v$ there is a back edge (relative to $T$, in $G$) from a vertex in $T_c$, the subtree rooted
at $c$, to an ancestor of $v$. Now let $x, y \in V$ with $x \ne v$, $y \ne v$. We consider the following
three possibilities:
1) If neither $x$ nor $y$ is a descendant of $v$, as in part (1) of Fig. 12.42, delete from $T$ the
subtree $T_v$ rooted at $v$. The resulting subtree (of $T$) contains $x$, $y$ and a path from $x$
to $y$ that does not pass through $v$, so $v$ is not an articulation point of $G$.
[Figure 12.42: parts (1), (2), and (3)]
2) If one of $x$, $y$ (say, $x$) is a descendant of $v$ but $y$ is not, then $x$ is a child of $v$ or a
descendant of a child $c$ of $v$ [as in part (2) of Fig. 12.42]. From the hypothesis there
is a back edge (relative to $T$, in $G$) from some $z \in T_c$ to an ancestor $w$ of $v$. Since
$x, z \in T_c$, there is a path $p_1$ from $x$ to $z$ (that does not pass through $v$). Then, as neither
$w$ nor $y$ is a descendant of $v$, from part (1) there is a path $p_2$ from $w$ to $y$ that does
not pass through $v$. The edges in $p_1$, $p_2$ together with the edge $\{z, w\}$ provide a path
from $x$ to $y$ that does not pass through $v$, and once again, $v$ is not an articulation
point.
3) Finally, suppose that both $x$, $y$ are descendants of $v$, as in part (3) of Fig. 12.42. Here
$c_1$, $c_2$ are children of $v$ (perhaps with $c_1 = c_2$) and $x$ is a vertex in $T_{c_1}$, the subtree
rooted at $c_1$, while $y$ is a vertex in $T_{c_2}$, the subtree rooted at $c_2$. From the hypothesis,
there exist back edges $\{d_1, a_1\}$ and $\{d_2, a_2\}$ (relative to $T$, in $G$), where $d_1$, $d_2$ are
descendants of $v$ and $a_1$, $a_2$ are ancestors of $v$. Further, there is a path $p_1$ from $x$ to
$d_1$ in $T_{c_1}$ and a path $p_2$ from $y$ to $d_2$ in $T_{c_2}$. As neither $a_1$ nor $a_2$ is a descendant of $v$,
from part (1) we have a path $p$ (in $T$) from $a_1$ to $a_2$, where $p$ avoids $v$. Now we can
do the following: (i) Go from $x$ to $d_1$ using path $p_1$; (ii) Go from $d_1$ to $a_1$ on the edge
$\{d_1, a_1\}$; (iii) Continue to $a_2$ using path $p$; (iv) Go from $a_2$ to $d_2$ on the edge $\{a_2, d_2\}$;
and (v) Finish at $y$ using the path $p_2$ from $d_2$ to $y$. This provides a path from $x$ to $y$
that avoids $v$, so $v$ is not an articulation point of $G$ and this completes the proof.
Using the results from the preceding four lemmas, we once again start with a loop-free
connected undirected graph $G = (V, E)$ with depth-first spanning tree $T$. For $v \in V$, where
$v$ is not the root of $T$, we let $T_{v,c}$ be the subtree consisting of edge $\{v, c\}$ ($c$ a child of $v$)
together with the tree $T_c$ rooted at $c$. If there is no back edge from a descendant of $v$ in
$T_{v,c}$ to an ancestor of $v$ (and $v$ has at least one ancestor, the root of $T$), then the splitting
of vertex $v$ results in the separation of $T_{v,c}$ from $G$, and $v$ is an articulation point. If no
other articulation points of $G$ occur in $T_{v,c}$, then the addition to $T_{v,c}$ of all other edges in $G$
determined by the vertices in $T_{v,c}$ (the subgraph of $G$ induced by the vertices in $T_{v,c}$) results
in a biconnected component of $G$. A root has no ancestors, and it is an articulation point if
and only if it has more than one child.
The depth-first spanning tree preorders the vertices of $G$. For $x \in V$ let dfi($x$) denote the
depth-first index of $x$ in that preorder. If $y$ is a descendant of $x$, then dfi($x$) < dfi($y$). For $y$
an ancestor of $x$, dfi($x$) > dfi($y$). Define low($x$) = min{dfi($y$) | $y$ is adjacent in $G$ to either
$x$ or a descendant of $x$}. If $z$ is the parent of $x$ (in $T$), then there are two possibilities to
consider:
1) low($x$) = dfi($z$): In this case $T_x$, the subtree rooted at $x$, contains no vertex that is
adjacent to an ancestor of $z$ by means of a back edge of $T$. Hence $z$ is an articulation
point of $G$. If $T_x$ contains no articulation points, then $T_x$ together with edge $\{z, x\}$
spans a biconnected component of $G$ (that is, the subgraph of $G$ induced by vertex
$z$ and the vertices in $T_x$ is a biconnected component of $G$). Now remove $T_x$ and the
edge $\{z, x\}$ from $T$, and apply this idea to the remaining subtree of $T$.
2) low($x$) < dfi($z$): Here there is a descendant of $z$ in $T_x$ that is joined [by a back edge
(relative to $T$, in $G$)] to an ancestor of $z$.
To deal in an efficient manner with these ideas, we develop the following algorithm.
Let $G = (V, E)$ be a loop-free connected undirected graph.
Step 1: Find the depth-first spanning tree $T$ for $G$ according to a prescribed order.
Let $x_1, x_2, \ldots, x_n$ be the vertices of $G$ preordered by $T$. Then dfi($x_j$) = $j$ for all
$1 \le j \le n$.
Step 2: Start with $x_n$ and continue back to $x_{n-1}, x_{n-2}, \ldots, x_3, x_2, x_1$, determining
low($x_j$), for $j = n, n-1, n-2, \ldots, 3, 2, 1$, recursively, as follows:
a) low'($x_j$) = min{dfi($z$) | $z$ is adjacent in $G$ to $x_j$}.
b) If $c_1, c_2, \ldots, c_m$ are the children of $x_j$, then low($x_j$) = min{low'($x_j$),
low($c_1$), low($c_2$), ..., low($c_m$)}. [No problem arises here, for the vertices
are examined in the reverse order to the given preorder. Consequently,
if $c$ is a child of $p$, then low($c$) is determined before low($p$).]
Step 3: Let $w_j$ be the parent of $x_j$ in $T$. If low($x_j$) = dfi($w_j$), then $w_j$ is an articulation
point of $G$, unless $w_j$ is the root of $T$ and $w_j$ has no child in $T$ other than $x_j$. Moreover,
in either situation the subtree rooted at $x_j$ together with the edge $\{w_j, x_j\}$ is part of
a biconnected component of $G$.
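The three steps can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive implementation: the graph is given as a dictionary of neighbor sets, the prescribed order is alphabetical, and the names `articulation_points`, `low_prime`, and `build_graph` are introduced only for this sketch. The edge list is the one described in the text of Example 12.20 (tree edges, back edges, and the cycle on b, c, d), since the figure itself is not reproduced here.

```python
def build_graph(edges):
    """Adjacency sets for a loop-free undirected graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def articulation_points(adj, root):
    # Step 1: depth-first spanning tree and preorder x_1, ..., x_n.
    dfi, parent, order = {}, {root: None}, []

    def visit(v):
        order.append(v)
        dfi[v] = len(order)
        for w in sorted(adj[v]):  # prescribed (alphabetical) order
            if w not in dfi:
                parent[w] = v
                visit(w)

    visit(root)
    children = {v: [] for v in order}
    for v in order[1:]:
        children[parent[v]].append(v)

    # Step 2: low'(x_j) and low(x_j), for j = n, n-1, ..., 1.
    low = {}
    for x in reversed(order):
        low_prime = min(dfi[z] for z in adj[x])
        low[x] = min([low_prime] + [low[c] for c in children[x]])

    # Step 3: compare low(x_j) with dfi of the parent w_j.
    points = set()
    for x in order[1:]:
        w = parent[x]
        if low[x] == dfi[w] and (w != root or len(children[root]) > 1):
            points.add(w)
    return dfi, low, points


# Edges as described in Example 12.20 (an inferred edge list).
g = build_graph([("d", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "a"),
                 ("a", "f"), ("f", "e"), ("f", "d"), ("f", "g"), ("g", "h")])
dfi, low, points = articulation_points(g, "d")
print(dfi, low, points)
```

The recursion in `visit` is fine for small examples; for very large graphs an explicit stack would avoid Python's recursion limit.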
EXAMPLE 12.20
We apply this algorithm to the graph $G = (V, E)$ shown in part (i) of Fig. 12.43.
[Figure 12.43: parts (i) through (v)]
In part (ii) of the figure we have the depth-first spanning tree $T = (V, E')$ for G with
d as the root. (Here the order followed for the vertices of G is alphabetic.) Next to each
vertex v of T [in part (ii)] is the dfi(v). These labels tell us the order in which the vertices
of G are first visited.
For step (2) of the algorithm we go in the reverse order from the depth-first search
and start with vertex h (= $x_8$). Since $\{g, h\} \in E$ and h is not adjacent to any other vertex
of G we have low'(h) = dfi(g) [= dfi($x_7$)] = 7. Further, as h has no children, it follows
that low(h) = low'(h) = 7. This accounts for the label (7, 7) [= (low'(h), low(h))] next
to h in part (iii) of Fig. 12.43. Continuing next with g, and then f, we obtain the labels
(6, 6) for g, and (1, 1) for f, since low'(g) = low(g) = 6 and low'(f) = low(f) = 1.
Since $\{a, e\}, \{a, f\} \in E$ with dfi(e) = 4 and dfi(f) = 6, for vertex a we have low'(a) =
min{4, 6} = 4. Then we find that low(a) = min{4, low(f)} = min{4, 1} = 1. Hence the
label (4, 1) for vertex a. Continuing back through e, c, b, and d, we obtain the labels
(low'($x_i$), low($x_i$)) for i = 4, 3, 2, 1. Consequently, by applying step (2) of the algorithm
we arrive at the tree in Fig. 12.43 (iii).
In part (iv) of Fig. 12.43 the ordered pair next to each vertex v is (dfi(v), low(v)).
Applying step (3) of the algorithm to the tree in part (iv), at this point we go in reverse
order once again. First we deal with vertex h (= $x_8$). Since g is the parent of h (in T) and
low(h) = 7 = dfi(g), g is an articulation point of G and the edge {h, g} is a biconnected
component of G. Deleting the subtree rooted at h from T, we continue with vertex g
(= $x_7$). Here f is the parent of g (in the tree T − h) and low(g) = 6 = dfi(f), so f is
another articulation point, with edge {g, f} the corresponding biconnected component.
Continuing now with the tree (T − h) − g, as we go from f to a to e, and then from c
to b, we find no new articulation points among the four vertices a, e, c, and b. Since vertex
d is the root of T and d has two children (namely, the vertices b and e), it then follows
from Lemma 12.5 that d is an articulation point of G. The vertices d, e, a, f induce the
biconnected component consisting of the tree edges {f, a}, {a, e}, {e, d} and the back edges
(relative to T, in G) {f, e} and {f, d}. Finally, the cycle induced (in G) by the vertices b, c,
and d provides the fourth biconnected component.
Part (v) of Fig. 12.43 shows the three articulation points g, f, and d, and the four
biconnected components of G.
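The four components found in this example can also be produced in a single depth-first pass by keeping a stack of edges. The sketch below uses that standard stack-based variant rather than the subtree-deletion procedure described above, and its `low` values follow the usual convention for that variant (each vertex starts at its own depth-first index and the edge to the parent is ignored), so they differ slightly from the low values of this section. The function name and the edge list, taken from the description in this example, are assumptions for illustration.

```python
def biconnected_components(adj, root):
    """Return the vertex sets of the biconnected components."""
    dfi, low = {}, {}
    edge_stack, components = [], []

    def visit(v, parent):
        dfi[v] = low[v] = len(dfi) + 1
        for w in sorted(adj[v]):  # alphabetical order, as in the example
            if w not in dfi:
                edge_stack.append((v, w))  # tree edge
                visit(w, v)
                low[v] = min(low[v], low[w])
                if low[w] >= dfi[v]:
                    # No back edge from the subtree at w reaches above v:
                    # the edges stacked since (v, w) form one component.
                    component = set()
                    while True:
                        x, y = edge_stack.pop()
                        component.update((x, y))
                        if (x, y) == (v, w):
                            break
                    components.append(component)
            elif w != parent and dfi[w] < dfi[v]:
                edge_stack.append((v, w))  # back edge to an ancestor
                low[v] = min(low[v], dfi[w])

    visit(root, None)
    return components


# Edge list as described in Example 12.20 (inferred from the text).
edges = [("d", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "a"),
         ("a", "f"), ("f", "e"), ("f", "d"), ("f", "g"), ("g", "h")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

found = sorted(sorted(c) for c in biconnected_components(adj, "d"))
print(found)
```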
EXERCISES 12.5
1. Find the articulation points and biconnected components
for the graph shown in Fig. 12.44.
[Figure 12.44]
2. Prove Lemma 12.3.
3. Let $T = (V, E)$ be a tree with $|V| = n \ge 3$.
a) What are the smallest and the largest numbers of articulation
points that T can have? Describe the trees for each
of these cases.
b) How many biconnected components does T have in
each of the cases in part (a)?
4. a) Let $T = (V, E)$ be a tree. If $v \in V$, prove that v is an
articulation point of T if and only if deg(v) > 1.
b) Let $G = (V, E)$ be a loop-free connected undirected
graph with $|E| \ge 1$. Prove that G has at least two vertices
that are not articulation points.
5. If $B_1, B_2, \ldots, B_k$ are the biconnected components of a
loop-free connected undirected graph G, how is $\chi(G)$ related
to $\chi(B_i)$, $1 \le i \le k$? [Recall that $\chi(G)$ denotes the chromatic
number of G, as defined in Section 11.6.]
6. Let $G = (V, E)$ be a loop-free connected undirected graph
with biconnected components $B_1, B_2, \ldots, B_8$. For $1 \le i \le 8$,
the number of distinct spanning trees for $B_i$ is $n_i$. How many
distinct spanning trees exist for G?
7. Let $G = (V, E)$ be a loop-free connected undirected graph
with $|V| \ge 3$. If G has no articulation points, prove that G has
no pendant vertices.
8. For the loop-free connected undirected graph G in
Fig. 12.43(i), order the vertices alphabetically.
a) Determine the depth-first spanning tree T for G with e
as the root.
b) Apply the algorithm developed in this section to the tree
T in part (a) to find the articulation points and biconnected
components of G.
9. Answer the questions posed in the previous exercise but
this time order the vertices as h, g, f, e, d, c, b, a and let c be
the root of T.
10. Let $G = (V, E)$ be a loop-free connected undirected graph,
where V = {a, b, c, ..., h, i, j}. Ordering the vertices alphabetically,
the depth-first spanning tree T for G, with a as the
root, is given in Fig. 12.45(i). In part (ii) of the figure the
ordered pair next to each vertex v provides (low'(v), low(v)).
Determine the articulation points and the spanning trees for the
biconnected components of G.
[Figure 12.45: parts (i) and (ii)]
11. In step (2) of the algorithm for articulation points, is it really
necessary to compute low($x_1$) and low($x_2$)?
12. Let $G = (V, E)$ be a loop-free connected undirected graph
with $v \in V$.
a) Prove that $\overline{G} - v = \overline{G - v}$.
b) If v is an articulation point of G, prove that v cannot be
an articulation point of $\overline{G}$.
13. If $G = (V, E)$ is a loop-free undirected graph, we call G
color-critical if $\chi(G - v) < \chi(G)$ for all $v \in V$. (We examined
such graphs earlier, in Exercise 19 of Section 11.6.) Prove that
a color-critical graph has no articulation points.
14. Does the result in Lemma 12.4 remain true if $T = (V, E')$
is a breadth-first spanning tree for $G = (V, E)$?
12.6
Summary and Historical Review
The structure now called a tree first appeared in 1847 in the work of Gustav Kirchhoff
(1824-1887) on electrical networks. The concept also appeared at this time in Geometrie
der Lage, by Karl von Staudt (1798-1867). In 1857 trees were rediscovered by Arthur
Cayley (1821-1895), who was unaware of these earlier developments. The first to call the
structure a “tree,” Cayley used it in applications dealing with chemical isomers. He also
investigated the enumeration of certain classes of trees. In his first work on trees, Cayley
enumerated unlabeled rooted trees. This was then followed by the enumeration of unlabeled
ordered trees. Two of Cayley’s contemporaries who also studied trees were Carl Borchardt
(1817-1880) and Marie Ennemond Jordan (1838-1922).
Arthur Cayley (1821-1895)
The formula $n^{n-2}$ for the number of labeled trees on n vertices (Exercise 21 at the end of
Section 12.1) was discovered in 1860 by Carl Borchardt. Cayley later gave an independent
development of the formula, in 1889. Since then, there have been other derivations. These
are surveyed in the book by J. W. Moon [10].
The paper by G. Polya [11] is a pioneering work on the enumeration of trees and other
combinatorial structures. Polya’s theory of enumeration, which we shall see in Chapter 16,
was developed in this work. For more on the enumeration of trees, the reader should see
Chapter 15 of F. Harary [5]. The article by D. R. Shier [12] provides a labyrinth of several
different techniques for calculating the number of spanning trees for $K_{2,n}$.
The high-speed digital computer has proved to be a constant impetus for the discovery of
new applications of trees. The first application of these structures was in the manipulation of
algebraic formulae. This dates back to 1951 in the work of Grace Murray Hopper. Since then,
computer applications of trees have been widely investigated. In the beginning, particular
results appeared only in the documentation of specific algorithms. The first general survey
of the applications of trees was made in 1961 by Kenneth Iverson as part of a broader
survey on data structures. Such ideas as preorder and postorder can be traced to the early
1960s, as evidenced in the work of Zdzislaw Pawlak, Lyle Johnson, and Kenneth Iverson.
At this time Kenneth Iverson also introduced the name and the notation, namely $\lceil x \rceil$, for
the ceiling of a real number x. Additional material on these orders and the procedures for
their implementation on a computer can be found in Chapter 3 of the text by A. V. Aho,
J. E. Hopcroft, and J. D. Ullman [1]. In the article by J. E. Atkins, J. S. Dierckman, and
K. O'Bryant [2], the notion of preorder is used to develop an optimal route for snow removal.
Rear Admiral Grace Murray Hopper (1906-1992) salutes as she and Navy Secretary
John Lehman leave the U.S.S Constitution.
AP/World Wide Photos
If $G = (V, E)$ is a loop-free undirected graph, then the depth-first search and the breadth-
first search (given in Section 12.2) provide ways to determine whether the given graph is
connected. The algorithms developed for these searching procedures are also important in
developing other algorithms. For example, the depth-first search arises in the algorithm
for finding the articulation points and biconnected components of a loop-free connected
undirected graph. If $|V| = n$ and $|E| = e$, then it can be shown that both the depth-first
search and the breadth-first search have time-complexity O(max{n, e}). For most graphs
$e \ge n$, so the algorithms are generally considered to have time-complexity O(e). These
ideas are developed in great detail in Chapter 7 of S. Baase and A. Van Gelder [3], where
the coverage also includes an analysis of the time-complexity function for the algorithm (of
Section 12.5) that determines articulation points (and biconnected components). Chapter 6
of the text by A. V. Aho, J. E. Hopcroft, and J. D. Ullman [1] also deals with the depth-first
search, whereas Chapter 7 covers the breadth-first search and the algorithm for articulation
points.
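As a small illustration of the connectivity test mentioned above, the following sketch runs a breadth-first search from one vertex and checks whether every vertex was reached. Each vertex is queued once and each adjacency list is scanned once, which is where the O(max{n, e}) bound comes from. The function name and both sample graphs are hypothetical.

```python
from collections import deque


def is_connected(adj):
    """Breadth-first search test for connectedness of an undirected graph."""
    if not adj:
        return True
    start = next(iter(adj))
    seen = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in adj[v]:  # each adjacency list is scanned exactly once
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(adj)


# Hypothetical examples: a path on three vertices, and the same
# vertices with one of them isolated.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
split = {"a": {"b"}, "b": {"a"}, "c": set()}
print(is_connected(path), is_connected(split))
```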
More on the properties and computer applications of trees is given in Section 3 of Chapter
2 in the work by D. E. Knuth [7]. Sorting techniques and their use of trees can be further
studied in Chapter 11 of A. V. Aho, J. E. Hopcroft, and J. D. Ullman [1] and in Chapter 7
of T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein [4]. An extensive investigation
will warrant the coverage found in the text by D. E. Knuth [8].
The technique in Section 12.4 for designing prefix codes is based on methods developed
by D. A. Huffman [6].
David A. Huffman
University of Florida, Department of Computer and Information Science and Engineering
Finally, Chapter 7 of C. L. Liu [9] deals with trees, cycles, cut-sets, and the vector spaces
associated with these ideas. The reader with a background in linear or abstract algebra
should find this material of interest.
REFERENCES
1. Aho, Alfred V., Hopcroft, John E., and Ullman, Jeffrey D. Data Structures and Algorithms.
Reading, Mass.: Addison-Wesley, 1983.
2. Atkins, Joel E., Dierckman, Jeffrey S., and O'Bryant, Kevin. "A Real Snow Job." The UMAP
Journal, Fall no. 3 (1990): pp. 231-239.
3. Baase, Sara, and Van Gelder, Allen. Computer Algorithms: Introduction to Design and Analysis,
3rd ed. Reading, Mass.: Addison-Wesley, 2000.
4. Cormen, Thomas H., Leiserson, Charles E., Rivest, Ronald L., and Stein, Clifford. Introduction
to Algorithms, 2nd ed. Boston, Mass.: McGraw-Hill, 2001.
5. Harary, Frank. Graph Theory. Reading, Mass.: Addison-Wesley, 1969.
6. Huffman, David A. "A Method for the Construction of Minimum Redundancy Codes." Proceedings
of the IRE 40 (1952): pp. 1098-1101.
7. Knuth, Donald E. The Art of Computer Programming, Vol. 1, 2nd ed. Reading, Mass.: Addison-Wesley,
1973.
8. Knuth, Donald E. The Art of Computer Programming, Vol. 3. Reading, Mass.: Addison-Wesley,
1973.
9. Liu, C. L. Introduction to Combinatorial Mathematics. New York: McGraw-Hill, 1968.
10. Moon, John Wesley. Counting Labelled Trees. Canadian Mathematical Congress, Montreal,
Canada, 1970.
11. Polya, George. "Kombinatorische Anzahlbestimmungen für Gruppen, Graphen und Chemische
Verbindungen." Acta Mathematica 68 (1937): pp. 145-234.
12. Shier, Douglas R. "Spanning Trees: Let Me Count the Ways." Mathematics Magazine 73
(2000): pp. 376-381.
SUPPLEMENTARY EXERCISES
1. Let $G = (V, E)$ be a loop-free undirected graph with $|V| = n$. Prove that $G$ is a tree if and
only if $P(G, \lambda) = \lambda(\lambda - 1)^{n-1}$.
2. A telephone communication system is set up at a company where 125 executives are employed.
The system is initialized by the president, who calls her four vice presidents. Each vice
president then calls four other executives, some of whom in turn call four others, and so on.
(Each executive who does make a call will actually make four calls.)
a) How many calls are made in reaching all 125 executives?
b) How many executives, aside from the president, are required to make calls?
3. Let $T$ be a complete binary tree with the vertices of $T$ ordered by a preorder traversal. This
traversal assigns the label 1 to all internal vertices of $T$ and the label 0 to each leaf. The
sequence of 0's and 1's that results from the preorder traversal of $T$ is called the tree's
characteristic sequence.
a) Find the characteristic sequence for the complete binary tree shown in Fig. 12.17.
b) Determine the complete binary trees for the characteristic sequences
i) 1011001010100 and
ii) 1011110000101011000.
c) What are the last two symbols in the characteristic sequence for all complete binary trees?
Why?
4. For $k \in \mathbf{Z}^+$, let $n = 2^k$, and consider the list $L$: $a_1, a_2, a_3, \ldots, a_n$. To sort $L$ in
ascending order, first compare the entries $a_i$ and $a_{i+(n/2)}$, for each $1 \le i \le n/2$. For the
resulting $2^{k-1}$ ordered pairs, merge sort the $i$th and $(i + (n/4))$-th ordered pairs, for each
$1 \le i \le n/4$. Now do a merge sort on the $i$th and $(i + (n/8))$-th ordered quadruples, for each
$1 \le i \le n/8$. Continue the process until the elements of $L$ are in ascending order.
a) Apply this sorting procedure to the list
L: 11, 3, 4, 6, -5, 7, 35, -2, 1, 23, 9, 15, 18, 2, -10, 5.
b) If $n = 2^k$, how many comparisons at most does this procedure require?
5. Let $G = (V, E)$ be a loop-free undirected graph. If $\deg(v) \ge 2$ for all $v \in V$, prove that $G$
contains a cycle.
6. Let $T = (V, E)$ be a rooted tree with root $r$. Define the relation $\mathcal{R}$ on $V$ by $x \,\mathcal{R}\, y$, for
$x, y \in V$, if $x = y$ or if $x$ is on the path from $r$ to $y$. Prove that $\mathcal{R}$ is a partial order.
7. Let $T = (V, E)$ be a tree with $V = \{v_1, v_2, \ldots, v_n\}$, for $n \ge 2$. Prove that the number of
pendant vertices in $T$ is equal to
$2 + \sum_{\deg(v_i) \ge 3} (\deg(v_i) - 2)$.
8. Let $G = (V, E)$ be a loop-free undirected graph. Define the relation $\mathcal{R}$ on $E$ as follows: If
$e_1, e_2 \in E$, then $e_1 \,\mathcal{R}\, e_2$ if $e_1 = e_2$ or if $e_1$ and $e_2$ are edges of a cycle $C$ in $G$.
a) Verify that $\mathcal{R}$ is an equivalence relation on $E$.
b) Describe the partition of $E$ induced by $\mathcal{R}$.
Figure 12.46 (parts (a), (b), and (c))
9. If $G = (V, E)$ is a loop-free connected undirected graph and $a, b \in V$, then we define the
distance from $a$ to $b$ (or from $b$ to $a$), denoted $d(a, b)$, as the length of a shortest path (in $G$)
connecting $a$ and $b$. (This is the number of edges in a shortest path connecting $a$ and $b$ and is
0 when $a = b$.)
For any loop-free connected undirected graph $G = (V, E)$, the square of $G$, denoted $G^2$, is the
graph with vertex set $V$ (the same as $G$) and edge set defined as follows: For distinct
$a, b \in V$, $\{a, b\}$ is an edge in $G^2$ if $d(a, b) \le 2$ (in $G$). In parts (a) and (b) of Fig. 12.46, we
have a graph $G$ and its square.
a) Find the square of the graph in part (c) of the figure.
b) Find $G^2$ if $G$ is the graph $K_{1,3}$.
c) If $G$ is the graph $K_{1,n}$, for $n \ge 4$, how many edges are added to $G$ in order to construct $G^2$?
d) For any loop-free connected undirected graph $G$, prove that $G^2$ has no articulation points.
10. a) Let $T = (V, E)$ be a complete 6-ary tree of height 8. If $T$ is balanced, but not full,
determine the minimum and maximum values for $|V|$.
b) Answer part (a) if $T = (V, E)$ is a complete $m$-ary tree of height $h$.
11. The rooted Fibonacci trees $T_n$, $n \ge 1$, are defined recursively as follows:
1) $T_1$ is the rooted tree consisting of only the root;
2) $T_2$ is the same as $T_1$: it too is a rooted tree that consists of a single vertex; and
3) For $n \ge 3$, $T_n$ is the rooted binary tree with $T_{n-1}$ as its left subtree and $T_{n-2}$ as its
right subtree.
The first six rooted Fibonacci trees are shown in Fig. 12.47.
a) For $n \ge 1$, let $\ell_n$ count the number of leaves in $T_n$. Find and solve a recurrence relation
for $\ell_n$.
b) Let $i_n$ count the number of internal vertices for the tree $T_n$, where $n \ge 1$. Find and solve a
recurrence relation for $i_n$.
c) Determine a formula for $v_n$, the total number of vertices in $T_n$, where $n \ge 1$.
Figure 12.47
12. a) The graph in part (a) of Fig. 12.48 has exactly one spanning tree, namely the graph itself.
The graph in Fig. 12.48(b) has four nonidentical, though isomorphic, spanning trees. In part
(c) of the figure we find three of the nonidentical spanning trees for the graph in part (d).
Note that $T_2$ and $T_3$ are isomorphic, but $T_1$ is not isomorphic to $T_2$ (or $T_3$). How many
nonidentical spanning trees exist for the graph in Fig. 12.48(d)?
b) In Fig. 12.48(e) we generalize the graphs in parts (a), (b), and (d) of the figure. For each
$n \in \mathbf{Z}^+$, the graph $G_n$ is $K_{2,n}$. If $t_n$ counts the number of nonidentical spanning trees
for $G_n$, find and solve a recurrence relation for $t_n$.
13. Let $G = (V, E)$ be the undirected connected "ladder graph" shown in Fig. 12.49. For $n \ge 0$,
let $a_n$ count the number of spanning trees of $G$, whereas $b_n$ counts the number of these spanning
trees that contain the edge $\{x_1, y_1\}$.
a) Explain why $a_n = a_{n-1} + b_n$.
b) Find an equation that expresses $b_n$ in terms of $a_{n-1}$ and $b_{n-1}$.
Figure 12.48 (parts (a) through (e); part (c) shows the spanning trees $T_1$, $T_2$, $T_3$)
c) Use the results in parts (a) and (b) to set up and solve a recurrence relation for $a_n$.
Figure 12.49
14. Let $T = (V, E)$ be a tree where $|V| = v$ and $|E| = e$. The tree $T$ is called graceful if it is
possible to assign the labels $\{1, 2, 3, \ldots, v\}$ to the vertices of $T$ in such a manner that the
induced edge labeling (where each edge $\{i, j\}$ is assigned the label $|i - j|$, for
$i, j \in \{1, 2, 3, \ldots, v\}$, $i \ne j$) results in the $e$ edges being labeled by $1, 2, 3, \ldots, e$.
a) Prove that every path on $n$ vertices, $n \ge 2$, is graceful.
b) For $n \in \mathbf{Z}^+$, $n \ge 2$, show that $K_{1,n}$ is graceful.
c) If $T = (V, E)$ is a tree with $4 \le |V| \le 6$, show that $T$ is graceful. (It has been conjectured
that every tree is graceful.)
15. For an undirected graph $G = (V, E)$ a subset $I$ of $V$ is called independent when no two
vertices in $I$ are adjacent. If, in addition, $I \cup \{x\}$ is not independent for each $x \in V - I$, then
we say that $I$ is a maximal independent set (of vertices).
The two graphs in Fig. 12.50 are examples of special kinds of trees called caterpillars. In
general, a tree $T = (V, E)$ is a caterpillar when there is a (maximal) path $p$ such that, for all
$v \in V$, either $v$ is on the path $p$ or $v$ is adjacent to a vertex on the path $p$. This path $p$ is
called the spine of the caterpillar.
a) How many maximal independent sets of vertices are there for each of the caterpillars in
parts (i) and (ii) of Fig. 12.50?
b) For $n \in \mathbf{Z}^+$, with $n \ge 3$, let $a_n$ count the number of maximal independent sets in a
caterpillar $T$ whose spine contains $n$ vertices. Find and solve a recurrence relation for $a_n$.
[The reader may wish to reexamine part (a) of Supplementary Exercise 21 in Chapter 11.]
Figure 12.50: (i) Spine = $\{v_1, v_2, v_3, v_4\}$; (ii) Spine = $\{w_1, w_2, w_3, w_4, w_5\}$
16. In part (i) of Fig. 12.51 we find a graceful labeling of the caterpillar shown in part (i) of
Fig. 12.50. Find a graceful labeling for the caterpillars in part (ii) of Figs. 12.50 and 12.51.
Figure 12.51 (parts (i) and (ii))
17. Develop an algorithm to gracefully label the vertices of a caterpillar with at least two edges.
18. Consider the caterpillar in part (i) of Fig. 12.50. If we label each edge of the spine with a 1
and each of the other edges with a 0, the caterpillar can be represented by a binary string.
Here that binary string is 10001001 where the first 1 is for the first (left-most) edge of the
spine, the next three 0's are for the (nonspine) edges at $v_2$, the second 1 is for edge
$\{v_2, v_3\}$, the two 0's are for the (nonspine) leaves at $v_3$, and the final 1 accounts for the
third (right-most) edge of the spine.
We also note that the reversal of the binary string 10001001 (namely, 10010001) corresponds
with a second caterpillar that is isomorphic to the one in part (i) of Fig. 12.50.
a) Find the binary strings for each of the caterpillars in part (ii) of Figs. 12.50 and 12.51.
b) Can a caterpillar have a binary string of all 1's?
c) Can the binary string for a caterpillar have only two 1's?
d) Draw all the nonisomorphic caterpillars on five vertices. For each caterpillar determine its
binary string. How many of these binary strings are palindromes?
e) Answer the question posed in part (d) upon replacing "five" by "six."
f) For $n \ge 3$, prove that the number of nonisomorphic caterpillars on $n$ vertices is
$(1/2)(2^{n-3} + 2^{\lceil (n-3)/2 \rceil}) = 2^{n-4} + 2^{\lceil (n-3)/2 \rceil - 1} = 2^{n-4} + 2^{\lfloor n/2 \rfloor - 2}$.
(This was first established in 1973 by F. Harary and A. J. Schwenk.)
19. For $n \ge 0$, we want to count the number of ordered rooted trees on $n + 1$ vertices. The five
trees in Fig. 12.52(a) cover the case for $n = 3$.
[Note: Although the two trees in Fig. 12.52(b) are distinct as binary rooted trees, as ordered
rooted trees they are considered the same tree and each is accounted for by the fourth tree in
Fig. 12.52(a).]
a) Performing a postorder traversal of each tree in Fig. 12.52(a), we traverse each edge twice:
once going down and once coming back up. When we traverse an edge going down, we shall
write "1" and when we traverse one coming back up, we shall write "-1." Hence the postorder
traversal for the first tree in Fig. 12.52(a) generates the list 1, 1, 1, -1, -1, -1. The list
1, 1, -1, -1, 1, -1 arises for the second tree in part (a) of the figure. Find the corresponding
lists for the other three trees in Fig. 12.52(a).
b) Determine the ordered rooted trees on five vertices that generate the lists:
(i) 1, -1, 1, 1, -1, 1, -1, -1; (ii) 1, 1, -1, -1, 1, 1, -1, -1; and
(iii) 1, -1, 1, -1, 1, 1, -1, -1. How many such trees are there on five vertices?
c) For $n \ge 0$, how many ordered rooted trees are there for $n + 1$ vertices?
Figure 12.52 (parts (a) and (b))
20. For $n \ge 1$, let $t_n$ count the number of spanning trees for the fan on $n + 1$ vertices. The fan
for $n = 4$ is shown in Fig. 12.53.
a) Show that $t_{n+1} = t_n + \sum_{i=0}^{n} t_i$, where $n \ge 1$ and $t_0 = 1$.
b) For $n \ge 2$, show that $t_{n+1} = 3t_n - t_{n-1}$.
c) Solve the recurrence relation in part (b) and show that for $n \ge 1$, $t_n = F_{2n}$, the $2n$th
Fibonacci number.
Figure 12.53
21. a) Consider the subgraph of $G$ (in Fig. 12.54) induced by the vertices $a, b, c, d$. This graph
is called a kite. How many nonidentical (though some may be isomorphic) spanning trees are
there for this kite?
b) How many nonidentical (though some may be isomorphic) spanning trees of $G$ do not contain
the edge $\{c, h\}$?
c) How many nonidentical (though some may be isomorphic) spanning trees of $G$ contain all four
of the edges $\{c, h\}$, $\{g, k\}$, $\{l, p\}$, and $\{d, o\}$?
d) How many nonidentical (though some may be isomorphic) spanning trees exist for $G$?
e) We generalize the graph $G$ as follows. For $n \ge 2$, start with a cycle on the $2n$ vertices
$v_1, v_2, \ldots, v_{2n-1}, v_{2n}$. Replace each of the $n$ edges $\{v_1, v_2\}$, $\{v_3, v_4\}$, $\ldots$,
$\{v_{2n-1}, v_{2n}\}$ with a (labeled) kite so that the resulting graph is 3-regular. (The case for
$n = 4$ appears in Fig. 12.54.) How many nonidentical (though some may be isomorphic) spanning
trees are there for this graph?
Figure 12.54
13
Optimization
and Matching
Using the structures of trees and graphs, the final chapter for this part of the text
introduces techniques that arise in the area of mathematics called operations research.
These optimization techniques can be applied to graphs and multigraphs that have a pos-
itive real number (in Sections 13.1 and 13.2) or a nonnegative integer (in Section 13.3),
called a weight, associated with each edge of the graph or multigraph. These numbers relate
information such as the distance between the vertices that are the endpoints of the edge, or
perhaps the amount of material that can be shipped from one vertex to another along an edge
that represents a highway or air route. With the graphs providing the framework, the opti-
mization methods are developed in an algorithmic manner to facilitate their implementation
on a computer. Among the problems we analyze are the determinations of:
1) The shortest distance between a designated vertex vg and each of the other vertices
in a loop-free connected directed graph.
2) A spanning tree for a given graph or multigraph, where the sum of the weights of the
edges in the tree is minimal.
3) The maximum amount of material that can be transported from a starting point (the
source) to a terminating point (the sink), where the weight of an edge indicates its
capacity for handling the material being transported.
13.1
Dijkstra’s Shortest-Path Algorithm
We start with a loop-free connected directed graph $G = (V, E)$. Now to each edge $e = (a, b)$
of this graph, we assign a positive real number called the weight of $e$. This is denoted by
$\mathrm{wt}(e)$, or $\mathrm{wt}(a, b)$. If $x, y \in V$ but $(x, y) \notin E$, we define $\mathrm{wt}(x, y) = \infty$.
For each $e = (a, b) \in E$, $\mathrm{wt}(e)$ may represent (1) the length of a road from $a$ to $b$, (2) the
time it takes to travel on this road from $a$ to $b$, or (3) the cost of traveling from $a$ to $b$ on
this road.
Whenever such a graph $G = (V, E)$ is given with the weight assignments described
here, the graph is referred to as a weighted graph.
EXAMPLE 13.1
In Fig. 13.1 the weighted graph $G = (V, E)$ represents travel routes between certain pairs
of cities. Here the weight of each edge $(x, y)$ indicates the approximate flying time for a
direct flight from city $x$ to city $y$.
Figure 13.1
In this directed graph there are situations where $\mathrm{wt}(x, y) \ne \mathrm{wt}(y, x)$ for certain edges
$(x, y)$ and $(y, x)$ in $G$. For example, $\mathrm{wt}(c, f) = 6 \ne 7 = \mathrm{wt}(f, c)$. Perhaps this is due to
tailwinds. As a plane flies from $c$ to $f$, the plane may be assisted by tailwinds that, in turn,
slow it down when it is flying in the opposite direction (from $f$ to $c$).
We see that $c, g \in V$ but $(c, g), (g, c) \notin E$, so $\mathrm{wt}(g, c) = \mathrm{wt}(c, g) = \infty$. This is also
true for other pairs of vertices. On the other hand, for certain pairs of vertices such as $a, f$,
we have $\mathrm{wt}(a, f) = \infty$ whereas $\mathrm{wt}(f, a) = 11$, a finite number.
Our objective in this section has two parts. Given a weighted graph $G = (V, E)$, for
each $e = (x, y) \in E$, we shall interpret $\mathrm{wt}(e)$ as the length of a direct route (whether by
automobile, plane, or boat) from $x$ to $y$. For $a, b \in V$, suppose that $v_1, v_2, \ldots, v_n \in V$
and that the edges $(a, v_1), (v_1, v_2), \ldots, (v_n, b)$ provide a directed path (in $G$) from $a$
to $b$. The length of this path is defined as $\mathrm{wt}(a, v_1) + \mathrm{wt}(v_1, v_2) + \cdots + \mathrm{wt}(v_n, b)$. We
write $d(a, b)$ for the (shortest) distance from $a$ to $b$, that is, the length of a shortest
directed path (in $G$) from $a$ to $b$. If no such path exists (in $G$) from $a$ to $b$, then we define
$d(a, b) = \infty$. And for all $a \in V$, $d(a, a) = 0$. Consequently, we have the distance function
$d: V \times V \to \mathbf{R}^+ \cup \{0, \infty\}$.
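The definition of path length can be sketched in a few lines of Python. This is only an illustration: the dictionary `wt` and the function name `path_length` are hypothetical, and the three weights are just those quoted in the text for Fig. 13.1 (the figure has further edges that are not listed here). A missing edge is treated as having weight infinity, as in the definition of wt.

```python
import math

# Weights quoted in the text for Fig. 13.1 (partial; other edges omitted).
wt = {("c", "f"): 6, ("f", "c"): 7, ("f", "a"): 11}

def path_length(wt, path):
    """Sum wt over consecutive vertex pairs; a missing edge counts as infinity."""
    return sum(wt.get((x, y), math.inf) for x, y in zip(path, path[1:]))

print(path_length(wt, ["c", "f", "a"]))  # wt(c, f) + wt(f, a)
print(path_length(wt, ["a", "f"]))       # inf, since (a, f) is not an edge
```

A one-vertex "path" has length 0 under this sketch, which agrees with the convention d(a, a) = 0.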
Now fix $v_0 \in V$. Then for all $v \in V$, we shall determine
1) $d(v_0, v)$; and
2) a directed path from $v_0$ to $v$ [of length $d(v_0, v)$] if $d(v_0, v)$ is finite.
To accomplish these objectives, we shall introduce a version of the algorithm that was
developed by Edsger Wybe Dijkstra (1930-2002) in 1959. This procedure is an example of
a greedy algorithm, for what we do to obtain the best result locally (for vertices "close" to
$v_0$) turns out to be the best result globally (for all vertices of the graph).
Before we state the algorithm, we wish to examine some properties of the distance
function $d$. These properties will help us understand why the algorithm works.
With $v_0 \in V$ fixed (as it was earlier), let $S \subset V$ with $v_0 \in S$, and $\bar{S} = V - S$. Then we
define the distance from $v_0$ to $\bar{S}$ by
$d(v_0, \bar{S}) = \min_{v \in \bar{S}} \{d(v_0, v)\}$.
When $d(v_0, \bar{S}) < \infty$, then $d(v_0, \bar{S})$ is the length of a shortest directed path from $v_0$ to a
vertex in $\bar{S}$. In this case there will exist at least one vertex $v_{m+1}$ in $\bar{S}$ with
$d(v_0, \bar{S}) = d(v_0, v_{m+1})$. Here $P$: $(v_0, v_1), (v_1, v_2), \ldots, (v_{m-1}, v_m), (v_m, v_{m+1})$ is a
shortest directed path (in $G$) from $v_0$ to $v_{m+1}$. So, at this point, we claim that
1) $v_0, v_1, v_2, \ldots, v_m \in S$; and
2) $P'$: $(v_0, v_1), (v_1, v_2), \ldots, (v_{k-1}, v_k)$ is a shortest directed path (in $G$) from $v_0$ to
$v_k$, for each $1 \le k \le m$.
(The proofs for these two results are requested in the first exercise at the end of this section.)
From these observations it follows that
$d(v_0, \bar{S}) = \min\{d(v_0, u) + \mathrm{wt}(u, w)\}$,
where the minimum is evaluated over all $u \in S$, $w \in \bar{S}$. If a minimum occurs for $u = x$ and
$w = y$, then
$d(v_0, y) = d(v_0, x) + \mathrm{wt}(x, y)$
is the (shortest) distance from $v_0$ to $y$.
The formula for $d(v_0, \bar{S})$ is the cornerstone of the algorithm. We start with the set
$S_0 = \{v_0\}$ and then determine
$d(v_0, \bar{S}_0) = \min_{u \in S_0,\, w \in \bar{S}_0} \{d(v_0, u) + \mathrm{wt}(u, w)\}$.
This gives us $d(v_0, \bar{S}_0) = \min_{w \in \bar{S}_0} \{\mathrm{wt}(v_0, w)\}$ since $S_0 = \{v_0\}$ and $d(v_0, v_0) = 0$.
If $v_1 \in \bar{S}_0$ and $d(v_0, \bar{S}_0) = \mathrm{wt}(v_0, v_1)$, then we enlarge $S_0$ to $S_1 = S_0 \cup \{v_1\}$ and determine
$d(v_0, \bar{S}_1) = \min_{u \in S_1,\, w \in \bar{S}_1} \{d(v_0, u) + \mathrm{wt}(u, w)\}$.
This leads us to a vertex $v_2$ in $\bar{S}_1$ with $d(v_0, \bar{S}_1) = d(v_0, v_2)$. Continuing the process, if
$S_i = \{v_0, v_1, v_2, \ldots, v_i\}$ has been determined and $v_{i+1} \in \bar{S}_i$ with
$d(v_0, v_{i+1}) = d(v_0, \bar{S}_i)$, then we enlarge $S_i$ to $S_{i+1} = S_i \cup \{v_{i+1}\}$. We stop when we
reach $\bar{S}_{n-1} = \emptyset$ (where $n = |V|$) or when $d(v_0, \bar{S}_i) = \infty$ for some $0 \le i \le n - 2$.
Throughout this process, various labels will be placed on each vertex $v \in V$. The final
set of labels appearing on the vertices will have the form $(L(v), u)$, where $L(v) = d(v_0, v)$,
the distance from $v_0$ to $v$, and $u$ is the vertex (if one exists) that precedes $v$ along a shortest
path from $v_0$ to $v$. That is, $(u, v)$ is the last edge in a directed path from $v_0$ to $v$, and this
path determines $d(v_0, v)$. At first we label $v_0$ with $(0, -)$ and all of the other vertices $v$
with the label $(\infty, -)$. As we apply the algorithm, the label on each $v \ne v_0$ will change
(sometimes more than once) from $(\infty, -)$ to the final label $(L(v), u) = (d(v_0, v), u)$, unless
$d(v_0, v) = \infty$.
Now that these preliminaries are behind us, it is time to formally state the algorithm.
Let G = (V, E) be a weighted graph, with |V| = n. To find the shortest distance from a
fixed vertex vo to all other vertices in G, as well as a shortest directed path for each of these
vertices, we apply the following algorithm.
Dijkstra’s Shortest-Path Algorithm
Step 1: Set the counter $i = 0$ and $S_0 = \{v_0\}$. Label $v_0$ with $(0, -)$ and each $v \ne v_0$
with $(\infty, -)$.
If $n = 1$, then $V = \{v_0\}$ and the problem is solved.
If $n > 1$, continue to step (2).
Step 2: For each $v \in \bar{S}_i$ replace, when possible, the label on $v$ by the new label
$(L(v), y)$ where
$L(v) = \min_{u \in S_i} \{L(v),\, L(u) + \mathrm{wt}(u, v)\}$,
and $y$ is a vertex in $S_i$ that produces the minimum $L(v)$. [When a replacement does
take place, it is due to the fact that we can go from $v_0$ to $v$ and travel a shorter distance
by going along a path that includes the edge $(y, v)$.]
Step 3: If every vertex in $\bar{S}_i$ (for some $0 \le i \le n - 2$) has the label $(\infty, -)$, then the
labeled graph contains the information we are seeking.
If not, then there is at least one vertex $v \in \bar{S}_i$ that is not labeled by $(\infty, -)$, and
we perform the following tasks:
1) Select a vertex $v_{i+1}$ where $L(v_{i+1})$ is a minimum (for all such $v$).
There may be more than one such vertex, in which case we are free to
choose among the possible candidates. The vertex $v_{i+1}$ is an element
of $\bar{S}_i$ that is closest to $v_0$.
2) Assign $S_i \cup \{v_{i+1}\}$ to $S_{i+1}$.
3) Increase the counter $i$ by 1.
If $i = n - 1$, the labeled graph contains the information we want.
If $i < n - 1$, return to step (2).
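The three steps can be sketched directly in Python. This is a minimal sketch rather than a definitive implementation: the function name `dijkstra_labels` and the dictionary representation of wt are assumptions made for illustration, and the sample weights are only those edges of Fig. 13.1 whose weights are quoted in the text (the figure contains other edges not listed here). A label `(L(v), u)` is stored as a pair, with `None` standing for the dash.

```python
import math

def dijkstra_labels(wt, vertices, v0):
    """Literal version of steps 1-3; returns {v: (L(v), predecessor)}."""
    labels = {v: (math.inf, None) for v in vertices}  # Step 1: (oo, -)
    labels[v0] = (0, None)
    S = [v0]                                          # S_i, in order of selection
    outside = set(vertices) - {v0}                    # complement of S_i
    while outside:
        # Step 2: relabel each v outside S_i using every u in S_i.
        for v in outside:
            for u in S:
                cand = labels[u][0] + wt.get((u, v), math.inf)
                if cand < labels[v][0]:
                    labels[v] = (cand, u)
        # Step 3: choose a closest vertex outside S_i, or stop if none is reachable.
        nxt = min(sorted(outside), key=lambda v: labels[v][0])
        if labels[nxt][0] == math.inf:
            break
        S.append(nxt)
        outside.remove(nxt)
    return labels

# Edge weights of Fig. 13.1 that are quoted in the text (partial list).
wt = {("c", "f"): 6, ("f", "c"): 7, ("c", "h"): 11, ("f", "a"): 11, ("f", "g"): 9,
      ("f", "h"): 4, ("h", "g"): 4, ("h", "a"): 11, ("a", "b"): 5}
labels = dijkstra_labels(wt, ["a", "b", "c", "f", "g", "h"], "c")
print(labels["b"])  # (22, 'a')
```

The nested loops in step (2) deliberately repeat work on every pass, which mirrors the statement of the algorithm and is the source of its cubic worst-case bound.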
We now apply this algorithm in the following example.
EXAMPLE 13.2
Apply Dijkstra's algorithm to the weighted graph $G = (V, E)$ shown in Fig. 13.1 in order
to find the shortest distance from vertex $c$ ($= v_0$) to each of the other five vertices in $G$.
Initialization: ($i = 0$). Set $S_0 = \{c\}$. Label $c$ with $(0, -)$ and all other vertices
in $G$ with $(\infty, -)$.
First Iteration: ($\bar{S}_0 = \{a, b, f, g, h\}$). Here $i = 0$ in step (2) and we find, for
example, that
$L(a) = \min\{L(a),\, L(c) + \mathrm{wt}(c, a)\} = \min\{\infty,\, 0 + \infty\} = \infty$,
whereas
$L(f) = \min\{L(f),\, L(c) + \mathrm{wt}(c, f)\} = \min\{\infty,\, 0 + 6\} = 6$.
Similar calculations yield $L(b) = L(g) = \infty$ and $L(h) = 11$. So
we label the vertex $f$ with $(6, c)$ and the vertex $h$ with $(11, c)$. The
other vertices in $\bar{S}_0$ remain labeled by $(\infty, -)$. [See Fig. 13.2(a).]
In step (3) we see that $f$ is the vertex $v_1$ in $\bar{S}_0$ closest to $v_0$, so we
assign to $S_1$ the set $S_0 \cup \{f\} = \{c, f\}$ and increase the counter $i$
to 1. Since $i = 1 < 5$ ($= 6 - 1$), we return to step (2).
Second Iteration: ($\bar{S}_1 = \{a, b, g, h\}$). Now $i = 1$ in step (2), and for each $v \in \bar{S}_1$
we set
$L(v) = \min_{u \in S_1} \{L(v),\, L(u) + \mathrm{wt}(u, v)\}$.
This yields
$L(a) = \min\{L(a),\, L(c) + \mathrm{wt}(c, a),\, L(f) + \mathrm{wt}(f, a)\} = \min\{\infty,\, 0 + \infty,\, 6 + 11\} = 17$,
so vertex $a$ is labeled $(17, f)$. In a similar manner, we find
$L(b) = \min\{\infty,\, 0 + \infty,\, 6 + \infty\} = \infty$,
$L(g) = \min\{\infty,\, 0 + \infty,\, 6 + 9\} = 15$,
$L(h) = \min\{11,\, 0 + 11,\, 6 + 4\} = 10$.
[These results provide the labeling in Fig. 13.2(b).] In step (3) we
find that the vertex $v_2$ is $h$ because $h \in \bar{S}_1$ and $L(h)$ is a minimum.
Then $S_2$ is assigned $S_1 \cup \{h\} = \{c, f, h\}$, the counter is increased
to 2, and since $2 < 5$, the algorithm directs us back to step (2).
Figure 13.2
Third Iteration: ($\bar{S}_2 = \{a, b, g\}$). With $i = 2$ in step (2) the following are now
computed:
$L(a) = \min_{u \in S_2} \{L(a),\, L(u) + \mathrm{wt}(u, a)\} = \min\{17,\, 0 + \infty,\, 6 + 11,\, 10 + 11\} = 17$
(so the label on $a$ is not changed);
$L(b) = \min\{\infty,\, 0 + \infty,\, 6 + \infty,\, 10 + \infty\} = \infty$
(so the label on $b$ remains $\infty$); and
$L(g) = \min\{15,\, 0 + \infty,\, 6 + 9,\, 10 + 4\} = 14 < 15$,
so the label on $g$ is changed to $(14, h)$ because $14 = L(h) + \mathrm{wt}(h, g)$.
Among the vertices in $\bar{S}_2$, $g$ is the closest to $v_0$ since
$L(g)$ is a minimum. In step (3), vertex $v_3$ is defined as $g$ and
$S_3 = S_2 \cup \{g\} = \{c, f, h, g\}$. Then the counter $i$ is increased to
$3 < 5$, and we return to step (2).
Fourth Iteration: ($\bar{S}_3 = \{a, b\}$). With $i = 3$, the following are determined in step
(2): $L(a) = 17$; $L(b) = \infty$. (Thus no labels are changed during
this iteration.) We set $v_4 = a$ and $S_4 = S_3 \cup \{a\} = \{c, f, h, g, a\}$
in step (3). Then the counter $i$ is increased to 4 ($< 5$), and we
return to step (2).
Fifth Iteration: ($\bar{S}_4 = \{b\}$). Here $i = 4$ in step (2), and we find
$L(b) = L(a) + \mathrm{wt}(a, b) = 17 + 5 = 22$. Now the label on $b$ is changed to $(22, a)$.
Then $v_5 = b$ in step (3), $S_5$ is set to $\{c, f, h, g, a, b\}$, and $i$ is
incremented to 5. But now that $i = 5 = |V| - 1$, the process terminates.
We reach the labeled graph shown in Fig. 13.3.
Figure 13.3
From the labels in Fig. 13.3 we have the following shortest distances from c to the other
five vertices in G:
1) $d(c, f) = L(f) = 6$.
2) $d(c, h) = L(h) = 10$.
3) $d(c, g) = L(g) = 14$.
4) $d(c, a) = L(a) = 17$.
5) $d(c, b) = L(b) = 22$.
To determine, for example, a shortest directed path from c to b, we start at vertex b,
which is labeled (22, a). Hence a is the predecessor of b on this shortest path. The label on
a is (17, f), so f precedes a on the path. Finally, the label on f is (6, c), so we are back
at vertex c, and the shortest directed path from c to b determined by the algorithm is given
by the edges (c, f), (f, a), and (a, b).
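The backtracking just described, reading each predecessor from a label until the start vertex is reached, might be sketched as follows. The function name `shortest_path` is hypothetical, and the `labels` dictionary simply transcribes the final labels computed in Example 13.2, with `None` standing for the dash.

```python
import math

def shortest_path(labels, v0, target):
    """Follow predecessor labels from target back to v0; None if target is unreachable."""
    if labels[target][0] == math.inf:
        return None
    path = [target]
    while path[-1] != v0:
        path.append(labels[path[-1]][1])  # step to the predecessor in the label
    path.reverse()
    return path

# Final labels (L(v), predecessor) from Example 13.2, with v0 = c.
labels = {"c": (0, None), "f": (6, "c"), "h": (10, "f"),
          "g": (14, "h"), "a": (17, "f"), "b": (22, "a")}
print(shortest_path(labels, "c", "b"))  # ['c', 'f', 'a', 'b']
```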
Now that we have demonstrated one application of this algorithm, our next concern is
the order of its worst-case time-complexity function $f(n)$, where $n = |V|$ in the weighted
graph $G = (V, E)$. We shall estimate the worst-case complexity in terms of the number
of additions and comparisons that are made in steps (2) and (3) during execution of the
algorithm.
Following the initialization process in step (1), there are at most $n - 1$ iterations because
each iteration determines the next closest vertex to $v_0$ and $n - 1 = |V - \{v_0\}|$.
If $0 \le i \le n - 2$, then in step (2) for that iteration [the $(i + 1)$st], we find that the
following takes place for each $v \in \bar{S}_i$:
1) When $0 \le i \le n - 2$, we perform at most $n - 1$ additions to calculate
$L(v) = \min_{u \in S_i} \{L(v),\, L(u) + \mathrm{wt}(u, v)\}$,
one addition for each $u \in S_i$.
2) We compare the present value of $L(v)$ with each of the (possibly infinite) numbers
$L(u) + \mathrm{wt}(u, v)$, one for each $u \in S_i$, where $|S_i| \le n - 1$, in order to determine
the updated value of $L(v)$. This requires at most $n - 1$ comparisons. Therefore, before
we get to step (3) we have performed at most $2(n - 1)$ steps for each $v \in \bar{S}_i$, a total
of at most $2(n - 1)^2$ steps for all $v \in \bar{S}_i$.
Continuing to step (3), we now must select the minimum from among at most
$n - 1$ numbers $L(v)$, where $v \in \bar{S}_i$. This requires $n - 2$ additional comparisons in
the worst case.
Consequently, each iteration needs no more than $2(n - 1)^2 + (n - 2)$ steps in all.
It is possible to have as many as $n - 1$ iterations, so it follows that
$f(n) \le (n - 1)[2(n - 1)^2 + (n - 2)] \in O(n^3)$.
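To see why this bound is cubic, the product can be multiplied out. The expansion below is a routine check, not part of the original argument:

```latex
\begin{align*}
(n-1)\bigl[2(n-1)^2 + (n-2)\bigr]
  &= 2(n-1)^3 + (n-1)(n-2) \\
  &= (2n^3 - 6n^2 + 6n - 2) + (n^2 - 3n + 2) \\
  &= 2n^3 - 5n^2 + 3n \;\le\; 2n^3 \quad \text{for all } n \ge 1 .
\end{align*}
```

The last inequality holds because $-5n^2 + 3n \le 0$ whenever $n \ge 1$.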
We shall close this section with some observations that can be used to improve the worst-case
time-complexity of this algorithm. First we should observe that for $0 \le i \le n - 2$, the
$(i + 1)$st iteration of our present algorithm generated the $(i + 1)$st closest vertex to $v_0$. This
was the vertex $v_{i+1}$. In our example we found $v_1 = f$, $v_2 = h$, $v_3 = g$, $v_4 = a$, and $v_5 = b$.
Second, note how much duplication we had when computing $L(v)$. This is seen quite
readily in the second and third iterations of Example 13.2. We should like to cut back on
such unnecessary calculations, so let us try a slightly different approach to our shortest-path
problem. Once again we start with a weighted graph $G = (V, E)$ with $|V| = n$ and
$v_0 \in V$. We shall now let $v_i$ denote the $i$th closest vertex to $v_0$, where $0 \le i \le n - 1$,
$S_i = \{v_0, v_1, \ldots, v_i\}$, and $\bar{S}_i = V - S_i$. At the start we assign to each $v \in V$ the number
$L_0(v)$ as follows:
$L_0(v_0) = 0$ because $d(v_0, v_0) = 0$, and
$L_0(v) = \infty$, for $v \ne v_0$.
Then for $i \ge 0$ and $v \in \bar{S}_i$, we define
$L_{i+1}(v) = \min\{L_i(v),\, L_i(v_i) + \mathrm{wt}(v_i, v)\}$,
where $v_i$ is a vertex for which $L_i(v_i)$ is minimal: a vertex that is $i$th closest to $v_0$. We find
that
$L_{i+1}(v) = \min_{0 \le j \le i} \{d(v_0, v_j) + \mathrm{wt}(v_j, v)\}$.
Now let us see what happens at each of the (at most) $n - 1$ iterations when we employ
the definition of $L_{i+1}(v)$ that uses the vertex $v_i$.
For each $v \in \bar{S}_i$ we need only one addition [namely, $L_i(v_i) + \mathrm{wt}(v_i, v)$] and one comparison
[between $L_i(v)$ and $L_i(v_i) + \mathrm{wt}(v_i, v)$] in order to compute $L_{i+1}(v)$. Since there are at
most $n - 1$ vertices in $\bar{S}_i$, this necessitates at most $2(n - 1)$ steps to obtain $L_{i+1}(v)$ for all
$v \in \bar{S}_i$. Finding the minimum of $\{L_{i+1}(v) \mid v \in \bar{S}_i\}$ requires at most $n - 2$ comparisons, so
at each iteration we can obtain $v_{i+1}$ (a vertex $v \in \bar{S}_i$ where $L_{i+1}(v)$ is a minimum) in at
most $2(n - 1) + (n - 2) = 3n - 4$ steps. We perform at most $n - 1$ iterations, so we find
for this version of Dijkstra's algorithm, that the worst-case time-complexity is $O(n^2)$.
In order to find a shortest path from $v_0$ to each $v \in V$, $v \ne v_0$, we see that whenever
$L_{i+1}(v) < L_i(v)$, for any $0 \le i \le n - 2$, we need to keep track of the vertex $y \in S_i$ for
which $L_{i+1}(v) = d(v_0, y) + \mathrm{wt}(y, v)$.
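A minimal sketch of this quadratic version follows, assuming the same dictionary representation of wt as before; the name `dijkstra_quadratic` and the variable names are illustrative, and the sample weights are again only those edges of Fig. 13.1 quoted in the text. At each iteration only the most recently selected vertex is used to update the remaining labels.

```python
import math

def dijkstra_quadratic(wt, vertices, v0):
    """Update labels using only the newest vertex v_i; returns (L, pred)."""
    L = {v: math.inf for v in vertices}   # L_0(v) = oo for v != v0
    pred = {v: None for v in vertices}    # predecessor y on a shortest path
    L[v0] = 0                             # L_0(v0) = 0
    outside = set(vertices) - {v0}        # complement of S_i
    vi = v0                               # the i-th closest vertex
    while outside:
        # One addition and one comparison for each vertex outside S_i.
        for v in outside:
            cand = L[vi] + wt.get((vi, v), math.inf)
            if cand < L[v]:
                L[v], pred[v] = cand, vi
        vi = min(sorted(outside), key=L.get)  # next closest vertex v_{i+1}
        if L[vi] == math.inf:                 # remaining vertices are unreachable
            break
        outside.remove(vi)
    return L, pred

# Edge weights of Fig. 13.1 that are quoted in the text (partial list).
wt = {("c", "f"): 6, ("f", "c"): 7, ("c", "h"): 11, ("f", "a"): 11, ("f", "g"): 9,
      ("f", "h"): 4, ("h", "g"): 4, ("h", "a"): 11, ("a", "b"): 5}
L, pred = dijkstra_quadratic(wt, ["a", "b", "c", "f", "g", "h"], "c")
print(L)
```

Recording `pred[v]` whenever a label decreases is exactly the bookkeeping called for in the last paragraph above.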
Other implementations of Dijkstra's algorithm use a data structure called a heap. For a
weighted graph $G = (V, E)$, where $|V| = n$ and $|E| = m$, we find, for example, that the
binary heap implementation of this algorithm has worst-case time-complexity $O(m \log_2 n)$.
(This, and much more, is discussed on pp. 108-122 of the text by R. K. Ahuja, T. L. Magnanti,
and J. B. Orlin [2]. The reader can also find more about various kinds of heaps on pp. 773-787
of this text. Another source for the implementation and running-time of Dijkstra's
algorithm is Section 24.3 (pp. 595-601) of the text by T. H. Cormen, C. E. Leiserson, R. L.
Rivest, and C. Stein [7].)
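As a rough illustration of the heap idea, here is a sketch built on Python's standard `heapq` module (a binary heap). It is not the implementation analyzed in the cited texts: it uses "lazy deletion" (stale heap entries are skipped when popped) rather than a decrease-key operation, and the adjacency-list format and the name `dijkstra_heap` are assumptions. The sample graph again uses only the weights of Fig. 13.1 quoted in the text.

```python
import heapq
import math

def dijkstra_heap(adj, v0):
    """adj maps each vertex to a list of (neighbor, weight); returns (dist, pred)."""
    dist = {v0: 0}
    pred = {v0: None}
    heap = [(0, v0)]          # entries are (tentative distance, vertex)
    done = set()              # vertices whose distance is final
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:         # stale entry left behind by an earlier update
            continue
        done.add(u)
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v], pred[v] = nd, u
                heapq.heappush(heap, (nd, v))
    return dist, pred

# Edges of Fig. 13.1 whose weights are quoted in the text (partial list).
adj = {"c": [("f", 6), ("h", 11)],
       "f": [("c", 7), ("a", 11), ("g", 9), ("h", 4)],
       "h": [("g", 4), ("a", 11)],
       "a": [("b", 5)]}
dist, pred = dijkstra_heap(adj, "c")
print(dist)
```

Vertices that cannot be reached from the start vertex never enter `dist`, which plays the role of the label $(\infty, -)$.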
EXERCISES 13.1
1. Let $G = (V, E)$ be a weighted graph, where for each edge $e = (a, b)$ in $E$, $\mathrm{wt}(a, b)$
equals the distance from $a$ to $b$ along edge $e$. If $(a, b) \notin E$, then $\mathrm{wt}(a, b) = \infty$.
Fix $v_0 \in V$ and let $S \subset V$, with $v_0 \in S$. Then for $\bar{S} = V - S$ we define
$d(v_0, \bar{S}) = \min_{v \in \bar{S}} \{d(v_0, v)\}$. If $v_{m+1} \in \bar{S}$ and $d(v_0, \bar{S}) = d(v_0, v_{m+1})$, then
$P$: $(v_0, v_1), (v_1, v_2), \ldots, (v_{m-1}, v_m), (v_m, v_{m+1})$ is a shortest directed path (in $G$)
from $v_0$ to $v_{m+1}$. Prove that
a) $v_0, v_1, v_2, \ldots, v_{m-1}, v_m \in S$.
b) $P'$: $(v_0, v_1), (v_1, v_2), \ldots, (v_{k-1}, v_k)$ is a shortest directed path (in $G$) from $v_0$
to $v_k$, for each $1 \le k \le m$.
2. a) Apply Dijkstra's algorithm to the weighted graph $G = (V, E)$ in Fig. 13.4, and determine
the shortest distance from vertex $a$ to each of the other six vertices in $G$. Here
$\mathrm{wt}(e) = \mathrm{wt}(x, y) = \mathrm{wt}(y, x)$ for each edge $e = \{x, y\}$ in $E$.
b) Determine a shortest path from vertex $a$ to each of the vertices $c$, $f$, and $i$.
Figure 13.4
3. a) Apply Dijkstra's algorithm to the graph shown in Fig. 13.1 and determine the shortest
distance from vertex $a$ to each of the other vertices in the graph.
b) Find a shortest path from vertex $a$ to each of the vertices $f$, $g$, and $h$.
4. Use the ideas developed at the end of the section to confirm the result obtained in
(a) Example 13.2; and (b) part (a) of Exercise 2.
5. Prove or disprove the following for a weighted graph $G = (V, E)$, where
$V = \{v_0, v_1, v_2, \ldots, v_n\}$ and $e_1 \in E$ with $\mathrm{wt}(e_1) < \mathrm{wt}(e)$ for all $e \in E$, $e \ne e_1$. If
Dijkstra's algorithm is applied to $G$, and the shortest distance $d(v_0, v_i)$ is computed for each
vertex $v_i$, $1 \le i \le n$, then there exists a vertex $v_j$, for some $1 \le j \le n$, where the edge
$e_1$ is used in the shortest path from $v_0$ to $v_j$.
13.2
Minimal Spanning Trees:
The Algorithms of Kruskal and Prim
A loosely coupled computer network is to be set up for a system of seven computers. The
graph $G$ in Fig. 13.5 models the situation. The computers are represented by the vertices
in the graph; the edges represent transmission lines that are being considered for linking
certain pairs of computers. Associated with each edge $e$ in $G$ is a positive real number $\mathrm{wt}(e)$,
the weight of $e$. Here the weight of an edge indicates the projected cost for constructing that
particular transmission line. The objective is to link all the computers while minimizing
the total cost of construction. To do so requires a spanning tree $T$, where the sum of the
weights of the edges in $T$ is minimal. The construction of such an optimal spanning tree can
be accomplished by using the algorithms that were developed by Joseph Bernard Kruskal
(1928– ) and Robert Clay Prim (1921– ).
Figure 13.5
Like Dijkstra's algorithm, these algorithms are greedy; when each is used, at each step
of the process an optimal (here minimal) choice is made from the remaining available data.
Once again, if what appears to be the best choice locally (for example, for a vertex $c$ and
the vertices near $c$) turns out to be the best choice globally (for all vertices of the graph),
then the greedy algorithm will lead to an optimal solution.
We first consider Kruskal’s algorithm. This algorithm is given as follows.
Let $G = (V, E)$ be a loop-free undirected connected graph, where $|V| = n$ and each
edge $e$ is assigned a positive real number $\mathrm{wt}(e)$. To find an optimal (minimal) spanning tree
for $G$, apply the following algorithm.
Kruskal's Algorithm
Step 1: Set the counter $i = 1$ and select an edge $e_1$ in $G$, where $\mathrm{wt}(e_1)$ is as small as
possible.
Step 2: For $1 \le i \le n - 2$, if edges $e_1, e_2, \ldots, e_i$ have been selected, then select
edge $e_{i+1}$ from the remaining edges in $G$ so that (a) $\mathrm{wt}(e_{i+1})$ is as small as possible
and (b) the subgraph of $G$ determined by the edges $e_1, e_2, \ldots, e_i, e_{i+1}$ (and the
vertices they are incident with) contains no cycles.
Step 3: Replace $i$ by $i + 1$.
If $i = n - 1$, the subgraph of $G$ determined by edges $e_1, e_2, \ldots, e_{n-1}$ is connected
with $n$ vertices and $n - 1$ edges, and is an optimal spanning tree for $G$.
If $i < n - 1$, return to step (2).
Before establishing the validity of the algorithm, we consider the following example.
EXAMPLE 13.3 Apply Kruskal's algorithm to the graph shown in Fig. 13.5.
Initialization: (i = 1). Since there is a unique edge, namely {e, g}, of smallest
weight 1, start with T = {{e, g}}. (T starts as a tree with one edge, and after each
iteration it grows into a larger tree or forest. After the last iteration the subgraph T
is an optimal spanning tree for the given graph G.)
First Iteration: Among the remaining edges in G, three have the next smallest weight 2.
Select {d, f}, which satisfies the conditions in step (2). Now T is the forest
{{e, g}, {d, f}}, and i is increased to 2. With i = 2 < 6, return to step (2).
Second Iteration: Two remaining edges have weight 2. Select {d, e}. Now T is the tree
{{e, g}, {d, f}, {d, e}}, and i increases to 3. But because 3 < 6, the algorithm directs
us back to step (2).
Third Iteration: Among the edges of G that are not in T, edge {f, g} has minimal weight 2.
However, if this edge is added to T, the result contains a cycle, which destroys the tree
structure being sought. Consequently, the edges {c, e}, {c, g}, and {d, g} are considered.
Edge {d, g} brings about a cycle, but either {c, e} or {c, g} satisfies the conditions in
step (2). Select {c, e}. T grows to {{e, g}, {d, f}, {d, e}, {c, e}} and i is increased
to 4. Returning to step (2), we find that the fourth and fifth iterations provide the
following.
640 Chapter 13 Optimization and Matching
Fourth Iteration: T = {{e, g}, {d, f}, {d, e}, {c, e}, {b, e}}; i increases to 5.
Fifth Iteration: T = {{e, g}, {d, f}, {d, e}, {c, e}, {b, e}, {a, b}}. The counter i
now becomes 6 = (number of vertices in G) - 1. So T is an optimal tree for graph G and
has weight 1 + 2 + 2 + 3 + 4 + 5 = 17.
Figure 13.6 shows this spanning tree of minimal weight.
Figure 13.6
Example 13.3 demonstrates that Kruskal's algorithm does generate a spanning tree.
This follows from parts (a) and (d) of Theorem 12.5 since the resulting subgraph has n
(= |V|) vertices and n - 1 edges and is connected. In general, if G = (V, E) is a
loop-free weighted connected undirected graph and T is the subgraph of G that is generated
by Kruskal's algorithm, then T has no cycles. Furthermore, T is a spanning subgraph of G.
For if v ∈ V and v is not in T, then we can add an edge e of G to T where e is incident
with v, and the resulting subgraph of G still contains no cycles. Finally, T is connected.
Otherwise T has at least two components, say T_1 and T_2, and since G is connected we
could add to T an edge {x, y} from G where x is in T_1 and y is in T_2, and no cycle would
be present in this subgraph. Consequently, the subgraph T of G is a connected spanning
subgraph of G with no cycles (or loops), so T is a spanning tree of G.
The algorithm is greedy; it selects from the remaining edges an edge of minimal weight
that doesn’t create a cycle. The following result guarantees that the spanning tree obtained
is optimal.
THEOREM 13.1 Let G = (V, E) be a loop-free weighted connected undirected graph. Any spanning tree
for G that is obtained by Kruskal’s algorithm is optimal.
Proof: Let |V| = n, and let T be a spanning tree for G obtained by Kruskal's algorithm. The
edges in T are labeled e_1, e_2, ..., e_{n-1}, according to the order in which they are
generated by the algorithm. For each optimal tree T' of G, define d(T') = k if k is the
smallest positive integer such that T and T' both contain e_1, e_2, ..., e_{k-1}, but
e_k ∉ T'.
Let T_1 be an optimal tree for which d(T_1) = r is maximal. If r = n, then T = T_1 and the
result follows. Otherwise, r ≤ n - 1 and adding edge e_r (of T) to T_1 produces the cycle C,
where there exists an edge e'_r of C that is in T_1 but not in T.
Start with tree T_1. Adding e_r to T_1 and deleting e'_r, we obtain a connected graph with n
vertices and n - 1 edges. This graph is a spanning tree, T_2. The weights of T_1 and T_2
satisfy wt(T_2) = wt(T_1) + wt(e_r) - wt(e'_r).
Following the selection of e_1, e_2, ..., e_{r-1} in Kruskal's algorithm, the edge e_r is
chosen so that wt(e_r) is minimal and no cycle results when e_r is added to the subgraph H
of G determined by e_1, e_2, ..., e_{r-1}. Since e'_r produces no cycle when added to the
subgraph H, by the minimality of wt(e_r) it follows that wt(e'_r) ≥ wt(e_r). Hence
wt(e_r) - wt(e'_r) ≤ 0, so wt(T_2) ≤ wt(T_1). But with T_1 optimal, we must have
wt(T_2) = wt(T_1), so T_2 is optimal.
The tree T_2 is optimal and has the edges e_1, e_2, ..., e_{r-1}, e_r in common with T, so
d(T_2) ≥ r + 1 > r = d(T_1), contradicting the choice of T_1. Consequently, T_1 = T and the
tree T produced by Kruskal's algorithm is optimal.
We measure the worst-case time-complexity for Kruskal’s algorithm by making the fol-
lowing observations. Given a loop-free weighted connected undirected graph G = (V, E),
where |V| = n and |E| = m ≥ 2, we can use the merge sort of Section 12.3 to list (and
relabel, if necessary) the edges in E as e_1, e_2, ..., e_m, where
wt(e_1) ≤ wt(e_2) ≤ ... ≤ wt(e_m). The number of comparisons needed to do this is
O(m log_2 m). Then once we have the edges of G listed in this order (of nondecreasing
weights), step (2) of the algorithm is carried out at most m - 1 times, once for each of
the edges e_2, e_3, ..., e_m.
For each edge e_i, 2 ≤ i ≤ m, we must determine whether e_i causes the formation of a
cycle in the tree, or forest, that we have developed (after considering the edges
e_1, e_2, ..., e_{i-1}). This can be done for each edge in a constant [that is, O(1)]
amount of time, if we use additional data structures, such as the component flag data
structure. Unfortunately, the updating of this data structure cannot be performed in a
constant amount of time. However, it does turn out that all of the work needed for cycle
detection can be carried out in at most O(n log_2 n) steps.†
Consequently, we shall define the worst-case time-complexity function f, for m ≥ 2, as
the sum of the following:
1) The total number of comparisons needed to sort the edges of G into nondecreasing
order, and
2) The total number of steps that are carried out in step (2) in order to detect the formation
of a cycle.
Unless G is a tree, it follows that |V| = n ≤ m = |E| because G is connected. As a result,
n log_2 n ≤ m log_2 m and f ∈ O(m log_2 m).
A measure in terms of n, the number of vertices in G, can also be given. Here n - 1 ≤ m
because the graph is connected, and m ≤ $\binom{n}{2}$ = (1/2)(n)(n - 1), the number of
edges in K_n. Consequently m log_2 m ≤ n^2 log_2 n^2 = 2n^2 log_2 n, and we can express the
worst-case time-complexity of Kruskal's algorithm as O(n^2 log_2 n), although this is less
precise than O(m log_2 m).
A second technique for constructing an optimal tree was developed by Robert Clay Prim.
In this greedy algorithm, the vertices in the graph are partitioned into two sets: processed
and not processed. At first only one vertex is in the set P of processed vertices, and all
other vertices are in the set N of vertices to be processed. Each iteration of the algorithm
increases the set P by one vertex while the size of set N decreases by one. The algorithm
is summarized as follows.
Let G = (V, E) be a loop-free weighted connected undirected graph. To obtain an op-
timal tree T for G, apply the following procedure.
Prim's Algorithm
Step 1: Set the counter i = 1 and place an arbitrary vertex v_1 ∈ V into set P. Define
N = V - {v_1} and T = ∅.
Step 2: For 1 ≤ i ≤ n - 1, where |V| = n, let P = {v_1, v_2, ..., v_i},
T = {e_1, e_2, ..., e_{i-1}}, and N = V - P. Add to T a shortest edge (an edge of minimal
weight) in G that connects a vertex x in P with a vertex y (= v_{i+1}) in N. Place y in P
and delete it from N.
Step 3: Increase the counter by 1.
If i = n, the subgraph of G determined by the edges e_1, e_2, ..., e_{n-1} is connected
with n vertices and n - 1 edges and is an optimal tree for G.
If i < n, return to step (2).
† For more on the analysis of the segment dealing with cycle detection, we refer the reader to Chapter 8 of the
text by S. Baase and A. Van Gelder [3] and to Chapter 4 of the text by E. Horowitz and S. Sahni [17].
EXAMPLE 13.4 We use this algorithm to find an optimal tree for the graph in Fig. 13.5.
Prim's algorithm generates an optimal tree as follows.
Initialization: i = 1; P = {a}; N = {b, c, d, e, f, g}; T = ∅.
First Iteration: T = {{a, b}}; P = {a, b}; N = {c, d, e, f, g}; i = 2.
Second Iteration: T = {{a, b}, {b, e}}; P = {a, b, e}; N = {c, d, f, g}; i = 3.
Third Iteration: T = {{a, b}, {b, e}, {e, g}}; P = {a, b, e, g}; N = {c, d, f}; i = 4.
Fourth Iteration: T = {{a, b}, {b, e}, {e, g}, {d, e}}; P = {a, b, e, g, d};
N = {c, f}; i = 5.
Fifth Iteration: T = {{a, b}, {b, e}, {e, g}, {d, e}, {f, g}}; P = {a, b, e, g, d, f};
N = {c}; i = 6.
Sixth Iteration: T = {{a, b}, {b, e}, {e, g}, {d, e}, {f, g}, {c, g}};
P = {a, b, e, g, d, f, c} = V; N = ∅; i = 7 = |V|. Hence T is an optimal spanning tree
of weight 17 for G, as seen in Fig. 13.7.
Figure 13.7
Note that the minimal spanning tree obtained here differs from that in Fig. 13.6. So this
type of spanning tree need not be unique.
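Prim's procedure, as summarized earlier in this section, can be sketched in Python along the following lines. This is a sketch under stated assumptions rather than the text's own implementation: the binary heap (`heapq`) used to find a lightest edge leaving P is an assumed choice, and the small graph with vertices p, q, r, s is hypothetical (it is not the graph of Fig. 13.5).

```python
import heapq


def prim(adj, start):
    """adj maps each vertex to a list of (weight, neighbor) pairs.

    Returns (tree_edges, total_weight), growing a single tree from `start`.
    """
    processed = {start}                        # the set P of processed vertices
    frontier = [(w, start, y) for w, y in adj[start]]
    heapq.heapify(frontier)                    # candidate edges leaving P
    tree, total = [], 0
    while frontier and len(processed) < len(adj):
        w, x, y = heapq.heappop(frontier)      # lightest candidate edge
        if y in processed:
            continue                           # both endpoints already in P
        processed.add(y)                       # move y from N to P
        tree.append((x, y, w))
        total += w
        for w2, z in adj[y]:
            if z not in processed:
                heapq.heappush(frontier, (w2, y, z))
    return tree, total


# Hypothetical weighted graph, built from an undirected edge list.
edge_list = [("p", "q", 1), ("q", "r", 2), ("p", "r", 3), ("r", "s", 4), ("q", "s", 5)]
adj = {}
for u, v, w in edge_list:
    adj.setdefault(u, []).append((w, v))
    adj.setdefault(v, []).append((w, u))

weights = {start: prim(adj, start)[1] for start in adj}
print(weights)  # every starting vertex yields total weight 7
```

Because the algorithm may be started at any vertex, the loop at the end runs it from each vertex in turn; the edges are found in a different order each time, but the total weight is the same.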
We shall only state the following theorem, which establishes the validity of Prim’s
algorithm. The proof is left for the reader.
THEOREM 13.2 Let G = (V, E) be a loop-free weighted connected undirected graph. Any spanning tree
for G that is obtained by Prim’s algorithm is optimal.
Note that at each iteration Prim’s algorithm always grows a tree. Some iteration(s) of
Kruskal’s algorithm may grow a forest (which is not a tree). Also observe that Prim’s
algorithm can be started at any vertex in the graph.
We conclude this section with a few words and references about the worst-case
time-complexity for Prim's algorithm. When the algorithm is applied to a loop-free weighted
connected undirected graph G = (V, E), where |V| = n and |E| = m, the typical
implementations require O(n^2) steps. (This can be found in Chapter 7 of A. V. Aho,
J. E. Hopcroft, and J. D. Ullman [1]; in Chapter 8 of S. Baase and A. Van Gelder [3]; and
in Chapter 4 of E. Horowitz and S. Sahni [17].) Other implementations of the algorithm have
improved the situation so that it requires O(m log_2 n) steps. (This is discussed in the
articles by R. L. Graham and P. Hell [16]; by D. B. Johnson [18]; and by A. Kershenbaum and
R. Van Slyke [19].) The worst-case time-complexities for various heap implementations are
discussed in Section 13.5 of R. K. Ahuja, T. L. Magnanti, and J. B. Orlin [2] and
Section 23.2 of T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein [7].
EXERCISES 13.2
1. Apply Kruskal's and Prim's algorithms to determine minimal spanning trees for the
graph shown in Fig. 13.8.
Figure 13.8
2. Let G = W_4, the wheel on four spokes. Assign the weights 1, 1, 2, 2, 3, 3, 4, 4 to
the edges of G so that (a) G has a unique minimal spanning tree; (b) G has more than one
minimal spanning tree.
3. Let G = (V, E) be a loop-free weighted connected undirected graph with T = (V, E'),
a minimal spanning tree for G. For v, w ∈ V, is the path from v to w in T a path of
minimum weight in G?
4. Table 13.1 provides information on the distance (in miles) between pairs of cities in
the state of Indiana. A system of highways connecting these seven cities is to be
constructed. Determine which highways should be constructed so that the cost of
construction is minimal. (Assume that the cost of construction of a mile of highway is the
same between every pair of cities.)
5. a) Answer Exercise 4 under the additional requirement that the system includes a
highway directly linking Evansville and Indianapolis.
b) If there must be a direct link between Fort Wayne and Gary in addition to the one
connecting Evansville and Indianapolis, find the minimum number of miles of highway that
must be constructed.
6. Let G = (V, E) be a loop-free weighted connected undirected graph. For n ∈ Z+, let
{e_1, e_2, ..., e_n} be a set of edges (from E) that includes no cycle in G. Modify
Kruskal's algorithm in order to obtain a spanning tree of G that is minimal among all the
spanning trees of G that include the edges e_1, e_2, ..., e_n.
7. a) Modify Kruskal's algorithm to determine an optimal tree of maximal weight.
b) Interpret the information of Exercise 4 in terms of the number of calls that can be
placed between pairs of cities via the adoption of certain new telephone transmission
lines. (Cities that are not directly linked must communicate through one or more
intermediate cities.) How can the seven cities be minimally connected and allow a maximum
number of calls to be placed?
8. Prove Theorem 13.2.
9. Let G = (V, E) be a loop-free weighted connected undirected graph, where for each pair
of distinct edges e_1, e_2 ∈ E, wt(e_1) ≠ wt(e_2). Prove that G has only one minimal
spanning tree.
Table 13.1 (distances in miles)
             | Bloomington | Evansville | Fort Wayne | Gary | Indianapolis | South Bend
Evansville   | 119         | -          | -          | -    | -            | -
Fort Wayne   | 174         | 290        | -          | -    | -            | -
Gary         | 198         | 277        | 132        | -    | -            | -
Indianapolis | 51          | 168        | 121        | 153  | -            | -
South Bend   | 198         | 303        | 79         | 58   | 140          | -
Terre Haute  | 58          | 113        | 201        | 164  | 71           | 196
13.3 Transport Networks: The Max-Flow Min-Cut Theorem
This section provides an application for weighted directed graphs to the flow of a commodity
from a source to a prescribed destination. Such commodities may be gallons of oil that flow
through pipelines or numbers of telephone calls transmitted in a communication system.
In modeling such situations, we interpret the weight of an edge in the directed graph as a
capacity that places an upper limit on, for example, the amount of oil that can flow through
a certain part of a system of pipelines. These ideas are expressed formally in the following
definition.
Definition 13.1 Let N = (V, E) be a loop-free connected directed graph. Then N is called a network, or
transport network, if the following conditions are satisfied:
a) There exists a unique vertex a ∈ V with id(a), the in degree of a, equal to 0. This
vertex a is called the source.
b) There is a unique vertex z ∈ V, called the sink, where od(z), the out degree of z,
equals 0.
c) The graph N is weighted, so there is a function from E to the set of nonnegative integers
that assigns to each edge e = (v, w) ∈ E a capacity, denoted by c(e) = c(v, w).
EXAMPLE 13.5 The graph in Fig. 13.9 is a transport network. Here vertex a is the source, the sink is at
vertex z, and capacities are shown beside each edge. Since c(a, b) + c(a, g) = 5 + 7 = 12,
the amount of the commodity being transported from a to z cannot exceed 12. With
c(d, z) + c(h, z) = 5 + 6 = 11, the amount is further restricted to be no greater than 11.
To determine the maximum amount that can be transported from a to z, we must consider
the capacities of all edges in the network.
Figure 13.9
The following definition is introduced to assist us in solving this problem.
Definition 13.2 If N = (V, E) is a transport network, a function f from E to the nonnegative integers is
called a flow for N if
a) f(e) ≤ c(e) for each edge e ∈ E; and
b) for each v ∈ V, other than the source a or the sink z,
$\sum_{w \in V} f(w, v) = \sum_{w \in V} f(v, w)$. (If there is no edge (v, w), then f(v, w) = 0.)
The first property specifies that the amount of material transported along a given edge
cannot exceed the capacity of that edge. Property (b) enforces a conservation condition:
The amount of material flowing into a vertex v must equal the amount that flows out from
this vertex. This is so for all vertices except the source and the sink.
EXAMPLE 13.6 For the networks in Fig. 13.10, the label x, y on each edge e is determined so that x = c(e)
and y is the value assigned for a possible flow f. The label on each edge e satisfies
f(e) ≤ c(e). In part (a) of the figure, the "flow" into vertex g is 5, but the "flow" out
from that vertex is 2 + 2 = 4. Hence the function f is not a flow in this case. The
function f for part (b) does satisfy both properties, so it is a flow for the given network.
Figure 13.10 [parts (a) and (b); each edge carries a label x, y]
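The two conditions of Definition 13.2 are easy to check mechanically. The following Python sketch does so for the two labelings discussed in Example 13.6. The edge data are transcribed from the labels of Fig. 13.10 as best they can be read and should be treated as an assumption; the function names `is_flow` and `value` are illustrative and not from the text.

```python
def is_flow(cap, f, source, sink):
    """Check conditions (a) and (b) of the definition of a flow."""
    # (a) 0 <= f(e) <= c(e) on every edge
    if any(f[e] < 0 or f[e] > cap[e] for e in cap):
        return False
    # (b) conservation at every vertex other than the source and the sink
    vertices = {v for e in cap for v in e}
    for v in vertices - {source, sink}:
        inflow = sum(f[(x, y)] for (x, y) in cap if y == v)
        outflow = sum(f[(x, y)] for (x, y) in cap if x == v)
        if inflow != outflow:
            return False
    return True


def value(f, source):
    """val(f): total flow on the edges leaving the source."""
    return sum(amount for (x, _), amount in f.items() if x == source)


# Capacities and the two candidate functions (assumed reading of Fig. 13.10).
cap = {("a", "b"): 5, ("a", "g"): 7, ("b", "d"): 4, ("b", "h"): 6, ("g", "b"): 5,
       ("g", "h"): 5, ("h", "d"): 2, ("d", "z"): 5, ("h", "z"): 6}
f_a = {("a", "b"): 3, ("a", "g"): 5, ("b", "d"): 1, ("b", "h"): 4, ("g", "b"): 2,
       ("g", "h"): 2, ("h", "d"): 1, ("d", "z"): 2, ("h", "z"): 5}
f_b = {("a", "b"): 3, ("a", "g"): 5, ("b", "d"): 2, ("b", "h"): 3, ("g", "b"): 2,
       ("g", "h"): 3, ("h", "d"): 2, ("d", "z"): 4, ("h", "z"): 4}

print(is_flow(cap, f_a, "a", "z"))  # False: 5 units enter g but only 4 leave
print(is_flow(cap, f_b, "a", "z"))  # True
print(value(f_b, "a"))              # 8
```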
Definition 13.3 Let f be a flow for a transport network N = (V, E).
a) An edge e of the network is called saturated if f(e) = c(e). When f(e) < c(e), the
edge is called unsaturated.
b) If a is the source of N, then val(f) = $\sum_{v \in V} f(a, v)$ is called the value of the flow.
EXAMPLE 13.7 For the network in Fig. 13.10(b), only the edge (h, d) is saturated. All other edges are
unsaturated. The value of the flow in this network is
$$\text{val}(f) = \sum_{v \in V} f(a, v) = f(a, b) + f(a, g) = 3 + 5 = 8.$$
But is there another flow f_1 such that val(f_1) > 8? The determination of a maximal flow
(a flow that achieves the greatest possible value) is the objective of the remainder of this
section. To accomplish this, we observe that in the network of Fig. 13.10(b),
$$\sum_{v \in V} f(a, v) = 3 + 5 = 8 = 4 + 4 = f(d, z) + f(h, z) = \sum_{v \in V} f(v, z).$$
Consequently, the total flow leaving the source a equals the total flow into the sink z.
The last remark in Example 13.7 seems like a reasonable circumstance, but will it occur
in general? To prove the result for every network, we need the following special type of
cut-set.
Definition 13.4 If N = (V, E) is a transport network and C is a cut-set for the undirected graph associated
with N, then C is called a cut, or an a-z cut, if the removal of the edges in C from the
network results in the separation of a and z.
EXAMPLE 13.8 Each of the dotted curves in Fig. 13.11 indicates a cut for the given network. The cut C_1
consists of the undirected edges {a, g}, {b, d}, {b, g}, and {b, h}. This cut partitions the
vertices of the network into the two sets P = {a, b} and its complement P̄ = {d, g, h, z},
so C_1 is denoted as (P, P̄). The capacity of a cut, denoted c(P, P̄), is defined by
$$c(P, \overline{P}) = \sum_{\substack{v \in P \\ w \in \overline{P}}} c(v, w),$$
the sum of the capacities of all edges (v, w), where v ∈ P and w ∈ P̄. In this example,
c(P, P̄) = c(a, g) + c(b, d) + c(b, h) = 7 + 4 + 6 = 17. [Considering the directed edges
(from P to P̄) in the cut C_1 = (P, P̄), namely (a, g), (b, d), (b, h), we find that the
removal of these edges does not result in a subgraph with two components. However, the
removal of these three edges eliminates all possible directed paths from a to z and no proper
subset of {(a, g), (b, d), (b, h)} has this separating property.]
Figure 13.11
The cut C_2 induces the vertex partition Q = {a, b, g}, Q̄ = {d, h, z} and has capacity
c(Q, Q̄) = c(b, d) + c(b, h) + c(g, h) = 4 + 6 + 5 = 15.
A third cut of interest is the one that induces the vertex partition S = {a, b, d, g, h},
S̄ = {z}. (What are the edges in this cut?) Its capacity is 11.
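The capacity of a cut depends only on the edges directed from P to its complement, which the short Python sketch below makes explicit for the three cuts of Example 13.8. The capacities c(a, b), c(a, g), c(b, d), c(b, h), c(g, h), c(d, z), and c(h, z) are the ones quoted in the text; the capacities 5 for (g, b) and 2 for (h, d) are read from the figure labels and are an assumption. The name `cut_capacity` is illustrative.

```python
def cut_capacity(cap, P):
    """Sum c(v, w) over the edges (v, w) with v in P and w outside P."""
    return sum(c for (v, w), c in cap.items() if v in P and w not in P)


# Capacities of the network of Fig. 13.11 ((g, b) and (h, d) are assumed).
cap = {("a", "b"): 5, ("a", "g"): 7, ("b", "d"): 4, ("b", "h"): 6, ("g", "b"): 5,
       ("g", "h"): 5, ("h", "d"): 2, ("d", "z"): 5, ("h", "z"): 6}

c1 = cut_capacity(cap, {"a", "b"})                 # cut C_1
c2 = cut_capacity(cap, {"a", "b", "g"})            # cut C_2
c3 = cut_capacity(cap, {"a", "b", "d", "g", "h"})  # third cut
print(c1, c2, c3)  # 17 15 11
```

Note that the edge (g, b) is part of the undirected cut-set C_1 but contributes nothing to its capacity, since it is directed from the complement of P into P.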
Using the idea of the capacity of a cut, this next result provides an upper bound for the
value of a flow in a network.
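THEOREM 13.3 Let f be a flow in a network N = (V, E). If C = (P, P̄) is any cut in N, then val(f) cannot
exceed c(P, P̄).
Proof: Let vertex a be the source in N and vertex z the sink. Since id(a) = 0, it follows that
for all w ∈ V, f(w, a) = 0. Consequently,
$$\text{val}(f) = \sum_{v \in V} f(a, v) = \sum_{v \in V} f(a, v) - \sum_{w \in V} f(w, a).$$
By property (b) in the definition of a flow, for all x ∈ P, x ≠ a,
$\sum_{v \in V} f(x, v) - \sum_{w \in V} f(w, x) = 0$.
Adding the results in the above equations yields
$$\begin{aligned}
\text{val}(f) &= \Big[\sum_{v \in V} f(a, v) - \sum_{w \in V} f(w, a)\Big] + \sum_{\substack{x \in P \\ x \neq a}} \Big[\sum_{v \in V} f(x, v) - \sum_{w \in V} f(w, x)\Big] \\
&= \sum_{\substack{x \in P \\ v \in V}} f(x, v) - \sum_{\substack{x \in P \\ w \in V}} f(w, x) \\
&= \Big[\sum_{\substack{x \in P \\ v \in P}} f(x, v) + \sum_{\substack{x \in P \\ v \in \overline{P}}} f(x, v)\Big] - \Big[\sum_{\substack{x \in P \\ w \in P}} f(w, x) + \sum_{\substack{x \in P \\ w \in \overline{P}}} f(w, x)\Big].
\end{aligned}$$
Since
$$\sum_{\substack{x \in P \\ v \in P}} f(x, v) \quad \text{and} \quad \sum_{\substack{x \in P \\ w \in P}} f(w, x)$$
are summed over the same set of all ordered pairs in P × P, these summations are equal.
Consequently,
$$\text{val}(f) = \sum_{\substack{x \in P \\ v \in \overline{P}}} f(x, v) - \sum_{\substack{x \in P \\ w \in \overline{P}}} f(w, x).$$
For all x, w ∈ V, f(w, x) ≥ 0, so
$$\sum_{\substack{x \in P \\ w \in \overline{P}}} f(w, x) \geq 0 \quad \text{and} \quad \text{val}(f) \leq \sum_{\substack{x \in P \\ v \in \overline{P}}} f(x, v) \leq \sum_{\substack{x \in P \\ v \in \overline{P}}} c(x, v) = c(P, \overline{P}).$$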
From Theorem 13.3 we find that in a network N, the value for any flow is less than or
equal to the capacity of any cut in that network. Hence the value of the maximum flow cannot
exceed the minimum capacity over all cuts in a network. For the network in Fig. 13.11, it
can be shown that the cut consisting of edges (d, z) and (h, z) has minimum capacity 11.
Consequently, the maximum flow f for the network satisfies val(f) ≤ 11. It will turn out
that the value of the maximum flow is 11. How to construct such a flow and why its value
equals the minimum capacity among all cuts will be dealt with in this section.
However, before we deal with this construction, let us note that in the proof of
Theorem 13.3, the value of a flow is given by
$$\text{val}(f) = \sum_{\substack{x \in P \\ v \in \overline{P}}} f(x, v) - \sum_{\substack{x \in P \\ w \in \overline{P}}} f(w, x),$$
where (P, P̄) is any cut in N. Therefore, once a flow is constructed in a network, then for
any cut (P, P̄) in the network, the value of the flow equals the sum of the flows in the
edges directed from the vertices in P to those in P̄ minus the sum of the flows in the edges
directed from the vertices in P̄ to those in P.
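The observation just made, together with the bound of Theorem 13.3, can be checked numerically. The Python sketch below enumerates every vertex set P that contains a but not z (a slightly broader family than the cuts of Definition 13.4, for which the same identity holds) and compares the net flow across P with val(f) and with the capacity. The capacities and the flow are an assumed reading of Fig. 13.10(b), and the helper names are illustrative.

```python
from itertools import combinations

# Capacities and flow values (assumed reading of Fig. 13.10(b)).
cap = {("a", "b"): 5, ("a", "g"): 7, ("b", "d"): 4, ("b", "h"): 6, ("g", "b"): 5,
       ("g", "h"): 5, ("h", "d"): 2, ("d", "z"): 5, ("h", "z"): 6}
flow = {("a", "b"): 3, ("a", "g"): 5, ("b", "d"): 2, ("b", "h"): 3, ("g", "b"): 2,
        ("g", "h"): 3, ("h", "d"): 2, ("d", "z"): 4, ("h", "z"): 4}


def net_flow_across(f, P):
    """Flow on edges leaving P minus flow on edges entering P."""
    out = sum(x for (v, w), x in f.items() if v in P and w not in P)
    back = sum(x for (v, w), x in f.items() if v not in P and w in P)
    return out - back


def cut_capacity(c, P):
    """Total capacity of the edges directed from P to its complement."""
    return sum(x for (v, w), x in c.items() if v in P and w not in P)


inner = ["b", "d", "g", "h"]
cuts = [{"a", *extra} for r in range(len(inner) + 1)
        for extra in combinations(inner, r)]

net_values = {net_flow_across(flow, P) for P in cuts}
min_capacity = min(cut_capacity(cap, P) for P in cuts)
print(net_values)    # {8}: the same value across every one of the 16 vertex sets
print(min_capacity)  # 11
```

Every set P gives the same net flow, namely val(f) = 8, and no capacity falls below 11, in agreement with the claim that the cut {(d, z), (h, z)} has minimum capacity.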
This observation leads to the following result.
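COROLLARY 13.1 If f is a flow in a transport network N = (V, E), then the value of the flow from the source
a is equal to the value of the flow into the sink z.
Proof: Let P = {a}, P̄ = V - {a}, and Q = V - {z}, Q̄ = {z}. From the above observation,
$$\sum_{\substack{x \in P \\ v \in \overline{P}}} f(x, v) - \sum_{\substack{x \in P \\ w \in \overline{P}}} f(w, x) = \text{val}(f) = \sum_{\substack{y \in Q \\ v \in \overline{Q}}} f(y, v) - \sum_{\substack{y \in Q \\ w \in \overline{Q}}} f(w, y).$$
With P = {a} and id(a) = 0, we find that
$\sum_{x \in P, w \in \overline{P}} f(w, x) = \sum_{w \in \overline{P}} f(w, a) = 0$.
Similarly, for Q̄ = {z} and od(z) = 0, it follows that
$\sum_{y \in Q, w \in \overline{Q}} f(w, y) = \sum_{y \in Q} f(z, y) = 0$.
Consequently,
$$\sum_{\substack{x \in P \\ v \in \overline{P}}} f(x, v) = \sum_{v \in \overline{P}} f(a, v) = \text{val}(f) = \sum_{\substack{y \in Q \\ v \in \overline{Q}}} f(y, v) = \sum_{y \in Q} f(y, z),$$
and this establishes the corollary.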
Additional properties of flows and cuts in a network are given in the following corollaries.
COROLLARY 13.2 Let f be a flow in a transport network N = (V, E) and let (P, P̄) be a cut, where
val(f) = c(P, P̄). Then f is a maximum flow for the network N and (P, P̄) is a minimum cut
[that is, (P, P̄) has minimum capacity in N].
Proof: If f_1 is any flow in N, then from Theorem 13.3 it follows that
val(f_1) ≤ c(P, P̄) = val(f),
so f is a maximum flow. Likewise, for any cut (Q, Q̄) in N we have
c(P, P̄) = val(f) ≤ c(Q, Q̄),
so (P, P̄) is a minimum cut, again by Theorem 13.3.
COROLLARY 13.3 If f is a maximum flow in a transport network N = (V, E) and (P, P̄) is a minimum cut,
then val(f) ≤ c(P, P̄).
Proof: The proof of this corollary is requested in the Section Exercises.
COROLLARY 13.4 For a transport network N = (V, E), let f be a flow in N and let (P, P̄) be a cut. Then
val(f) = c(P, P̄) if and only if
a) f(e) = c(e) for each edge e = (x, y), where x ∈ P and y ∈ P̄, and
b) f(e) = 0 for each edge e = (v, w), where v ∈ P̄ and w ∈ P.
Furthermore, under these circumstances, f is a maximum flow and (P, P̄) is a minimum
cut.
Proof: The proof of this corollary is requested in the Section Exercises.
We turn now to the main results of the section— namely, (1) developing an efficient
algorithm to solve the Maximum Flow-Minimum Cut (Max-Flow Min-Cut) problem, and
(2) establishing the Max-Flow Min-Cut Theorem. The algorithm we introduce was initially
presented in the work of Lester R. Ford, Jr., and Delbert Ray Fulkerson. Basically, it is
designed to increase the flow in a transport network N iteratively, until no further increase
is possible.
In order to motivate the concepts we shall need here, we start by considering the following
example.
EXAMPLE 13.9 Let N = (V, E) be the transport network shown in part (i) of Fig. 13.12. Examining the
edges (b, z) and (g, z), we see that the value of the flow is 6 + 2 = 8. But neither of
these two edges is saturated, nor is any other edge in N, so we shall try to increase the
present flow. To do so, consider a directed path from a to z, for example, the path p
made up of the edges (a, b) and (b, z) [as in part (ii) of the figure]. For this path we
define Δ_p = min_{e ∈ p} {c(e) - f(e)} = min{8 - 4, 8 - 6} = min{4, 2} = 2. This tells us
that the flow in each of these two edges can be increased by 2, with the conservation of
flow still maintained. The resulting network, in part (iii) of the figure, now has flow
value 8 + 2 = 10.
Figure 13.12 [parts (i) through (vii)]
So far, so good. Now let us try to increase the flow again. This time we use the
directed path p_1 from a to z as shown in part (iv) of Fig. 13.12. This path comprises the
edges (a, d), (d, g), and (g, z) and here Δ_{p_1} = min_{e ∈ p_1} {c(e) - f(e)} =
min{6 - 4, 6 - 3, 5 - 2} = min{2, 3, 3} = 2. The resulting network, with the adjustment
Δ_{p_1} = 2, is shown in Fig. 13.12(v) and it has flow value 12.
Now, at this point, any possible directed a-z path in N [of Fig. 13.12(v)] must use either
edge (a, d) or edge (b, z), both of which are saturated, that is, c(e) = f(e). Consequently,
it may seem that the current flow of 12 is the maximum flow possible.
If, however, we disregard the directions on the edges of the network, it is possible to find
other paths from a to z. Consider one such path, the path p_2 shown in part (vi) of the
figure. This undirected path comprises the edges {a, b}, {b, d}, {d, g}, and {g, z}. Here we
define Δ_{p_2} = min_{e ∈ p_2} {Δ_e}, where Δ_e = c(e) - f(e) for the forward edges
(a, b), (d, g), (g, z), and Δ_e = f(e) for the backward edge going from b to d [the opposite
of the direction for edge (d, b) in N]. So Δ_{p_2} = min[{8 - 6, 6 - 5, 5 - 4} ∪ {1}] = 1.
This increase of one unit of flow is added to the flow for each of the three forward edges
and subtracted from the flow for the one backward edge. The resulting final network appears
in part (vii) of Fig. 13.12, where we see that by decreasing the flow from d to b by one
unit (of flow) we have been able to redirect this one unit from d to g and then from g to z.
So now the flow value for N is 12 + 1 = 13 and this is the maximum flow value possible, for
the edges (b, z) and (g, z) are saturated.
What has taken place in Example 13.9 now leads us to the following.
Definition 13.5 Let N = (V, E) be a transport network and let
a = v_0, e_1, v_1, e_2, v_2, ..., v_{n-1}, e_n, v_n = z
be an alternating sequence of vertices and edges, where the edges are taken from the
undirected graph associated with N. This sequence is called a semipath.*
For 2 ≤ i ≤ n - 1, if e_i = (v_{i-1}, v_i), that is, e_i is the directed edge in N from
v_{i-1} to v_i, then e_i is called a forward edge. In the case where 2 ≤ j ≤ n - 1 and
e_j = (v_j, v_{j-1}), that is, (v_j, v_{j-1}) is the actual directed edge in N, then e_j is
called a backward edge.
When all of the edges in a semipath are forward edges (in N), then we have a directed
path from a to z in N. It is only when there is at least one backward edge (from N) that the
path in the associated undirected graph is a semipath.
Our next idea takes the notion of the semipath one step further.
Definition 13.6 Let f be a flow in a transport network N = (V, E). An f-augmenting path p is a semipath
(from a to z) where for each edge e on p we have
f(e) < c(e), for e a forward edge
f(e) > 0, for e a backward edge.
From Definition 13.6 we see that along an f-augmenting path p the flow on a forward
edge can be increased, for no such forward edge is saturated. [Note that here we could
have f(e) = 0.] For each backward edge the flow is positive, so it can be decreased (and
redirected elsewhere). The maximum possible increase or decrease is given in terms of Δ_e,
the tolerance on an edge e, as we learn in the following.
Definition 13.7 Let p be an f-augmenting path in a transport network N = (V, E). For each edge e on the
semipath p,
Δ_e = c(e) - f(e), for e a forward edge
Δ_e = f(e), for e a backward edge.
The quantity Δ_e is often called the tolerance on edge e.
* Some authors use the term chain or quasi-path in place of semipath.
Note that in Definition 13.7 we have Δ_e > 0 for each edge e on p. Further, we find that
Δ_p = min_{e ∈ p} {Δ_e} is the maximum increase (for the forward edges) and maximum decrease
(for the backward edges) that we can have and still maintain the conservation condition in
part (b) of Definition 13.2.
Our next result formally establishes what was described in Definition 13.7 and the
paragraph that followed.
THEOREM 13.4 Let f be a flow in a transport network N = (V, E) and let p be an f-augmenting path in
N with Δ_p = min_{e ∈ p} {Δ_e}. Define f_1: E → ℕ by
$$f_1(e) = \begin{cases} f(e) + \Delta_p, & e \in p,\ e \text{ a forward edge} \\ f(e) - \Delta_p, & e \in p,\ e \text{ a backward edge} \\ f(e), & e \in E,\ e \notin p. \end{cases}$$
Then f_1 is a flow in N with val(f_1) = val(f) + Δ_p.
Proof: From the definition of Δ_p we have 0 ≤ f_1(e) ≤ c(e), for each e ∈ E. So f_1 satisfies
condition (a) of Definition 13.2. To establish condition (b) of Definition 13.2 for f_1, we
only need to consider those v ∈ V where v is on the semipath p and v ≠ a, z. So let
{v_i, v} and {v, v_{i+2}} be the two edges in p that are incident with v. When we consider
the net change at v, we see in the four cases of Fig. 13.13 that this change is 0.
Consequently, f_1 satisfies condition (b) and is a flow.
Figure 13.13 [four cases for the edges {v_i, v} and {v, v_{i+2}}]
Case 1: The Δ_p additional units of flow that come into v along {v_i, v} are
counterbalanced by the Δ_p units that leave v along {v, v_{i+2}}.
Case 2: The Δ_p additional units of flow that come into v along {v_i, v} are
counterbalanced by the Δ_p units from v_{i+2} that are redirected away from v.
Case 3: The Δ_p units of flow redirected from v_i into v are counterbalanced by the Δ_p
units from v_{i+2} that are redirected away from v.
Case 4: The Δ_p units of flow redirected from v_i into v are counterbalanced by the Δ_p
units that leave v along {v, v_{i+2}}.
To determine val(f_1) we consider e_1 = (v_0, v_1) = (a, v_1), the first edge on the
f-augmenting path p. Then e_1 is adjacent from the source a and it follows from part (b) of
Definition 13.3 that
$$\text{val}(f_1) = \sum_{v \in V} f_1(a, v) = \sum_{v \in V - \{v_1\}} f_1(a, v) + f_1(a, v_1) = \sum_{v \in V - \{v_1\}} f(a, v) + f(a, v_1) + \Delta_p = \sum_{v \in V} f(a, v) + \Delta_p = \text{val}(f) + \Delta_p.$$
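The construction of f_1 in Theorem 13.4 amounts to a few lines of code. The Python sketch below applies it to the semipath p_2 of Example 13.9, using the capacities and flows quoted there; only the four edges of the semipath are listed, and the capacity 4 for edge (d, b) is read from the figure and is an assumption (it plays no role in the computation). The function name `augment` and the direction tags are illustrative.

```python
def augment(cap, f, path):
    """Return (f1, delta) for an f-augmenting path.

    path: list of (edge, direction) pairs, direction being "fwd" or "bwd".
    """
    # Tolerance on each edge: c(e) - f(e) if forward, f(e) if backward.
    tolerances = [cap[e] - f[e] if d == "fwd" else f[e] for e, d in path]
    delta = min(tolerances)
    f1 = dict(f)                       # edges off the path keep their flow
    for e, d in path:
        f1[e] += delta if d == "fwd" else -delta
    return f1, delta


# Edges of the semipath p_2 in Example 13.9, with flows as in Fig. 13.12(v).
cap = {("a", "b"): 8, ("d", "b"): 4, ("d", "g"): 6, ("g", "z"): 5}
f = {("a", "b"): 6, ("d", "b"): 1, ("d", "g"): 5, ("g", "z"): 4}
p2 = [(("a", "b"), "fwd"), (("d", "b"), "bwd"),
      (("d", "g"), "fwd"), (("g", "z"), "fwd")]

f1, delta = augment(cap, f, p2)
print(delta)  # 1
print(f1)     # {('a', 'b'): 7, ('d', 'b'): 0, ('d', 'g'): 6, ('g', 'z'): 5}
```

The one unit removed from the backward edge (d, b) is the unit that the example describes as being redirected from d to g and then from g to z.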
The result of Theorem 13.4 now helps us in characterizing a maximum flow in a transport
network.
THEOREM 13.5 Let N = (V, E) be a transport network with flow f. The flow f is a maximum flow in N
if and only if there exists no f-augmenting path in N.
Proof: If f is a maximum flow in N, then it follows from Theorem 13.4 that there is no
f-augmenting path in N.
Conversely, if there is no f-augmenting path in N, consider the set of all partial semipaths
in N that start at a. We call each of these edge sets a partial semipath because it cannot
reach z, without contradicting the hypothesis. Let P be the union of the vertices in these
partial semipaths. Then a ∈ P, and P ≠ V as z ∈ P̄. Further, (P, P̄) is a cut for N and,
i) if e = (u, w) ∈ E with u ∈ P, w ∈ P̄, then f(e) = c(e); otherwise, w ∈ P;
ii) if e = (u, w) ∈ E with w ∈ P, u ∈ P̄, then f(e) = 0; otherwise, f(e) > 0 and
u ∈ P.
Consequently, from Corollary 13.4, it follows that f is a maximum flow.
We now turn to the main result of the section.
THEOREM 13.6 The Max-Flow Min-Cut Theorem. For a transport network N = (V, E), the maximum flow
value that can be attained in N is equal to the minimum capacity over all cuts in the network.
Proof: Let f be a flow for which val(f) is a maximum. Then let (P, P̄) be the cut
constructed as in Theorem 13.5. We know from Corollary 13.4 that val(f) = c(P, P̄). And
then Corollary 13.2 shows us that (P, P̄) is a minimum cut.
Now that we have dispensed with the necessary theory it is time to develop an efficient
way of determining a maximum flow and minimum cut for a given transport network N.
The discussion in Example 13.9 might suggest that we should simply find f-augmenting
paths and use them to continue increasing the existing flow in N. However, this may prove
to be tedious and inefficient as our next example demonstrates.
Consider the transport network N = (V, E) in Fig. 13.14(i), where the initial flow is
EXAMPLE 13.10
given as f(e) = 0 for each e € E. The capacities for the edges are c(a, b) = c(b, z) =
c(a, d) = c(d, z) = 10 and c(d, b) = 1. If we use the directed paths (a, b), (b, z) and then
(a, d), (d, z) as successive f-augmenting paths, we attain the flow in part (ii) of the fig-
ure after two iterations. Here we find that val(f) = 20 and this is a maximum flow since
20 = c(P, P) for P = {a}. If, instead, we start with the directed path (a, d), (d, b), (b, z)
and then the semipath {a, b}, {b, d}, {d, z} as our first two successive f-augmenting paths,
we attain the flow in Fig. 13.14(iii) where val(f) = 2. Should we continue to alternately
use these two f-augmenting paths, we will have to perform 20 iterations in total before we
attain the flow in part (ii) of the figure.
Figure 13.14 [parts (i), (ii), and (iii)]
What do we observe here? The directed paths (a, b), (b, z) and (a, d), (d, z) each have
two edges, while the directed path (a, d), (d, b), (b, z) and the semipath {a, b}, {b, d},
{d, z} each have three edges. Further, note how the first iteration in Example 13.9 used a
directed path with two edges, the second iteration a directed path with three edges, and the
third iteration a semipath of four edges.
The observations made in Example 13.10 suggest that for each iteration it is more
efficient to use an f-augmenting path with the least number of edges. This idea was used
by Jack Edmonds and Richard M. Karp in the development of an algorithm to find such
f-augmenting paths. Their approach uses a breadth-first search and, as in Prim’s algorithm,
the vertex set V is partitioned as P ∪ P̄, where P accounts for the processed vertices.
However, before we can deal with this algorithm we need one additional idea.
Definition 13.8 Let N = (V, E) be a transport network with flow f. Start to construct a breadth-first
spanning tree T for N (as an undirected graph) using the source a as the root, and a prescribed
order for the other vertices in V. While the sink z is not a vertex in T, let e = {v, w} be the
newest edge appended in the construction of T, with v in the present tree and w the new
vertex. The edge e is called usable if
e = (v, w) with f(e) < c(e), or
e = (w, v) with f(e) > 0.
Now we are ready to deal with the following algorithm. Here the input is a transport
network N = (V, E) with flow f. The output is an f-augmenting path p, with a
minimum number of edges, if one exists; otherwise, the output is a minimum cut (P, P̄) with
c(P, P̄) = val(f).
The Edmonds-Karp Algorithm
Step 1: Place the source a into set P (thus initializing the set of processed vertices).
Assign the label ( , 1) to a and set the counter i = 2.
Step 2: While the sink z is not in P
    If there is a usable edge in N
        Let e = {v, w} be usable with labeled vertex v having
        the smallest counter assignment
        If w is unlabeled
            Label w with (v, i)
            Place w in P
            Increase the counter i by 1.
    Else
        Return the minimum cut (P, P̄).
Step 3: If z is in P, start with z and backtrack to a using the first component of the
vertex labels. (This provides an f-augmenting path p with the smallest number of
edges.)
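The labeling procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the function name `shortest_augmenting_path`, the edge-dictionary representation, and the use of a first-in, first-out queue to realize "the labeled vertex with the smallest counter" are choices made here. The prescribed order is taken to be alphabetic, and the demonstration uses the capacities of Example 13.10.

```python
from collections import deque


def shortest_augmenting_path(cap, flow, source="a", sink="z"):
    """Breadth-first labeling over usable edges.

    Returns ("path", [(edge, direction), ...]) when the sink is labeled,
    and ("cut", P) with P the set of labeled vertices otherwise.
    """
    label = {source: None}      # vertex -> (predecessor, edge, direction)
    queue = deque([source])     # FIFO order = increasing counter values
    while queue and sink not in label:
        v = queue.popleft()
        candidates = []
        for (x, y), c in cap.items():
            if x == v and flow[(x, y)] < c:
                candidates.append((y, (x, y), +1))     # usable forward edge
            elif y == v and flow[(x, y)] > 0:
                candidates.append((x, (x, y), -1))     # usable backward edge
        for w, edge, direction in sorted(candidates):  # prescribed order
            if w not in label:
                label[w] = (v, edge, direction)
                queue.append(w)
    if sink not in label:
        return "cut", set(label)
    path, w = [], sink
    while label[w] is not None:                        # backtrack from z to a
        v, edge, direction = label[w]
        path.append((edge, direction))
        w = v
    return "path", path[::-1]


CAP = {("a", "b"): 10, ("b", "z"): 10, ("a", "d"): 10, ("d", "z"): 10, ("d", "b"): 1}
ZERO = {e: 0 for e in CAP}
FULL = {**CAP, ("d", "b"): 0}       # the maximum flow of Example 13.10
print(shortest_augmenting_path(CAP, ZERO))
print(shortest_augmenting_path(CAP, FULL))
```

With the zero flow the search returns a two-edge directed path; with the maximum flow no edge out of a is usable, so the labeled set reduces to the source alone.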
At this point we have finally arrived at the algorithm for determining a maximum flow
and minimum cut for a transport network N = (V, E). The original version of this algorithm
was developed by Lester R. Ford, Jr., and Delbert Ray Fulkerson. Here we shall incorporate
the previous algorithm by Jack Edmonds and Richard M. Karp in order to improve the
efficiency of the original algorithm.
As with the preceding algorithm, the input is again a transport network N = (V, E). The
output is a maximum flow and minimum cut for N.
The Ford-Fulkerson Algorithm
Step 1: Define the initial flow f on the edges of N by f(e) = 0 for each e ∈ E.
Step 2: Repeat
    Apply the Edmonds-Karp algorithm to determine
    an f-augmenting path p.
    Let Δ_p = min {Δ_e | e ∈ p}.
    For each e ∈ p
        If e is a forward edge
            f(e) := f(e) + Δ_p
        Else (e is a backward edge)
            f(e) := f(e) − Δ_p
Until no f-augmenting path p can be found in N.
Return the maximum flow f.
Step 3: Return the minimum cut (P, P̄) (from the last application of the
Edmonds-Karp algorithm, where no further f-augmenting path could be constructed).
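The combined procedure can be sketched in a few lines of Python. The sketch below is an illustration under assumptions: the function name `max_flow_min_cut` and the edge-dictionary representation are introduced here, and the breadth-first search scans edges in dictionary order rather than in a prescribed vertex order (this still yields semipaths with the fewest edges). It is run on the network of Example 13.10.

```python
from collections import deque


def max_flow_min_cut(cap, source, sink):
    """Augment along shortest f-augmenting semipaths until none remains.

    Returns (value, flow, P), where P is the source side of a minimum cut.
    """
    flow = {e: 0 for e in cap}                        # Step 1: zero flow
    while True:
        pred = {source: None}   # vertex -> (predecessor, edge, direction)
        queue = deque([source])
        while queue and sink not in pred:
            v = queue.popleft()
            for (x, y), c in cap.items():
                if x == v and y not in pred and flow[(x, y)] < c:
                    pred[y] = (v, (x, y), +1)         # usable forward edge
                    queue.append(y)
                elif y == v and x not in pred and flow[(x, y)] > 0:
                    pred[x] = (v, (x, y), -1)         # usable backward edge
                    queue.append(x)
        if sink not in pred:                          # no f-augmenting path
            out_of_source = sum(f for (x, _), f in flow.items() if x == source)
            into_source = sum(f for (_, y), f in flow.items() if y == source)
            return out_of_source - into_source, flow, set(pred)
        path, w = [], sink
        while pred[w] is not None:                    # backtrack from the sink
            v, edge, direction = pred[w]
            path.append((edge, direction))
            w = v
        # Tolerance of the path: the smallest edge tolerance along it.
        delta = min(cap[e] - flow[e] if d > 0 else flow[e] for e, d in path)
        for e, d in path:                             # Step 2: update f
            flow[e] += d * delta


CAP = {("a", "b"): 10, ("b", "z"): 10, ("a", "d"): 10, ("d", "z"): 10, ("d", "b"): 1}
value, flow, P = max_flow_min_cut(CAP, "a", "z")
print(value, sorted(P))
```

For this network the sketch stops after two augmentations, and the returned set P = {a} gives the minimum cut quoted in Example 13.10.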
Before demonstrating the use of the Ford-Fulkerson and Edmonds-Karp algorithms we
state one last corollary and some related comments. The proof of the corollary is left as an
exercise.
COROLLARY 13.5 Let N = (V, E) be a transport network where for each e ∈ E, c(e) is a positive integer.
Then there is a maximum flow f for N, where f(e) is a nonnegative integer for each
edge e.
The definition of transport network and flow (in a transport network) may be modified to
allow nonnegative real-valued capacity and flow functions. If the capacities in a transport
network are rational numbers, then the Ford-Fulkerson algorithm will terminate and attain
a maximum flow and minimum cut. When some capacities are irrational, however, the
original algorithm developed by L. R. Ford, Jr., and D. R. Fulkerson may not terminate
correctly. Furthermore, Ford and Fulkerson [14] showed that their algorithm could result
in a flow — but that the flow need not be a maximum flow. When irrational capacities do
arise, the modification given by Edmonds and Karp [11] terminates and attains a maximum
flow. Further, the Edmonds-Karp algorithm can be implemented so that its worst-case
time-complexity is O(nm²), where n = |V|, m = |E|, for N = (V, E). (For more on the
time-complexity of this algorithm one should examine Section 6.5 of Ahuja, Magnanti, and
Orlin [2] and Chapter 26 of Cormen, Leiserson, Rivest, and Stein [7].)
EXAMPLE 13.11 Use the Ford-Fulkerson and Edmonds-Karp algorithms to find a maximum flow for the
transport network in Fig. 13.15(i).
In the transport network N = (V, E) [of Fig. 13.15(i)], each edge is labeled with a pair
of nonnegative integers x, y, where x is the capacity of the edge and y = 0 indicates an
initial flow. This follows from step (1) of the Ford-Fulkerson algorithm.
Figure 13.15 [parts (i) through (xii): (i) val(f) = 0; (iii) Δ_p = 3; (iv) val(f) = 3; (vi) Δ_p = 2; (vii) val(f) = 5; (ix) Δ_p = 1; (x) val(f) = 6; (xii) Δ_p = 2; parts (ii), (v), (viii), and (xi) show the partial breadth-first spanning trees]
When applying the Edmonds-Karp algorithm the prescribed order for the vertices in
V − {a} will be alphabetic. Applying this algorithm for the first time, in step (1) we label a with
( , 1), place a in P, and set the counter i to 2. In step (2) we find there are three usable
(forward) edges: (a, b), (a, d), and (a, g). Following the prescribed order, we select (a, b),
label b with (a, 2), place b in P, and increase the counter to 3. Executing step (2) a second
time, we select (a, d), label d with (a, 3), place d in P, and increase the counter to 4. At
this point, step (2) is executed a third time, for edge (a, g). So we label g with (a, 4), place
g in P, and increase the counter to 5.
The edge (b, j) is usable with b having the smallest counter label. [None of the edges
(a, b), (a, d), (a, g) is usable at this stage.] Now in step (2) the vertex j is labeled with
(b, 5), j is placed in P, and the counter is increased to 6. For the vertex d in P, the edge
(k, d) is not usable because the flow in this edge is 0. The next application of step (2),
consequently, results in the label (d, 6) on z, places z in P, and increases the counter to
7. But with z in P we are finished with step (2), and so we arrive at the partial breadth-first
spanning tree (for the undirected graph associated with N) rooted at a, as shown
in Fig. 13.15(ii). Backtracking in step (3) of the Edmonds-Karp algorithm now provides
the f-augmenting path p: (a, d), (d, z), where Δ_p = min{3 − 0, 5 − 0} = 3, as shown in
Fig. 13.15(iii).
At this point, we go to step (2) of the Ford-Fulkerson algorithm and increase the flow
on (a, d) from 0 to 3 and that on (d, z) from 0 to 3. The result is the transport network in
Fig. 13.15(iv), where val(f) = 3.
We now return to the Edmonds-Karp algorithm to determine the next f-augmenting path.
The resulting partial breadth-first spanning tree for this is shown in part (v) of the figure.
The corresponding f-augmenting path p in Fig. 13.15(vi) has tolerance Δ_p = min{3 − 0,
6 − 0, 4 − 0, 5 − 3} = 2. Step (2) of the Ford-Fulkerson algorithm then provides the
network in Fig. 13.15(vii), where val(f) = 3 + Δ_p = 5. The next (similar) iteration takes us
from this transport network to the one in Fig. 13.15(x), where the flow is now 6. When the
Edmonds-Karp algorithm is invoked at this stage, the resulting breadth-first spanning tree is
shown in Fig. 13.15(xi). In this application of the algorithm, after we label d with (k, 5), we
next label h because we now have the usable (back) edge (h, d), for the flow from h to d
is 2 (> 0). Backtracking from z to a in the tree in part (xi) results in the f-augmenting path
p in part (xii) with Δ_p = min{4 − 0, 6 − 0, 5 − 0, 4 − 0, 2, 4 − 1, 8 − 1, 7 − 1} = 2.
This now brings us to the transport network in Fig. 13.16(i), where val(f) = 8. If we
try to apply the Edmonds-Karp algorithm to find the next f-augmenting path, we obtain
the partial breadth-first spanning tree in Fig. 13.16(ii). At this point, P = {a, b, j, k, d} so
z ∉ P, and there are no other usable edges. Consequently, the last line of step (2) provides the
minimum cut (P, P̄), where P̄ = {g, h, m, n, z}, as shown in Fig. 13.16(iii). Further, from
the edges that are crossed by the dotted curve, we have val(f) = f((a, g)) + f((d, z)) −
f((h, d)) = 3 + 5 − 0 = 8 = c(P, P̄).
Figure 13.16 [parts (i), (ii), and (iii); part (ii) shows P = {a, b, j, k, d} and part (iii) the cut (P, P̄)]
We close this section with three examples that are modeled with the concept of the
transport network. After setting up the models, the final solution of each example is left to
the Section Exercises.
EXAMPLE 13.12 Computer chips are manufactured (in units of a thousand) at three companies, c1, c2, and
c3. These chips are then distributed to two computer manufacturers, m1 and m2, through
the “transport network” in Fig. 13.17(a), where there are the three sources (c1, c2, and
c3) and the two sinks, m1 and m2. Company c1 can produce up to 15 units, company c2
up to 20 units, and company c3 up to 25 units. If each manufacturer needs 25 units, how
many units should each company produce so that together they can meet the demand of
each manufacturer or at least supply them with as many units as the network will allow?
Figure 13.17 [parts (a) and (b)]
In order to model this example with a transport network, we introduce a source a and a
sink z, as shown in Fig. 13.17(b). The manufacturing capabilities of the three companies
are then used to define capacities for the edges (a, c1), (a, c2), and (a, c3). For the edges
(m1, z) and (m2, z) the demands are used as capacities. To answer the question posed here,
one applies the Edmonds-Karp and Ford-Fulkerson algorithms to this network to find the
value of a maximum flow.
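The construction with a single source and a single sink can be sketched as a small helper. The Python below is an illustration only; the function name `add_super_terminals` and the edge-dictionary representation are assumptions made here. The interior edges of Fig. 13.17(a) are not reproduced in the text, so the demonstration adds only the new edges and computes the simple bound that follows from the two cuts ({a}, rest) and (rest, {z}).

```python
def add_super_terminals(cap, supplies, demands, source="a", sink="z"):
    """Return a copy of cap extended by one source and one sink.

    supplies maps each producer to its production limit and demands maps
    each consumer to its requirement; both become capacities of new edges.
    """
    extended = dict(cap)
    for producer, limit in supplies.items():
        extended[(source, producer)] = limit
    for consumer, need in demands.items():
        extended[(consumer, sink)] = need
    return extended


# Interior edges are left out because their capacities appear only in
# Fig. 13.17(a); pass them as the first argument when they are known.
network = add_super_terminals(
    {}, {"c1": 15, "c2": 20, "c3": 25}, {"m1": 25, "m2": 25}
)

# No flow can exceed the capacity of the edges leaving a or entering z.
upper_bound = min(
    sum(c for (u, _), c in network.items() if u == "a"),
    sum(c for (_, w), c in network.items() if w == "z"),
)
print(upper_bound)
```

Whether this bound is attained depends on the interior capacities, which is exactly what the maximum-flow computation decides.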
EXAMPLE 13.13 The transport network shown in Fig. 13.18(a) has an added restriction, for now there are
capacities assigned to vertices other than the source and sink. Such a capacity places an
upper limit on the amount of the commodity in question that may pass through a given
vertex. Part (b) of the figure shows how to redraw the network in order to obtain one where
the Edmonds-Karp and Ford-Fulkerson algorithms can be applied. For each vertex v other
than a or z, split v into vertices v1 and v2. Draw an edge from v1 to v2 and label it with
the capacity originally assigned to v. An edge of the form (v, w), where v ≠ a, w ≠ z,
then becomes the edge (v2, w1), maintaining the capacity of (v, w). Edges of the form
(a, v) become (a, v1) with capacity c(a, v). An edge such as (w, z) is replaced by the edge
(w2, z), with capacity c(w, z).
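The vertex-splitting rule translates directly into code. The following Python sketch is an illustration under assumptions: the name `split_vertices` and the string suffixes "1" and "2" are choices made here, every vertex other than the source and sink is assumed to carry a capacity, and the small network in the demonstration is hypothetical (it is not the network of Fig. 13.18).

```python
def split_vertices(edge_cap, vertex_cap, source="a", sink="z"):
    """Replace each capacitated vertex v by an edge (v1, v2) of that capacity.

    Assumes every vertex other than the source and sink appears in vertex_cap.
    """
    def tail(v):    # edges leaving v now start at v2
        return v if v in (source, sink) else v + "2"

    def head(v):    # edges entering v now end at v1
        return v if v in (source, sink) else v + "1"

    new_cap = {(v + "1", v + "2"): c for v, c in vertex_cap.items()}
    for (u, w), c in edge_cap.items():
        new_cap[(tail(u), head(w))] = c
    return new_cap


# Hypothetical data, for illustration only.
edges = {("a", "b"): 10, ("b", "g"): 5, ("g", "z"): 15}
vertices = {"b": 15, "g": 20}
print(split_vertices(edges, vertices))
```

The resulting dictionary describes an ordinary transport network, so any maximum-flow routine that works on edge capacities can be applied to it unchanged.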
Figure 13.18 [(a) the network with vertex capacities; (b) the redrawn network with each vertex v split into v1 and v2]
The maximum flow for the given network is now determined by applying the Edmonds-
Karp and Ford-Fulkerson algorithms to the network shown in Fig. 13.18(b).
EXAMPLE 13.14 During the practice of war games, messengers must deliver information from headquarters
(vertex a) to a field command station (vertex z). Since certain roads may be blocked or
destroyed, how many messengers should be sent out so that each travels along a path that
has no edge in common with any other path taken?
Since the distances between vertices are not relevant here, the graph shown in Fig. 13.19
has no capacities assigned to its edges. The problem here is to determine the maximum
number of edge-disjoint paths from a to z. Assigning each edge a capacity of 1 converts the
problem into a maximum-flow problem, where the number of edge-disjoint paths (from a
to z) equals the value of a maximum flow for the network.
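The unit-capacity idea can be sketched as follows. This Python fragment is an illustration under assumptions: the function name `count_edge_disjoint_paths` is introduced here, the edges are treated as directed, and the edge list in the demonstration is hypothetical because the graph of Fig. 13.19 is not reproduced in the text. Each augmentation pushes one unit along a shortest path in the residual network.

```python
from collections import deque


def count_edge_disjoint_paths(edges, source, sink):
    """Maximum number of edge-disjoint directed paths, via unit capacities."""
    residual = {}                       # residual[u][w] = remaining capacity
    for u, w in edges:
        residual.setdefault(u, {})[w] = residual.get(u, {}).get(w, 0) + 1
        residual.setdefault(w, {}).setdefault(u, 0)
    paths = 0
    while True:
        pred = {source: None}           # breadth-first search for a path
        queue = deque([source])
        while queue and sink not in pred:
            v = queue.popleft()
            for w, r in residual.get(v, {}).items():
                if r > 0 and w not in pred:
                    pred[w] = v
                    queue.append(w)
        if sink not in pred:
            return paths
        w = sink
        while pred[w] is not None:      # push one unit along the path found
            v = pred[w]
            residual[v][w] -= 1
            residual[w][v] += 1
            w = v
        paths += 1


# Hypothetical road network, for illustration only.
roads = [("a", "b"), ("a", "d"), ("a", "g"), ("b", "d"),
         ("b", "z"), ("d", "z"), ("g", "d")]
print(count_edge_disjoint_paths(roads, "a", "z"))
```

In this hypothetical network only two edges enter z, so at most two messengers can travel on edge-disjoint paths, and the computation confirms that two such paths exist.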
Figure 13.19
EXERCISES 13.3
1. a) For the network shown in Fig. 13.20, let the capacity of each edge be 10. If each edge e
in the figure is labeled by a function f, as shown, determine the values of s, t, w, x, and
y so that f is a flow in the network.
b) What is the value of this flow?
c) Find three cuts (P, P̄) in this network that have capacity 30.
Figure 13.20
2. Prove Corollaries 13.3 and 13.4.
3. Find a maximum flow and the corresponding minimum cut for each transport network
shown in Fig. 13.21.
Figure 13.21
4. Apply the Edmonds-Karp and Ford-Fulkerson algorithms to find a maximum flow in
Examples 13.12, 13.13, and 13.14.
5. Prove Corollary 13.5.
6. In each of the following “transport networks” two companies, c1 and c2, produce a certain
product that is used by two manufacturers, m1 and m2. For the network shown in part (a) of
Fig. 13.22, company c1 can produce 8 units and company c2 can produce 7 units; manufacturer
m1 requires 7 units and manufacturer m2 needs 6 units. In the network shown in Fig. 13.22(b),
each company can produce 7 units and each manufacturer needs 6 units. In which situation(s)
can the producers meet the manufacturers’ demands?
Figure 13.22 [parts (a) and (b)]
7. Find a maximum flow for the network shown in Fig. 13.23. The capacities on the undirected
edges indicate that the capacity is the same in either direction. [However, for an undirected
edge a flow can go in only one direction at a time as opposed to the situation for vertices
b, g in Fig. 13.18(a).]
Figure 13.23
13.4
Matching Theory
The Villa school district must hire four teachers to teach classes in the following subjects:
mathematics (s1), computer science (s2), chemistry (s3), physics (s4), and biology (s5). Four
candidates who are interested in teaching in this district are Miss Carelli (c1), Mr. Ritter
(c2), Ms. Camille (c3), and Mrs. Lewis (c4). Miss Carelli is certified in mathematics and
computer science; Mr. Ritter in mathematics and physics; Ms. Camille in biology; and Mrs.
Lewis in chemistry, physics, and computer science. If the district hires all four candidates,
can each teacher be assigned to teach a (different) subject in which he or she is certified?
This problem is an example of a general situation called the assignment problem. Using
the Principle of Inclusion and Exclusion in conjunction with the rook polynomial (see
Sections 8.4 and 8.5), one can determine in how many ways, if any, the four teachers may
be assigned so that each teaches a different subject for which he or she is qualified. However,
these techniques do not provide a means of setting up any of these assignments. In Fig. 13.24
the problem is modeled by means of a bipartite graph G = (V, E), where V is partitioned
as X ∪ Y with X = {c1, c2, c3, c4} and Y = {s1, s2, s3, s4, s5}, and the edges of G represent
the qualifications for the individual teachers. The edges {c1, s2}, {c2, s4}, {c3, s5}, {c4, s3}
demonstrate such an assignment of X into Y.
Figure 13.24
To examine this idea further, the following concepts are introduced.
Definition 13.9 Let G = (V, E) be a bipartite graph with V partitioned as X ∪ Y. (Each edge of E has the
form {x, y} with x ∈ X and y ∈ Y.)
a) A matching in G is a subset of E such that no two edges share a common vertex in X
or Y.
b) A complete matching of X into Y is a matching in G such that every x ∈ X is the
endpoint of an edge.
In terms of functions, a matching is a function that establishes a one-to-one correspon-
dence between a subset of X and a subset of Y. When the matching is complete, a one-to-one
function from X into Y is defined. The example in Fig. 13.24 contains such a function and
a complete matching.
For a bipartite graph G = (V, E) with V partitioned as X ∪ Y, a complete matching of
X into Y requires |X| ≤ |Y|. If |X| is large, then the construction of such a matching cannot
be accomplished just by observation or trial and error. The following theorem, due to the
English mathematician Philip Hall (1935), provides a necessary and sufficient condition for
the existence of such a matching. The proof of the theorem, however, is not that given by
Hall. A constructive proof that uses the material developed on transport networks is given.
THEOREM 13.7 Let G = (V, E) be bipartite with V partitioned as X ∪ Y. A complete matching of X into
Y exists if and only if for every subset A of X, |A| ≤ |R(A)|, where R(A) is the subset of
Y consisting of those vertices each of which is adjacent to at least one vertex in A.
Before proving the theorem, we illustrate its use in the following example.
EXAMPLE 13.15 a) The bipartite graph shown in Fig. 13.25(a) has no complete matching. Any attempt
to construct such a matching must include {x1, y1} and either {x2, y3} or {x3, y3}.
If {x2, y3} is included, there is no match for x3. Likewise, if {x3, y3} is included,
we are not able to match x2. If A = {x1, x2, x3} ⊆ X, then R(A) = {y1, y3}. With
|A| = 3 > 2 = |R(A)|, it follows from Theorem 13.7 that no complete matching can
exist.
Figure 13.25 [parts (a) and (b)]
Table 13.2
A               R(A)                |A|   |R(A)|
∅               ∅                   0     0
{x1}            {y1, y2, y3}        1     3
{x2}            {y2}                1     1
{x3}            {y2, y3, y5}        1     3
{x4}            {y4, y5}            1     2
{x1, x2}        {y1, y2, y3}        2     3
{x1, x3}        {y1, y2, y3, y5}    2     4
{x1, x4}        Y                   2     5
{x2, x3}        {y2, y3, y5}        2     3
{x2, x4}        {y2, y4, y5}        2     3
{x3, x4}        {y2, y3, y4, y5}    2     4
{x1, x2, x3}    {y1, y2, y3, y5}    3     4
{x1, x2, x4}    Y                   3     5
{x1, x3, x4}    Y                   3     5
{x2, x3, x4}    {y2, y3, y4, y5}    3     4
X               Y                   4     5
b) For the graph in part (b) of the figure, consider the exhaustive listing in Table 13.2.
Assuming the validity of Theorem 13.7, this listing indicates that the graph contains a
complete matching.
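An exhaustive listing of this kind is easy to automate for small graphs. The Python sketch below is an illustration only; the function name `hall_violations` is an assumption made here. The adjacency data for the first graph are the sets R({x}) in the single-vertex rows of Table 13.2. The second graph uses just the three vertices x1, x2, x3 of part (a), with neighborhoods inferred from the discussion there (an assumption, since the figure itself is not reproduced).

```python
from itertools import combinations

# R({x}) for each single vertex, as listed in Table 13.2.
ADJ_B = {
    "x1": {"y1", "y2", "y3"},
    "x2": {"y2"},
    "x3": {"y2", "y3", "y5"},
    "x4": {"y4", "y5"},
}

# Assumed neighborhoods for x1, x2, x3 in part (a); R({x1, x2, x3}) = {y1, y3}.
ADJ_A = {"x1": {"y1"}, "x2": {"y1", "y3"}, "x3": {"y3"}}


def hall_violations(adj):
    """Return every nonempty subset A of X with |A| > |R(A)|.

    An empty result means the condition of Theorem 13.7 holds.
    """
    violations = []
    xs = sorted(adj)
    for size in range(1, len(xs) + 1):
        for subset in combinations(xs, size):
            reach = set().union(*(adj[x] for x in subset))   # R(A)
            if len(subset) > len(reach):
                violations.append((subset, reach))
    return violations


print(hall_violations(ADJ_B))
print(hall_violations(ADJ_A))
```

The check examines all 2^|X| − 1 nonempty subsets, so it is practical only when |X| is small; this is the tedium that the network-based proof avoids.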
We turn now to a proof of the theorem.
Proof: With V partitioned as X ∪ Y, let X = {x1, x2, ..., xm} and Y = {y1, y2, ..., yn}.
Construct a transport network N that extends graph G by introducing two new vertices a
(the source) and z (the sink). For each vertex xi, 1 ≤ i ≤ m, draw edge (a, xi); for each
vertex yj, 1 ≤ j ≤ n, draw edge (yj, z). Each new edge is given a capacity of 1. Let M be
any positive integer that exceeds |X|. Assign each edge in G the capacity M. The original
graph G and its associated network N appear as shown in Fig. 13.26. It follows that a
complete matching exists in G if and only if there is a maximum flow in N that uses all
edges (a, xi), 1 ≤ i ≤ m. Then the value of such a maximum flow is m = |X|.
Figure 13.26 [the graph G and its associated network N]
Assume first that |A| ≤ |R(A)| for every subset A of X. We shall prove that there is a
complete matching in G by showing that c(P, P̄) ≥ |X|
for each cut (P, P̄) in N. So if (P, P̄) is an arbitrary cut in the transport network N, let us
define A = X ∩ P and B = Y ∩ P. Then A ⊆ X where we shall write A = {x1, x2, ..., xi}
for some 0 ≤ i ≤ m. (The elements of X are relabeled, if necessary, so that the subscripts
on the elements of A are consecutive. When i = 0, A = ∅.) Now P consists of the source
a together with the vertices in A and the set B ⊆ Y, as shown in Fig. 13.27(a). (Elements
of Y are also relabeled if necessary.) In addition, P̄ = (X − A) ∪ (Y − B) ∪ {z}. If there is
an edge {x, y} with x ∈ A and y ∈ (Y − B), then the capacity of that edge is a summand in
c(P, P̄) and c(P, P̄) ≥ M > |X|. Should no such edge exist, then c(P, P̄) is determined
by the capacities of (1) the edges from the source a to the vertices in X − A and (2)
the edges from the vertices in B to the sink z. Since each of these edges has capacity
1, c(P, P̄) = |X − A| + |B| = |X| − |A| + |B|. With B ⊇ R(A), we have |B| ≥ |R(A)|,
and since |R(A)| ≥ |A|, it follows that |B| ≥ |A|. Consequently, c(P, P̄) = |X| + (|B| −
|A|) ≥ |X|. Therefore, since every cut in network N has capacity at least |X| and the cut
({a}, V − {a}) achieves a capacity of |X|, by Theorem 13.6 any maximum flow for N has
value |X|. Such a flow will result in exactly |X| edges from X to Y having flow 1, and this
flow provides a complete matching of X into Y.
Figure 13.27 [parts (a) and (b)]
Conversely, suppose that there exists a subset A of X where |A| > |R(A)|. Let (P, P̄)
be the cut shown for the network in Fig. 13.27(b), with P = {a} ∪ A ∪ R(A) and P̄ =
(X − A) ∪ (Y − R(A)) ∪ {z}. Then c(P, P̄) is determined by (1) the edges from the source
a to the vertices in X − A and (2) the edges from the vertices in R(A) to the sink z.
Hence c(P, P̄) = |X − A| + |R(A)| = |X| − (|A| − |R(A)|) < |X|, since |A| > |R(A)|.
The network has a cut of capacity less than |X|, so once again by Theorem 13.6 it follows
that any maximum flow in the network has value smaller than |X|. Therefore there is no
complete matching from X into Y for the given bipartite graph G.
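Because the proof is constructive, it can be turned into a short program. The Python sketch below is an illustration under assumptions: the function name `matching_from_flow` and the edge-dictionary representation are introduced here, the vertex names "a" and "z" are assumed not to occur in the graph, and augmenting paths are found by breadth-first search. It builds the network N of the proof (capacity 1 on the new edges, capacity M = |X| + 1 on the edges of G), computes a maximum flow, and reads the matching off the edges from X to Y that carry flow 1. The data are the teacher certifications of the Villa school district problem at the start of this section.

```python
from collections import deque


def matching_from_flow(adj):
    """Return a dict x -> y describing a largest matching of X into Y.

    adj maps each x in X to the vertices of Y adjacent to it.
    """
    big = len(adj) + 1                          # M, an integer exceeding |X|
    cap = {}
    for x, ys in adj.items():
        cap[("a", x)] = 1
        for y in ys:
            cap[(x, y)] = big
            cap[(y, "z")] = 1
    flow = {e: 0 for e in cap}
    while True:
        pred = {"a": None}                      # breadth-first labeling
        queue = deque(["a"])
        while queue and "z" not in pred:
            v = queue.popleft()
            for (u, w), c in cap.items():
                if u == v and w not in pred and flow[(u, w)] < c:
                    pred[w] = (v, (u, w), +1)   # usable forward edge
                    queue.append(w)
                elif w == v and u not in pred and flow[(u, w)] > 0:
                    pred[u] = (v, (u, w), -1)   # usable backward edge
                    queue.append(u)
        if "z" not in pred:
            break
        w = "z"
        while pred[w] is not None:
            # Each path starts with an edge (a, x) of capacity 1 and all
            # values are integers, so the tolerance of every path is 1.
            v, edge, direction = pred[w]
            flow[edge] += direction
            w = v
    return {x: y for x, ys in adj.items() for y in ys if flow[(x, y)] == 1}


TEACHERS = {
    "c1": ["s1", "s2"],          # Miss Carelli
    "c2": ["s1", "s4"],          # Mr. Ritter
    "c3": ["s5"],                # Ms. Camille
    "c4": ["s2", "s3", "s4"],    # Mrs. Lewis
}
assignment = matching_from_flow(TEACHERS)
print(assignment)
```

The matching is complete exactly when the returned dictionary has |X| entries. The particular assignment found may differ from the one listed for Fig. 13.24, since a graph can have several complete matchings.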
EXAMPLE 13.16 Five students, s1, s2, s3, s4, and s5, are members of three committees, c1, c2, and c3. The
bipartite graph shown in Fig. 13.28(a) indicates the committee memberships. Each
committee is to select a student representative to meet with the school president. Can a selection
be made in such a way that each committee has a distinct representative?
Figure 13.28 [parts (a) and (b)]
Although this problem is small enough to solve by inspection, we use the ideas developed
in Section 13.3. Figure 13.28(b) provides the network for the given bipartite graph. Here
we consider the vertices, other than the source a, ordered as c1, c2, c3, s1, s2, s3, s4, s5, z.
In Fig. 13.29(a), the Edmonds-Karp algorithm is applied for the first time and provides the
f-augmenting path p: (a, c1), (c1, s3), (s3, z) with Δ_p = 1. Applying the Ford-Fulkerson
algorithm results in the network in part (b) of the figure, and this network indicates the
edge (c1, s3) as the start for a possible complete matching. [Many edge labels are omitted
in parts (b) and (c) of the figure in order to simplify the diagrams. Every unlabeled edge
that starts at a or terminates at z should have the label 1, 0 to indicate a capacity of 1 and
a flow of 0; all other unlabeled edges should bear the label M, 0.] The next application of
these two algorithms provides the f-augmenting path (a, c2), (c2, s1), (s1, z) and the edge
(c2, s1) to extend the matching. Finally, the last application of the Edmonds-Karp and
Ford-Fulkerson algorithms gives us the f-augmenting path (a, c3), (c3, s2), (s2, z) and the final
edge, namely (c3, s2), for the complete matching. This is indicated by the maximum
flow in part (c) of Fig. 13.29.
Figure 13.29 [parts (a), (b), and (c)]
This example is a particular instance of a problem studied by Philip Hall. He considered a
collection of sets A1, A2, ..., An, where the elements a1, a2, ..., an were called a system
of distinct representatives for the collection if (a) ai ∈ Ai, for all 1 ≤ i ≤ n; and (b) ai ≠ aj,
whenever 1 ≤ i < j ≤ n. Rewording Theorem 13.7 in this context, we find that the collection
A1, A2, ..., An has a system of distinct representatives if and only if, for all 1 ≤ i ≤ n, the
union of any i of the sets A1, A2, ..., An contains at least i elements.
Although the condition in Theorem 13.7 may be very tedious to check, the following
corollary provides a sufficient condition for the existence of a complete matching.
COROLLARY 13.6 Let G = (V, E) be a bipartite graph with V partitioned as X ∪ Y. There is a complete
matching of X into Y if, for some k ∈ Z+, deg(x) ≥ k ≥ deg(y) for all vertices x ∈ X and
y ∈ Y.
Proof: This proof is left for the Section Exercises.
EXAMPLE 13.17 a) Corollary 13.6 is applicable to the graph shown in Fig. 13.28(a). Here the appropriate
value of k is 2.
b) There are 50 students (25 females and 25 males) in the senior class at Bell High School.
If each female in the class is appreciated by exactly five of the males, and each male
enjoys the company of exactly five of the females in the class, then it is possible for each
male to go to the class party with a female he likes and each female will attend with a
male who likes her. (As a result of problems of this type, the condition in Theorem 13.7
has often been referred to in the literature as Hall’s Marriage Condition.)
For problems such as the one in Example 13.15(a), where a complete matching does not
exist, the following type of matching is often of interest.
Definition 13.10 If G = (V, E) is a bipartite graph with V partitioned as X ∪ Y, a maximal matching in G
is one that matches as many vertices in X as possible with the vertices in Y.
To investigate the existence and construction of a maximal matching, the following new
idea is presented.
Definition 13.11 Let G = (V, E) be a bipartite graph, where V is partitioned as X ∪ Y. If A ⊆ X, then
δ(A) = |A| − |R(A)| is called the deficiency of A. The deficiency of graph G, denoted
δ(G), is given by δ(G) = max{δ(A) | A ⊆ X}.
For ∅ ⊆ X, we have R(∅) = ∅, so δ(∅) = 0 and δ(G) ≥ 0. If δ(G) > 0, there is a subset
A of X with |A| − |R(A)| > 0, so |A| > |R(A)| and from Theorem 13.7 we know that there
is no complete matching of X into Y.
EXAMPLE 13.18 The graph in Fig. 13.30(a) has no complete matching. [See Example 13.15(a).] For A =
{x1, x2, x3}, we find that R(A) = {y1, y3} and δ(A) = 3 − 2 = 1. As a result of this subset
A we find that δ(G) = 1. Removing one of the vertices from A (and the edges incident
with it), we obtain the subgraph shown in part (b) of the figure. This (bipartite) subgraph
contains a complete matching from X1 = {x2, x3, x4} into Y. The edges {x2, y1}, {x3, y3},
and {x4, y4} indicate one such matching that is also a maximal matching of X into Y.
Figure 13.30 [parts (a) and (b)]
The ideas developed in Example 13.18 lead to the following theorem.
THEOREM 13.8 Let G = (V, E) be bipartite with V partitioned as X ∪ Y. The maximum number of vertices
in X that can be matched with those in Y is |X| − δ(G).
Proof: We provide a constructive proof, using transport networks as in the proof of
Theorem 13.7. As in Figure 13.26, let N be the network associated with the bipartite graph
G. The result will follow when we show that (a) the capacity of every cut (P, P̄) in N is
greater than or equal to |X| − δ(G), and (b) there exists a cut with capacity |X| − δ(G).
Let (P, P̄) be a cut in N, where P is made up of the source a, the vertices in A =
P ∩ X ⊆ X, and the vertices in B = P ∩ Y ⊆ Y. [See Fig. 13.27(a).] As in the proof of
Theorem 13.7, the subsets A, B may be ∅.
1) If edge (x, y) is in N with x ∈ A and y ∈ Y − B, then c(x, y) is a summand in
c(P, P̄). Since c(x, y) = M > |X|, it follows that c(P, P̄) > |X| ≥ |X| − δ(G).
2) If no such edge as in (1) exists, then c(P, P̄) is determined by the |X − A| edges from
a to X − A and the |B| edges from B to z. Since each of these edges has capacity 1, we
find that c(P, P̄) = |X − A| + |B| = |X| − |A| + |B|. No edge connects a vertex in
A with a vertex in Y − B, so R(A) ⊆ B and |R(A)| ≤ |B|. Consequently, c(P, P̄) =
(|X| − |A|) + |B| ≥ (|X| − |A|) + |R(A)| = |X| − (|A| − |R(A)|) = |X| − δ(A) ≥
|X| − δ(G).
Therefore, in either case, c(P, P̄) ≥ |X| − δ(G) for every cut (P, P̄) in N.
To complete the proof, we must establish the existence of a cut with capacity |X| − δ(G).
Since δ(G) = max{δ(A) | A ⊆ X}, we can select a subset A of X with δ(G) = δ(A).
Examining Fig. 13.27(b), we let P = {a} ∪ A ∪ R(A). Then P̄ = (X − A) ∪ (Y − R(A)) ∪
{z}. There is no edge between the vertices in A and those in Y − R(A), so c(P, P̄) =
|X − A| + |R(A)| = |X| − (|A| − |R(A)|) = |X| − δ(A) = |X| − δ(G).
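For small graphs the statement of Theorem 13.8 can be checked by brute force. The Python sketch below is an illustration under assumptions: the function names `deficiency` and `max_matching_size` are introduced here, and the adjacency lists are only assumed to resemble Fig. 13.30(a). They agree with the facts stated in Examples 13.15(a) and 13.18, but the neighbors chosen for x4 are a guess, since the figure is not reproduced in the text.

```python
from itertools import combinations

# Assumed adjacency for a graph like Fig. 13.30(a): R({x1, x2, x3}) = {y1, y3}.
ADJ = {
    "x1": ["y1"],
    "x2": ["y1", "y3"],
    "x3": ["y3"],
    "x4": ["y4", "y5"],     # hypothetical neighbors of x4
}


def deficiency(adj):
    """delta(G) = max over subsets A of X of |A| - |R(A)| (brute force)."""
    xs = list(adj)
    best = 0                                    # the empty set contributes 0
    for size in range(1, len(xs) + 1):
        for subset in combinations(xs, size):
            reach = set().union(*(adj[x] for x in subset))   # R(A)
            best = max(best, len(subset) - len(reach))
    return best


def max_matching_size(adj):
    """Largest number of vertices of X that can be matched (exhaustive)."""
    xs = list(adj)

    def extend(i, used):
        if i == len(xs):
            return 0
        best = extend(i + 1, used)              # leave xs[i] unmatched
        for y in adj[xs[i]]:
            if y not in used:                   # match xs[i] with a free y
                best = max(best, 1 + extend(i + 1, used | {y}))
        return best

    return extend(0, frozenset())


print(deficiency(ADJ), max_matching_size(ADJ))
```

Both functions take exponential time, so they serve only as a check of the identity |X| − δ(G) on small examples; the network construction in the proof is what scales.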
We close this section with an example that deals with these concepts.
EXAMPLE 13.19 Let G = (V, E) be bipartite with V partitioned as X ∪ Y. For each x ∈ X, deg(x) ≥ 4 and,
for each y ∈ Y, deg(y) ≤ 5. If |X| ≤ 15, find an upper bound (as small as possible) for δ(G).
Let ∅ ≠ A ⊆ X and let E1 ⊆ E, where E1 = {{a, b} | a ∈ A, b ∈ R(A)}. Since deg(a) ≥
4 for all a ∈ A, |E1| ≥ 4|A|. With deg(b) ≤ 5 for all b ∈ R(A), |E1| ≤ 5|R(A)|. Hence
4|A| ≤ 5|R(A)| and δ(A) = |A| − |R(A)| ≤ |A| − (4/5)|A| = (1/5)|A|. Since A ⊆ X, we
have |A| ≤ 15, so δ(A) ≤ (1/5)(15) = 3. Consequently, δ(G) = max{δ(A) | A ⊆ X} ≤ 3, so
there exists a maximal matching M of X into Y such that |M| ≥ |X| − 3.
EXERCISES 13.4
1. For the graph shown in Fig. 13.24, if four edges are selected at random, what is the
probability that they provide a complete matching of X into Y?
2. Cathy is liked by Albert, Joseph, and Robert; Janice by Joseph and Dennis; Theresa by
Albert and Joseph; Nettie by Dennis, Joseph, and Frank; and Karen by Albert, Joseph, and
Robert. (a) Set up a bipartite graph to model the matching problem where each man is paired
with a woman he likes. (b) Draw the associated network for the graph in part (a) and determine
a maximum flow for this network. What complete matching does this determine? (c) Is there a
complete matching that pairs Janice with Dennis and Nettie with Frank? (d) Is it possible to
determine two complete matchings where each man is paired with two different women?
3. At Rydell High School the senior class is represented on six school committees by
Annemarie (A), Gary (G), Jill (J), Kenneth (K), Michael (M), Norma (N), Paul (P), and Rosemary
(R). The senior members of these committees are {A, G, J, P}, {G, J, K, R}, {A, M, N, P},
{A, G, M, N, P}, {A, G, K, N, R}, and {G, K, N, R}. (a) The student government calls a meeting
that requires the presence of exactly one senior member from each committee. Find a selection
that maximizes the number of seniors involved. (b) Before the meeting, the finances of each
committee are to be reviewed by a senior who is not on that committee. Can this be accomplished
so that six different seniors are involved in this review process? If so, how?
4. Let G = (V, E) be a bipartite graph with V partitioned as X ∪ Y, where X = {x1, x2, ..., xm}
and Y = {y1, y2, ..., yn}. How many complete matchings of X into Y are there if
a) m = 2, n = 4, and G = K_{m,n}?
b) m = 4, n = 4, and G = K_{m,n}?
c) m = 5, n = 9, and G = K_{m,n}?
d) m ≤ n and G = K_{m,n}?
5. If G = (V, E) is an undirected graph, a spanning subgraph H of G in which each vertex has
degree 1 is called a one-factor (or perfect matching) for G.
a) If G has a one-factor, prove that |V| is even.
b) Does the Petersen graph have a one-factor? (The Petersen graph was first introduced in
Example 11.19.)
c) In Fig. 13.31 we find the graph K4 in part (a), while part (b) provides the three possible
one-factors for K4. How many one-factors are there for the graph K6?
d) For n ∈ Z+, let a_n count the number of one-factors that exist for the graph K_{2n}. Find
and solve a recurrence relation for a_n.
Figure 13.31 [parts (a) and (b)]
6. Prove Corollary 13.6.
7. Fritz is in charge of assigning students to part-time jobs at the college where he works.
He has 25 student applications, and there are 25 different part-time jobs available on the
campus. Each applicant is qualified for at least four of the jobs, but each job can be performed
by at most four of the applicants. Can Fritz assign all the students to jobs for which they are
qualified? Explain.
8. For each of the following collections of sets, determine, if possible, a system of distinct
representatives. If no such system exists, explain why.
a) A1 = {2, 3, 4}, A2 = {3, 4}, A3 = {1}, A4 = {2, 3}
b) A1 = A2 = A3 = {2, 4, 5}, A4 = A5 = {1, 2, 3, 4, 5}
c) A1 = {1, 2}, A2 = {2, 3, 4}, A3 = {2, 3}, A4 = {1, 3}, A5 = {2, 4}
9. a) Determine all systems of distinct representatives for the collection of sets
A1 = {1, 2}, A2 = {2, 3}, A3 = {3, 4}, A4 = {4, 1}.
b) Given the collection of sets A1 = {1, 2}, A2 = {2, 3}, ..., An = {n, 1}, determine how many
different systems of distinct representatives exist for the collection.
10. Let A1, A2, ..., An be a collection of sets, where A1 = A2 = ··· = An and |Ai| = k > 0 for
all 1 ≤ i ≤ n. (a) Prove that the given collection has a system of distinct representatives if
and only if n ≤ k. (b) When n ≤ k, how many different systems exist for the collection?
11. Let G = (V, E) be a bipartite graph, where V is partitioned as X ∪ Y. If deg(x) ≥ 4 for all
x ∈ X and deg(y) ≤ 5 for all y ∈ Y, prove that if |X| ≤ 10 then δ(G) ≤ 2.
12. Let G = (V, E) be bipartite with V partitioned as X ∪ Y. For all x ∈ X, deg(x) ≥ 3, and for
all y ∈ Y, deg(y) ≤ 7. If |X| ≤ 50, find an upper bound (that is as small as possible) on δ(G).
13. a) Let G = (V, E) be the bipartite graph shown in Fig. 13.32, with V partitioned as X ∪ Y.
Determine δ(G) and a maximal matching of X into Y.
Figure 13.32
b) For any bipartite graph G = (V, E), with V partitioned as X ∪ Y, if β(G) denotes the
independence number of G, show that |Y| = β(G) − δ(G). (The independence number of an
undirected graph is defined in Exercise 25 for Section 11.5.)
c) Determine a largest maximal independent set of vertices for the graphs shown in
Fig. 13.30(a) and Fig. 13.32.
14. For n ≥ 2, prove that the hypercube Q_n has at least 2^(2^(n−2)) perfect matchings (as
defined above in Exercise 5).
13.5
Summary and Historical Review
This chapter has provided us with a sample of the ways in which graph theory enters into an
area of mathematics called operations research. Each topic was presented in an algorithmic
manner that can be used in the computer implementation needed for solving each type of
problem. Comparable coverage of this material can be found in Chapters 10 and 11 of the
text by C. L. Liu [22]. Chapters 4 and 5 of E. Lawler [21] offer an extensive coverage of
many other developments on networks and matching. This text provides a wide variety of
applications and includes references for additional reading.
In Section 13.1 we examined a shortest-path algorithm for weighted graphs. The full
development of the algorithm is given in the article by E. W. Dijkstra [10].
Edsger W. Dijkstra (1930-2002) Joseph B. Kruskal (1928- )
Section 13.2 provided two techniques for finding a minimal spanning tree in a weighted
loop-free connected undirected graph. These techniques were developed in the late 1950s
by J. B. Kruskal [20] and R. C. Prim [25]. Actually, however, methods for constructing
minimal spanning trees can be traced back to 1926, to the work of Otakar Borůvka dealing
with the construction of an electric power network. Even before this (1909-1911) the
anthropologist Jan Czekanowski, in his work on various classification schemes, was very
close to recognizing the minimal spanning tree problem and to providing a greedy algorithm
for its solution. The survey paper by R. L. Graham and P. Hell [16] mentions the contributions
made by Borůvka and Czekanowski and gives more information on the history and
applications of this structure.
The computer implementation of all the techniques given in the first two sections can be
found in Chapters 6 and 7 of A. V. Aho, J. E. Hopcroft, and J. D. Ullman [1]; in Chapter 8
of S. Baase and A. Van Gelder [3]; in Chapters 23 and 24 of T. H. Cormen, C. E. Leiserson,
668 Chapter 13 Optimization and Matching
R. L. Rivest, and C. Stein [7]; and in Chapter 4 of E. Horowitz and S. Sahni [17]. These
references also discuss the efficiency and speed of these algorithms. Sections 4.5-4.9 of
the text by R. K. Ahuja, T. L. Magnanti, and J. B. Orlin [2] provide more on different
implementations of Dijkstra’s algorithm, along with discussions on their features and worst-
case time-complexities. Six applications of the algorithm are described in Section 4.2 of
this text. As we mentioned at the end of Section 13.2, the articles by R. L. Graham and
P. Hell [16], by D. B. Johnson [18], and by A. Kershenbaum and R. Van Slyke [19] discuss
other implementations of Prim’s algorithm. An interesting application of the concept of the
minimal spanning tree in a physical science setting is provided in the article by D. R. Shier
[27]. Other applications are discussed in Section 13.2 of R. K. Ahuja, T. L. Magnanti, and
J. B. Orlin [2].
As we noted in Section 13.3, problems dealing with the allocation of resources or the
shipment of goods can be modeled by means of transport networks. The fundamental work
by G. B. Dantzig, L. R. Ford, and D. R. Fulkerson can be found in their pioneering articles
[8, 9, 12, 13]. The classic text by L. R. Ford and D. R. Fulkerson [14] provides excellent
coverage of this topic. In addition, the reader may wish to examine Chapter 6 of R. K.
Ahuja, T. L. Magnanti, and J. B. Orlin [2], Chapter 8 of the text by C. Berge [4], Chapter
7 of the book by R. G. Busacker and T. L. Saaty [6], or Chapter 26 of T. H. Cormen, C.
E. Leiserson, R. L. Rivest and C. Stein [7]. Chapter 10 in C. L. Liu [22] includes coverage
on an extension to networks wherein the flow in each edge is restricted by a lower as well
as an upper capacity. For more applications the reader should examine the article by D. R.
Fulkerson on pages 139-171 of [15]. Section 6.2 of R. K. Ahuja, T. L. Magnanti, and J. B.
Orlin [2] contains six additional applications.
The last topic discussed here dealt with matching in a bipartite graph. The theory behind
this was first developed by Philip Hall in 1935, but here the ideas on transport networks were
used to provide an algorithm for a solution. Chapter 7 of the text by O. Ore [24] provides a
very readable introduction to this topic, along with some applications. For more on systems
of representatives, the reader should examine Chapter 5 of the monograph by H. J. Ryser
[26]. A second method for finding a maximal matching in a bipartite graph is called the
Hungarian method. This is given in Chapter 5 of the text by J. A. Bondy and U. S. R. Murty
[5] and in Chapter 10 of the book by C. Berge [4]. In addition to its application in solving
the assignment problem, matching theory has many interesting combinatorial implications.
One may learn more about these in the survey article by L. Mirsky and H. Perfect [23].
REFERENCES
1. Aho, Alfred V., Hopcroft, John E., and Ullman, Jeffrey D. Data Structures and Algorithms.
Reading, Mass.: Addison-Wesley, 1983.
2. Ahuja, Ravindra K., Magnanti, Thomas L., and Orlin, James B. Network Flows. Englewood
Cliffs, N.J.: Prentice Hall, 1993.
3. Baase, Sara, and Van Gelder, Allen. Computer Algorithms, Introduction to Design and Analysis,
3rd ed. Reading Mass.: Addison-Wesley, 2000.
4. Berge, Claude. The Theory of Graphs and Its Applications. New York: Wiley, 1962.
5. Bondy, J. A., and Murty, U. S. R. Graph Theory with Applications. New York: Elsevier North
Holland, 1976.
6. Busacker, Robert G., and Saaty, Thomas L. Finite Graphs and Networks. New York: McGraw-
Hill, 1965.
7. Cormen, Thomas H., Leiserson, Charles E., Rivest, Ronald L., and Stein, Clifford. Introduction
to Algorithms, 2nd ed. New York: McGraw-Hill, 2001.
8. Dantzig, George B., and Fulkerson, Delbert Ray. Computation of Maximal Flows in Networks.
The RAND Corporation, P-677, 1955.
9. Dantzig, George B., and Fulkerson, Delbert Ray. On the Max Flow Min Cut Theorem. The
RAND Corporation, RM-1418-1, 1955.
10. Dijkstra, Edsger W. “A Note on Two Problems in Connexion with Graphs.” Numerische Math-
ematik 1 (1959): pp. 269-271.
11. Edmonds, Jack, and Karp, Richard M. “Theoretical Improvements in Algorithmic Efficiency
for Network Flow Problems.” J. Assoc. Comput. Mach. 19 (1972): pp. 248-264.
12. Ford, Lester R., Jr. Network Flow Theory. The RAND Corporation, P-923, 1956.
13. Ford, Lester R., Jr., and Fulkerson, Delbert Ray. “Maximal Flow Through a Network.” Canadian
Journal of Mathematics 8 (1956): pp. 399-404.
14. Ford, Lester R., Jr., and Fulkerson, Delbert Ray. Flows in Networks. Princeton, N.J.: Princeton
University Press, 1962.
15. Fulkerson, Delbert Ray, ed. Studies in Graph Theory, Part I. MAA Studies in Mathematics,
Vol. 11, The Mathematical Association of America, 1975.
16. Graham, Ronald L., and Hell, Pavol. “On the History of the Minimum Spanning Tree Problem.”
Annals of the History of Computing 7, no. 1 (January 1985): pp. 43-57.
17. Horowitz, Ellis, and Sahni, Sartaj. Fundamentals of Computer Algorithms. Potomac, Md.: Com-
puter Science Press, 1978.
18. Johnson, D. B. “Priority Queues with Update and Minimum Spanning Trees.” Information
Processing Letters 4 (1975): pp. 53-57.
19. Kershenbaum, A., and Van Slyke, R. “Computing Minimum Spanning Trees Efficiently.” Pro-
ceedings of the Annual ACM Conference, 1972, pp. 518-527.
20. Kruskal, Joseph B. “On the Shortest Spanning Subtree of a Graph and the Traveling Salesman
Problem.” Proceedings of the AMS 7, no. 1 (1956): pp. 48-50.
21. Lawler, Eugene. Combinatorial Optimization: Networks and Matroids. New York: Holt, 1976.
22. Liu, C. L. Introduction to Combinatorial Mathematics. New York: McGraw-Hill, 1968.
23. Mirsky, L., and Perfect, H. “Systems of Representatives.” Journal of Mathematical Analysis
and Applications 3 (1966): pp. 520-568.
24. Ore, Oystein. Theory of Graphs. Providence, R.I.: American Mathematical Society, 1962.
25. Prim, Robert C. “Shortest Connection Networks and Some Generalizations.” Bell System Tech-
nical Journal 36 (1957): pp. 1389-1401.
26. Ryser, Herbert J. Combinatorial Mathematics. Carus Mathematical Monographs, Number 14,
Mathematical Association of America, 1963.
27. Shier, Douglas R. “Testing for Homogeneity Using Minimum Spanning Trees.” The UMAP
Journal 3, no. 3 (1982): pp. 273-283.
SUPPLEMENTARY EXERCISES
1. Apply Dijkstra’s algorithm to the weighted directed multigraph
shown in Fig. 13.33, and find the shortest distance from
vertex a to the other seven vertices in the graph.
Figure 13.33 [weighted directed multigraph on eight vertices]
2. For her class in the analysis of algorithms, Stacy writes
the following algorithm to determine the shortest distance
from a vertex a to a vertex b in a weighted directed graph
$G = (V, E)$.
Step 1: Set Distance equal to 0, assign vertex a to the variable
$v$, and let $T = V$.
Step 2: If $v = b$, the value of Distance is the answer to the
problem. If $v \ne b$, then
1) Replace $T$ by $T - \{v\}$ and select $w \in T$ with $wt(v, w)$
minimal.
2) Set Distance equal to Distance $+\ wt(v, w)$.
3) Assign vertex $w$ to the variable $v$ and then return to
step (2).
Is Stacy’s algorithm correct? If so, prove it. If not, provide
a counterexample.
3. a) Let $G = (V, E)$ be a loop-free weighted connected undirected
graph. If $e_1 \in E$ with $wt(e_1) < wt(e)$ for all other
edges $e \in E$, prove that edge $e_1$ is part of every minimal
spanning tree for $G$.
b) With $G$ as in part (a), suppose that there are edges
$e_1, e_2 \in E$ with $wt(e_1) < wt(e_2) < wt(e)$ for all other edges
$e \in E$. Prove or disprove: Edge $e_2$ is part of every minimal
spanning tree for $G$.
4. a) Let $G = (V, E)$ be a loop-free weighted connected undirected
graph where each edge $e$ of $G$ is part of a cycle. Prove
that if $e_1 \in E$ with $wt(e_1) > wt(e)$ for all other edges $e \in E$,
then no spanning tree for $G$ that contains $e_1$ can be minimal.
b) With $G$ as in part (a), suppose that $e_1, e_2 \in E$ with
$wt(e_1) > wt(e_2) > wt(e)$ for all other edges $e \in E$. Prove or
disprove: Edge $e_2$ is not part of any minimal spanning tree
for $G$.
5. Using the concept of flow in a transport network, construct
a directed multigraph $G = (V, E)$, with $V = \{u, v, w, x, y\}$
and $id(u) = 1$, $od(u) = 3$; $id(v) = 3$, $od(v) = 3$; $id(w) = 3$,
$od(w) = 4$; $id(x) = 5$, $od(x) = 4$; and $id(y) = 4$, $od(y) = 2$.
6. A set of words {qs, tg, ut, pgr, srt} is to be transmitted using
a binary code for each letter. (a) Show that it is possible to
select one letter from each word as a system of distinct representatives
for these words. (b) If a letter is selected at random
from each of the five words, what is the probability that the
selection is a system of distinct representatives for the words?
7. For $n \in Z^+$ and for each $1 \le i \le n$, let $A_i = \{1, 2,
3, \ldots, n\} - \{i\}$. How many different systems of distinct representatives
exist for the collection $A_1, A_2, A_3, \ldots, A_n$?
8. This exercise outlines a proof of the Birkhoff-von Neumann
Theorem.
a) For $n \in Z^+$, an $n \times n$ matrix is called a permutation matrix
if there is exactly one 1 in each row and column, and all
other entries are 0. How many $5 \times 5$ permutation matrices
are there? How many $n \times n$?
b) An $n \times n$ matrix $B$ is called doubly stochastic if $b_{ij} \ge 0$
for all $1 \le i \le n$, $1 \le j \le n$, and the sum of the entries in
each row or column is 1. If
$$B = \begin{bmatrix} 0.2 & 0.1 & 0.7 \\ 0.4 & 0.5 & 0.1 \\ 0.4 & 0.4 & 0.2 \end{bmatrix},$$
verify that $B$ is doubly stochastic.
c) Find four positive real numbers $c_1, c_2, c_3$, and $c_4$, and four
permutation matrices $P_1, P_2, P_3$, and $P_4$, such that $c_1 + c_2 +
c_3 + c_4 = 1$ and $B = c_1 P_1 + c_2 P_2 + c_3 P_3 + c_4 P_4$.
d) Part (c) is a special case of the Birkhoff-von Neumann
Theorem: If $B$ is an $n \times n$ doubly stochastic matrix, then
there exist positive real numbers $c_1, c_2, \ldots, c_k$ and permutation
matrices $P_1, P_2, \ldots, P_k$ such that $\sum_{i=1}^{k} c_i = 1$
and $\sum_{i=1}^{k} c_i P_i = B$. To prove this result, proceed as follows:
Construct a bipartite graph $G = (V, E)$ with $V$ partitioned
as $X \cup Y$, where $X = \{x_1, x_2, \ldots, x_n\}$ and $Y =
\{y_1, y_2, \ldots, y_n\}$. The vertex $x_i$, for all $1 \le i \le n$, corresponds
with the $i$th row of $B$; the vertex $y_j$, for all $1 \le j \le n$,
corresponds with the $j$th column of $B$. The edges of $G$ are of
the form $\{x_i, y_j\}$ if and only if $b_{ij} > 0$. We claim that there
is a complete matching of $X$ into $Y$.
If not, there is a subset $A$ of $X$ with $|A| > |R(A)|$. That
is, there is a set of $r$ rows of $B$ having positive entries in $s$
columns and $r > s$. What is the sum of these $r$ rows of $B$?
Yet the sum of these same entries, when added column by
column, is less than or equal to $s$. (Why?) Consequently, we
have a contradiction.
As a result of the complete matching of $X$ into $Y$, there
are $n$ positive entries in $B$ that occur so that no two are in
the same row or column. (Why?) If $c_1$ is the smallest of
these entries, then we may write $B = c_1 P_1 + B_1$, where $P_1$
is an $n \times n$ permutation matrix wherein the 1’s are located
according to the positive entries in $B$ that came about from
the complete matching. What are the sums of the entries in
the rows and columns of $B_1$?
e) How is the proof completed?
9. Let $G = (V, E)$ be a bipartite graph with $V$ partitioned as
$X \cup Y$. If $E' \subseteq E$, and $E'$ determines a complete matching of
$X$ into $Y$, what property do the vertices determined by $E'$ in
the line graph $L(G)$ have? [The line graph $L(G)$ for a loop-free
undirected graph $G$ is defined in Supplementary Exercise 18 of
Chapter 11.]
PART
4
MODERN APPLIED
ALGEBRA
14
Rings and Modular
Arithmetic
In this fourth and final part of the text, the emphasis will be on structure again as we
begin the investigation of sets of elements that are closed under two binary operations.
The concepts of structure and enumeration often reinforce each other. Here we will see this
occur as ideas seen in Chapters 1, 4, 5, and 8 come to the forefront again.
When we examined the set Z in Chapter 4, it was in conjunction with the closed binary
operations of addition and multiplication. In this chapter we emphasize these operations by
writing $(Z, +, \cdot)$, instead of just Z. Patterned after some of the properties of $(Z, +, \cdot)$, the
algebraic structure called a ring will be defined. Without knowing it, we have been dealing
with rings in many mathematical settings. Now we shall be concerned with finite rings that
arise in number theory and computer science applications. Of particular interest in the study
of computer science is the hashing function, which we find provides a means of identifying
records stored in a table.
14.1
The Ring Structure:
Definition and Examples
We start by defining the ring structure, realizing as we do that most abstract definitions, like
theorems, come about from a study of many examples where one recognizes the common
idea or ideas present in what may seem to be a collection of unrelated objects.
Definition 14.1 Let R be a nonempty set on which we have two closed binary operations, denoted by $+$
and $\cdot$ (which may be quite different from the ordinary addition and multiplication to which
we are accustomed). Then $(R, +, \cdot)$ is a ring if for all $a, b, c \in R$, the following conditions
are satisfied:
a) $a + b = b + a$   (Commutative Law of $+$)
b) $a + (b + c) = (a + b) + c$   (Associative Law of $+$)
c) There exists $z \in R$ such that $a + z = z + a = a$ for every $a \in R$.   (Existence of an identity for $+$)
d) For each $a \in R$ there is an element $b \in R$ with $a + b = b + a = z$.   (Existence of inverses under $+$)
e) $a \cdot (b \cdot c) = (a \cdot b) \cdot c$   (Associative Law of $\cdot$)
f) $a \cdot (b + c) = a \cdot b + a \cdot c$ and $(b + c) \cdot a = b \cdot a + c \cdot a$   (Distributive Laws of $\cdot$ over $+$)
Since the closed binary operations of $+$ (ring addition) and $\cdot$ (ring multiplication) are
both associative, no ambiguity will arise if we write $a + b + c$ for either $(a + b) + c$ or
$a + (b + c)$, or $a \cdot b \cdot c$ for either $(a \cdot b) \cdot c$ or $a \cdot (b \cdot c)$. When dealing with the (closed)
binary operation of ring multiplication, we shall often write $ab$ for $a \cdot b$. In addition, we
can extend the associative laws (given in the definition of a ring) as we did in Exercises 8
and 9 of Section 4.2. Using mathematical induction, it can be verified that for all $r, n \in Z^+$,
with $n \ge 3$ and $1 \le r < n$,
$$(a_1 + a_2 + \cdots + a_r) + (a_{r+1} + \cdots + a_n) = a_1 + a_2 + \cdots + a_r + a_{r+1} + \cdots + a_n$$
and
$$(a_1 a_2 \cdots a_r)(a_{r+1} \cdots a_n) = a_1 a_2 \cdots a_r a_{r+1} \cdots a_n,$$
where $a_1, a_2, \ldots, a_r, a_{r+1}, \ldots, a_n$ are elements of a given ring $(R, +, \cdot)$. In a corresponding
way, the distributive laws generalize as follows:
$$a(b_1 + b_2 + \cdots + b_n) = ab_1 + ab_2 + \cdots + ab_n,$$
$$(b_1 + b_2 + \cdots + b_n)a = b_1 a + b_2 a + \cdots + b_n a,$$
for arbitrary ring elements $a, b_1, b_2, \ldots, b_n$ and all $n \in Z^+$ where $n \ge 3$.
In the next section we shall learn that the additive identity (or zero element) is unique,
as is the additive inverse of each ring element. For now, let us consider some examples of
rings.
EXAMPLE 14.1 Under the (closed) binary operations of ordinary addition and multiplication, we find that
Z, Q, R, and C are rings. In all of these rings the additive identity $z$ is the integer 0, and
the additive inverse of each number $x$ is the familiar $-x$.
EXAMPLE 14.2 Let $M_2(Z)$ denote the set of all $2 \times 2$ matrices with integer entries. [The sets $M_2(Q)$,
$M_2(R)$, and $M_2(C)$ are defined similarly.] In $M_2(Z)$ two matrices are equal if their
corresponding entries are equal in Z.
Here we define $+$ and $\cdot$ by
$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} + \begin{bmatrix} e & f \\ g & h \end{bmatrix} = \begin{bmatrix} a+e & b+f \\ c+g & d+h \end{bmatrix}, \qquad \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} e & f \\ g & h \end{bmatrix} = \begin{bmatrix} ae+bg & af+bh \\ ce+dg & cf+dh \end{bmatrix}.$$
Under these (closed) binary operations, $M_2(Z)$ is a ring. Here $z = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$ and the additive
inverse of $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is $\begin{bmatrix} -a & -b \\ -c & -d \end{bmatrix}$.
A few things happen here, however, that do not occur in the rings of Example 14.1. For
example, : j : [3 V['°
lab 0 [ i
shows that multiplication need not be commutative in a ring. That is why there are two
distributive laws. Also,
$$\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 2 & 1 \\ 2 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix},$$
even though neither $\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}$ nor $\begin{bmatrix} 2 & 1 \\ 2 & 1 \end{bmatrix}$ is the additive identity. Hence a ring may
contain what are called proper divisors of zero, that is, nonzero elements whose product
is the zero element of the ring.
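The two phenomena noted for $M_2(Z)$ can be checked mechanically. The sketch below is only an illustration: the helper `mat_mul` and the pair `P`, `Q` are assumptions chosen here (they are not the matrices printed in the text), while `S` and `T` are the zero-divisor pair from Example 14.2.

```python
def mat_mul(A, B):
    """Multiply two 2x2 integer matrices given as nested tuples."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))

ZERO = ((0, 0), (0, 0))

# Hypothetical pair chosen for illustration (not taken from the text).
P = ((1, 2), (3, 4))
Q = ((0, 1), (1, 0))

# Pair from Example 14.2 whose product is the zero matrix.
S = ((1, -1), (-1, 1))
T = ((2, 1), (2, 1))

# Multiplication in M_2(Z) need not commute.
noncommutative = mat_mul(P, Q) != mat_mul(Q, P)

# Two nonzero matrices can multiply to the zero matrix.
zero_product = mat_mul(S, T)
```

Here `noncommutative` is `True` and `zero_product` equals `ZERO`, although neither `S` nor `T` is the zero matrix.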
We extend our study of the ring structure in the following.
Definition 14.2 Let $(R, +, \cdot)$ be a ring.
a) If $ab = ba$ for all $a, b \in R$, then R is called a commutative ring.
b) The ring R is said to have no proper divisors of zero if for all $a, b \in R$, $ab = z \Rightarrow a = z$
or $b = z$.
c) If an element $u \in R$ is such that $u \ne z$ and $au = ua = a$ for all $a \in R$, we call $u$ a
unity, or multiplicative identity, of R. Here R is called a ring with unity.
It follows from part (c) of Definition 14.2 that whenever we have a ring R with unity,
then R contains at least two elements. Furthermore, if a ring has a unity we shall learn in
the next section that it is unique.
The rings in Example 14.1 are all commutative rings whose unity is the integer 1. None of
these rings has any proper divisors of zero. Meanwhile, the ring $M_2(Z)$ is a noncommutative
ring whose unity is the matrix $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$. This ring does contain proper divisors of zero.
Also, whenever we want to verify that a particular structure $(R, +, \cdot)$ is a ring, we can
start by showing that R is closed under both binary operations. Then we can continue and
verify conditions (a)-(e) of Definition 14.1. Before we try to establish the distributive laws,
however, we might want to first determine if the multiplication operation is commutative.
Should we find this operation to be commutative, then we need only establish one of the
distributive laws (for the other will follow automatically). Further, if we are able to verify
all of the preceding conditions, then we’ll know that $(R, +, \cdot)$ is not just a ring, but a
commutative ring.
Now let us study another example as we further investigate the ideas set down in
Definitions 14.1 and 14.2.
EXAMPLE 14.3 Consider the set Z together with the binary operations of $\oplus$ and $\odot$, which are defined by
$$x \oplus y = x + y - 1, \qquad x \odot y = x + y - xy.$$
Consequently, here we find, for instance, that $3 \oplus 7 = 3 + 7 - 1 = 9$ and $3 \odot 7 = 3 + 7 -
3 \cdot 7 = -11$.
Since ordinary addition, subtraction, and multiplication are closed binary operations for
Z, these new binary operations, namely $\oplus$ and $\odot$, are also closed for Z. In fact, we
shall find that $(Z, \oplus, \odot)$ is a ring.
a) In order to verify that $(Z, \oplus, \odot)$ is a ring we must establish the six conditions given in
Definition 14.1. We shall examine three of these conditions and leave the other three
for the Section Exercises.
1) First, since ordinary addition is a commutative binary operation for Z, we find that
for all $x, y \in Z$,
$$x \oplus y = x + y - 1 = y + x - 1 = y \oplus x.$$
So the binary operation $\oplus$ is also commutative for Z.
2) When we examine condition (c) we realize that we need to find an integer $z$ such
that $a \oplus z = z \oplus a = a$, for every $a$ in Z. Therefore, we must solve the equation
$a + z - 1 = a$, which leads us to $z = 1$. Hence the nonzero integer 1 is the zero
element (or additive identity) for $\oplus$.
3) What about additive inverses? At this point if we are given an (arbitrary) integer
$a$, we want to know if there is an integer $b$ such that $a \oplus b = b \oplus a = z$. From
part (2) above and the definition of $\oplus$ this says that the integer $b$ must satisfy
$a + b - 1 = 1$, and it follows that $b = 2 - a$. So, for instance, the additive inverse
of 7 is $2 - 7 = -5$ and the additive inverse for $-42$ is $2 - (-42) = 44$. After all,
in the case of 7 we find that $7 \oplus (-5) = 7 + (-5) - 1 = 7 - 5 - 1 = 1$, where 1
is the additive identity. [Note: Since we showed in part (1) that $\oplus$ is commutative,
we also know that $(-5) \oplus 7 = 1$.]
b) Furthermore, the ring $(Z, \oplus, \odot)$ also possesses the additional properties given in
Definition 14.2. In particular this ring has a unity (that is, a multiplicative identity). To
determine the unity, let $a$ be any integer and consider the element $u$ ($\ne z = 1$) where
$$a \odot u = u \odot a = a.$$
Since $a \odot u = a + u - au$, we solve $a + u - au = a$ to find that $u(1 - a) = 0$. Since
$a$ is arbitrary, this must hold even when $a \ne 1$. Consequently, the integer $u = 0$ is the
unity for the ring $(Z, \oplus, \odot)$.
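The claims of Example 14.3 can be spot-checked numerically. The following is a minimal sketch, not a proof: it tests the ring conditions only on a finite sample of integers, and the names `oplus`, `odot`, `SAMPLE`, and `check_sample` are assumptions introduced here.

```python
from itertools import product

def oplus(x, y):
    """The 'addition' of Example 14.3: x + y - 1."""
    return x + y - 1

def odot(x, y):
    """The 'multiplication' of Example 14.3: x + y - xy."""
    return x + y - x * y

SAMPLE = range(-6, 7)  # a finite sample of Z; a check, not a proof

def check_sample():
    """Assert the ring conditions on every triple drawn from SAMPLE."""
    for a, b, c in product(SAMPLE, repeat=3):
        assert oplus(a, b) == oplus(b, a)                          # commutative +
        assert oplus(a, oplus(b, c)) == oplus(oplus(a, b), c)      # associative +
        assert odot(a, odot(b, c)) == odot(odot(a, b), c)          # associative .
        assert odot(a, oplus(b, c)) == oplus(odot(a, b), odot(a, c))  # left distributive
        assert odot(oplus(b, c), a) == oplus(odot(b, a), odot(c, a))  # right distributive
    for a in SAMPLE:
        assert oplus(a, 1) == a             # 1 is the zero element
        assert oplus(a, 2 - a) == 1         # 2 - a is the additive inverse of a
        assert odot(a, 0) == a == odot(0, a)  # 0 is the unity
    return True
```

Calling `check_sample()` returns `True` without raising, which is consistent with $z = 1$, additive inverse $2 - a$, and unity $u = 0$ as derived in the example.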
After these examples of infinite rings, we turn now to rings with finitely many elements.
EXAMPLE 14.4 Let $U = \{1, 2\}$ and $R = \mathcal{P}(U)$. Define $+$ and $\cdot$ on the elements of R by
$A + B = A \,\triangle\, B = \{x \mid x \in A \text{ or } x \in B, \text{ but not both}\}$
$A \cdot B = A \cap B$ = the intersection of sets $A, B \subseteq U$.
We form Tables 14.1(a) and (b) for these operations.
From results in Chapter 3, one finds that R satisfies conditions (a), (b), (e), and (f) of
Definition 14.1 for these (closed) binary operations of “addition” and “multiplication.” The
table for “addition” shows that $\emptyset$ is the additive identity. For each $x \in R$, the additive inverse
of $x$ is $x$ itself. The multiplication table is symmetric about the diagonal from the upper left
to the lower right, so the operation described by the table is commutative. The table also
indicates that R has unity $U$. So R is a finite commutative ring with unity. The elements
{1}, {2} provide an example of proper divisors of zero.
Table 14.1
(a)
+ (Δ) | ∅    {1}  {2}  U
∅     | ∅    {1}  {2}  U
{1}   | {1}  ∅    U    {2}
{2}   | {2}  U    ∅    {1}
U     | U    {2}  {1}  ∅
(b)
· (∩) | ∅    {1}  {2}  U
∅     | ∅    ∅    ∅    ∅
{1}   | ∅    {1}  ∅    {1}
{2}   | ∅    ∅    {2}  {2}
U     | ∅    {1}  {2}  U
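Since R has only four elements, the conditions of Definition 14.1 can be verified exhaustively for Example 14.4. The sketch below is one way to do this in Python, using `frozenset` for the subsets of $U$; the function names are assumptions made for the illustration.

```python
from itertools import combinations, product

U = frozenset({1, 2})
# All subsets of U, i.e. the power set R = P(U).
R = [frozenset(c) for r in range(len(U) + 1) for c in combinations(sorted(U), r)]
EMPTY = frozenset()

def add(A, B):
    """Ring 'addition': symmetric difference."""
    return A ^ B

def mul(A, B):
    """Ring 'multiplication': intersection."""
    return A & B

def ring_axioms_hold():
    """Check the conditions of Definition 14.1 on every triple from R."""
    for A, B, C in product(R, repeat=3):
        if add(A, B) != add(B, A):
            return False
        if add(A, add(B, C)) != add(add(A, B), C):
            return False
        if mul(A, mul(B, C)) != mul(mul(A, B), C):
            return False
        if mul(A, add(B, C)) != add(mul(A, B), mul(A, C)):
            return False
        if mul(add(B, C), A) != add(mul(B, A), mul(C, A)):
            return False
    # The empty set is the zero; every element is its own additive inverse.
    return all(add(A, EMPTY) == A and add(A, A) == EMPTY for A in R)

unity_is_U = all(mul(A, U) == A == mul(U, A) for A in R)
zero_divisors = mul(frozenset({1}), frozenset({2})) == EMPTY
```

With these definitions `ring_axioms_hold()`, `unity_is_U`, and `zero_divisors` are all `True`, matching the conclusions read off from Table 14.1.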
EXAMPLE 14.5 For R = {a, b, c, d, e} we define $+$ and $\cdot$ by Tables 14.2(a) and (b).
Table 14.2
(a)
+ | a  b  c  d  e
a | a  b  c  d  e
b | b  c  d  e  a
c | c  d  e  a  b
d | d  e  a  b  c
e | e  a  b  c  d
(b)
· | a  b  c  d  e
a | a  a  a  a  a
b | a  b  c  d  e
c | a  c  e  b  d
d | a  d  b  e  c
e | a  e  d  c  b
Although we do not verify them here, the 125 equalities needed to establish each of
the associative laws and the distributive laws all hold, so $(R, +, \cdot)$ turns out to be a finite
commutative ring with unity, and it has no proper divisors of zero. The element a is the zero
(that is, the additive identity) of R, whereas b is the unity. Here every nonzero element $x$ has a
multiplicative inverse $y$, where $xy = yx = b$, the unity. Elements c and d are multiplicative
inverses of each other; b is its own inverse, as is e.
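The 125 equalities mentioned above (one for each ordered triple of elements) are tedious by hand but easy to check by machine. The following sketch encodes Tables 14.2(a) and (b) as strings, one per row; the names `ADD_ROWS`, `MUL_ROWS`, and `count_verified_equalities` are assumptions made for this illustration.

```python
from itertools import product

ELEMS = "abcde"
# Row x of each table, read left to right in the column order a, b, c, d, e.
ADD_ROWS = {"a": "abcde", "b": "bcdea", "c": "cdeab", "d": "deabc", "e": "eabcd"}
MUL_ROWS = {"a": "aaaaa", "b": "abcde", "c": "acebd", "d": "adbec", "e": "aedcb"}

def add(x, y):
    return ADD_ROWS[x][ELEMS.index(y)]

def mul(x, y):
    return MUL_ROWS[x][ELEMS.index(y)]

def count_verified_equalities():
    """Count, for each law, how many of the 5**3 = 125 triples satisfy it."""
    counts = {"assoc_add": 0, "assoc_mul": 0, "left_dist": 0, "right_dist": 0}
    for x, y, z in product(ELEMS, repeat=3):
        counts["assoc_add"] += add(x, add(y, z)) == add(add(x, y), z)
        counts["assoc_mul"] += mul(x, mul(y, z)) == mul(mul(x, y), z)
        counts["left_dist"] += mul(x, add(y, z)) == add(mul(x, y), mul(x, z))
        counts["right_dist"] += mul(add(y, z), x) == add(mul(y, x), mul(z, x))
    return counts

# Multiplicative inverse of each nonzero element (b is the unity).
inverses = {x: y for x in "bcde" for y in "bcde" if mul(x, y) == "b"}
```

Each count returned by `count_verified_equalities()` is 125, and `inverses` pairs c with d and leaves b and e as their own inverses, as stated in the example.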
We now consider the concept of a multiplicative inverse for a ring element in general.
Definition 14.3 Let R be a ring with unity $u$. If $a \in R$ and there exists $b \in R$ such that $ab = ba = u$, then
b is called a multiplicative inverse of a and a is called a unit of R. (The element b is also a
unit of R.)
In Section 14.2 we shall see that if a ring element does have a multiplicative inverse,
then it has only one such inverse. In the meantime, we’ll examine two special kinds of ring
structures.
Definition 14.4 Let R be a commutative ring with unity. Then
a) R is called an integral domain if R has no proper divisors of zero.
b) R is called a field if every nonzero element of R is a unit.
The ring $(Z, +, \cdot)$ is an integral domain but not a field, while Q, R, C, under ordinary
addition and multiplication, are both integral domains and fields. The ring in Example 14.5
is both an integral domain and a field.
It follows from part (c) of Definition 14.2 that if R is an integral domain or a field, then
$|R| \ge 2$.
EXAMPLE 14.6 For our last ring of this section we let R = {s, t, v, w, x, y}, where $+$ and $\cdot$ are given by
Tables 14.3(a) and (b).
Table 14.3
(a)
+ | s  t  v  w  x  y
s | s  t  v  w  x  y
t | t  v  w  x  y  s
v | v  w  x  y  s  t
w | w  x  y  s  t  v
x | x  y  s  t  v  w
y | y  s  t  v  w  x
(b)
· | s  t  v  w  x  y
s | s  s  s  s  s  s
t | s  t  v  w  x  y
v | s  v  x  s  v  x
w | s  w  s  w  s  w
x | s  x  v  s  x  v
y | s  y  x  w  v  t
From these tables we see that $(R, +, \cdot)$ is a commutative ring with unity, but it is neither
an integral domain nor a field. The element t is the unity, and t and y are the units of R.
We also note that $vv = vy$, and even though v is not the zero element of R, we cannot
cancel and say that $v = y$. So a general ring does not satisfy the cancellation law of
multiplication that we may sometimes take for granted. We shall look at this idea again in the
next section.
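The observations about units, zero divisors, and failed cancellation in Example 14.6 can be read off Table 14.3(b) by a short search. This is a sketch under the assumption that the table is encoded as below, one string per row; `MUL_ROWS` and `mul` are names introduced here.

```python
ELEMS = "stvwxy"
# Row a of Table 14.3(b), in the column order s, t, v, w, x, y.
MUL_ROWS = {
    "s": "ssssss",
    "t": "stvwxy",
    "v": "svxsvx",
    "w": "swswsw",
    "x": "sxvsxv",
    "y": "syxwvt",
}

def mul(a, b):
    return MUL_ROWS[a][ELEMS.index(b)]

ZERO, UNITY = "s", "t"

# Units: elements having a two-sided multiplicative inverse.
units = [a for a in ELEMS
         if any(mul(a, b) == UNITY == mul(b, a) for b in ELEMS)]

# Proper divisors of zero: nonzero pairs whose product is the zero element.
zero_divisor_pairs = [(a, b) for a in ELEMS for b in ELEMS
                      if a != ZERO and b != ZERO and mul(a, b) == ZERO]

# Cancellation fails: v*v equals v*y although v differs from y.
cancellation_fails = mul("v", "v") == mul("v", "y") and "v" != "y"
```

The search gives `units == ['t', 'y']`, finds zero-divisor pairs such as `('v', 'w')`, and confirms that $vv = vy$ with $v \ne y$.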
EXERCISES 14.1
1. Find the additive inverse for each element in the rings of
Examples 14.5 and 14.6.
2. Determine whether or not each of the following sets of
numbers is a ring under ordinary addition and multiplication.
a) R = the set of positive integers and zero
b) $R = \{kn \mid n \in Z,\ k \text{ a fixed positive integer}\}$
c) $R = \{a + b\sqrt{2} \mid a, b \in Z\}$
d) $R = \{a + b\sqrt{2} + c\sqrt{3} \mid a \in Z,\ b, c \in Q\}$
3. Let $(R, +, \cdot)$ be a ring with $a, b, c, d$ elements of R. State
the conditions (from the definition of a ring) that are needed to
prove each of the following results.
a) $(a + b) + c = b + (c + a)$
b) $d + a(b + c) = ab + (d + ac)$
c) $c(d + b) + ab = (a + c)b + cd$
d) $a(bc) + (ab)d = (ab)(d + c)$
4. For the set R in Example 14.4, keep $A \cdot B = A \cap B$, but
define $A + B = A \cup B$. Is $(R, \cup, \cap)$ a ring?
5. Consider the set Z together with the binary operations $\oplus$
and $\odot$ given in Example 14.3. (a) Verify the associative laws
for $\oplus$ and $\odot$ and the distributive laws in order to complete the
work started in part (a) of Example 14.3. [This now establishes
that $(Z, \oplus, \odot)$ is a ring.] (b) Is this ring commutative? (c) In
part (b) of Example 14.3 we showed that 0 is the unity for
$(Z, \oplus, \odot)$. What are the units for this ring? (d) Is this ring an
integral domain? a field?
6. Define the binary operations $\oplus$ and $\odot$ on Z by $x \oplus y =
x + y - 7$, $x \odot y = x + y - 3xy$, for all $x, y \in Z$. Explain why
$(Z, \oplus, \odot)$ is not a ring.
7. Let $k, m$ be fixed integers. Find all values for $k, m$ for
which $(Z, \oplus, \odot)$ is a ring under the binary operations $x \oplus y =
x + y - k$, $x \odot y = x + y - mxy$, where $x, y \in Z$.
8. Tables 14.4(a) and (b) make $(R, +, \cdot)$ into a ring, where
R = {s, t, x, y}. (a) What is the zero for this ring? (b) What is
the additive inverse of each element? (c) What is $t(s + xy)$?
(d) Is R a commutative ring? (e) Does R have a unity? (f) Find
a pair of zero divisors.
Table 14.4
(a)
+ | s  t  x  y
s | y  x  s  t
t | x  y  t  s
x | s  t  x  y
y | t  s  y  x
(b)
· | s  t  x  y
s | y  y  x  x
t | y  y  x  x
x | x  x  x  x
y | x  x  x  x
9. Define addition and multiplication, denoted by $\oplus$ and $\odot$,
respectively, on the set Q as follows. For $a, b \in Q$, $a \oplus b =
a + b + 7$, $a \odot b = a + b + (ab/7)$. (a) Prove that $(Q, \oplus, \odot)$
is a ring. (b) Is this ring commutative? (c) Does the ring have a
unity? What about units? (d) Is this ring an integral domain? a
field?
10. Let $(Q, \oplus, \odot)$ denote the field where $\oplus$ and $\odot$ are
defined by
$$a \oplus b = a + b - k, \qquad a \odot b = a + b + (ab/m),$$
for fixed elements $k, m$ ($\ne 0$) of Q. Determine the value for $k$
and the value for $m$ in each of the following.
a) The zero element for the field is 3.
b) The additive inverse of the element 6 is $-9$.
c) The multiplicative inverse of 2 is 1/8.
11. Let $R = \{a + bi \mid a, b \in Z,\ i^2 = -1\}$, with addition and
multiplication defined by $(a + bi) + (c + di) = (a + c) +
(b + d)i$ and $(a + bi)(c + di) = (ac - bd) + (bc + ad)i$,
respectively. (a) Verify that R is an integral domain. (b) Determine
all units in R.
12. a) Determine the multiplicative inverse of the matrix
[matrix illegible in source] in the ring $M_2(Z)$; that is, find $a, b, c, d$ so that
[matrix equation illegible in source].
b) Show that [matrix illegible in source] is a unit in the ring $M_2(Q)$ but not a
unit in $M_2(Z)$.
13. If $\begin{bmatrix} a & b \\ c & d \end{bmatrix} \in M_2(R)$, prove that $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is a unit of this
ring if and only if $ad - bc \ne 0$.
14. Give an example of a ring with eight elements. How about
one with 16 elements? Generalize.
15. For R = {s, t, x, y}, define $+$ and $\cdot$, making R into a ring,
by Table 14.5(a) for $+$ and by the partial table for $\cdot$ in Table
14.5(b).
Table 14.5
(a)
+ | s  t  x  y
s | s  t  x  y
t | t  s  y  x
x | x  y  s  t
y | y  x  t  s
(b)
· | s  t  x  y
s | s  s  s  s
t | s  t  ?  ?
x | s  t  ?  y
y | s  ?  s  ?
a) Using the associative and distributive laws, determine
the entries for the missing spaces in the multiplication table.
b) Is this ring commutative?
c) Does it have a unity? How about units?
d) Is the ring an integral domain or a field?
14.2
Ring Properties and Substructures
In each ring of Section 14.1 we were concerned with the zero element of the ring and the
additive inverse of each ring element. It is time now to show, along with other properties,
that these elements are truly unique.
THEOREM 14.1 In any ring $(R, +, \cdot)$,
a) the zero element z is unique, and
b) the additive inverse of each ring element is unique.
Proof:
a) If R has more than one additive identity, let $z_1, z_2$ denote two such elements. Then
$$z_1 = z_1 + z_2 = z_2,$$
where the first equality holds since $z_2$ is an additive identity and the second holds since $z_1$ is an
additive identity.
b) For $a \in R$, suppose there are two elements $b, c \in R$ where $a + b = b + a = z$ and
$a + c = c + a = z$. Then $b = b + z = b + (a + c) = (b + a) + c = z + c = c$. (The
reader should supply the condition that establishes each equality.)
As a result of the uniqueness in part (b), from this point on we shall denote the additive
inverse of $a \in R$ by $-a$. Further, we may now speak of subtraction in the ring, where we
understand that $a - b = a + (-b)$.
From Theorem 14.1(b) we also obtain the following for any ring R.
THEOREM 14.2 The Cancellation Laws of Addition. For all $a, b, c \in R$,
a) $a + b = a + c \Rightarrow b = c$, and
b) $b + a = c + a \Rightarrow b = c$.
Proof:
a) Since $a \in R$, it follows that $-a \in R$ and we have
$$a + b = a + c \Rightarrow (-a) + (a + b) = (-a) + (a + c)$$
$$\Rightarrow [(-a) + a] + b = [(-a) + a] + c$$
$$\Rightarrow z + b = z + c \Rightarrow b = c.$$
b) We leave this similar proof for the reader.
Note that when we examine the addition table for a finite ring we find that each element
of the ring occurs exactly one time in each row and column of the table. This is a direct
consequence of Theorem 14.2 — where part (a) handles the rows and part (b) the columns.
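The remark about addition tables can be checked mechanically on a small case. The following Python sketch is only an illustration, not a proof: it assumes the set {0, 1, ..., 5} under addition modulo 6 as a sample finite ring, and the helper names `addition_table` and `is_latin_square` are hypothetical names chosen for this example.

```python
# Sketch: in the addition table of a finite ring, each element appears
# exactly once in every row and every column (a consequence of Theorem 14.2).
# Assumption: {0, ..., n-1} with addition mod n serves as the sample ring.

def addition_table(n):
    """Return the n x n addition table for {0, ..., n-1} under + mod n."""
    return [[(a + b) % n for b in range(n)] for a in range(n)]

def is_latin_square(table):
    """True if every row and every column is a permutation of 0..n-1."""
    n = len(table)
    elements = set(range(n))
    rows_ok = all(set(row) == elements for row in table)
    cols_ok = all({table[r][c] for r in range(n)} == elements
                  for c in range(n))
    return rows_ok and cols_ok

table = addition_table(6)
print(is_latin_square(table))
```

A table with a repeated entry in some row, such as `[[0, 0], [1, 1]]`, fails the same check, which is what cancellation rules out.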
THEOREM 14.3 For any ring $(R, +, \cdot)$ and any $a \in R$, we have $az = za = z$.
Proof: If $a \in R$, then $az = a(z + z)$ because $z + z = z$. Hence $z + az = az = az + az$. (Why?) Using the cancellation law of addition, we have $z = az$.
The proof that $za = z$ is done similarly.
The reader may feel that the result of Theorem 14.3 is obvious. But we are not dealing with just $\mathbf{Z}$ or $\mathbf{Q}$ or $M_2(\mathbf{Z})$. Our objective is to show that any ring satisfies such a result, and to get the result we may only use the conditions in the definition of a ring and whatever properties we've derived for arbitrary rings up to this point.
The uniqueness of additive inverses [from part (b) of Theorem 14.1] now implies the
following result.
THEOREM 14.4 Given a ring $(R, +, \cdot)$, for all $a, b \in R$,
a) $-(-a) = a$,
b) $a(-b) = (-a)b = -(ab)$, and
c) $(-a)(-b) = ab$.
Proof:
a) By the convention stated after Theorem 14.1, $-(-a)$ denotes the additive inverse of $-a$. Since $(-a) + a = z$, $a$ is also an additive inverse for $-a$. Consequently, by the uniqueness of such inverses, $-(-a) = a$.
b) We shall prove that $a(-b) = -(ab)$ and shall leave the other part for the reader. We know that $-(ab)$ denotes the additive inverse of $ab$. However, $ab + a(-b) = a[b + (-b)] = az = z$, by Theorem 14.3, so by the uniqueness of additive inverses, $a(-b) = -(ab)$.
c) Here we establish an idea we have used in algebra since our first encounter with signed numbers. “Minus times minus does indeed equal plus,” and the proof follows from the properties and definition of a ring. From part (b) we have $(-a)(-b) = -[a(-b)] = -[-(ab)]$, and the result then follows from part (a).
For the operation of multiplication one also finds the following, which is comparable to
Theorem 14.1.
THEOREM 14.5 For a ring $(R, +, \cdot)$,
a) if $R$ has a unity, then it is unique, and
b) if $R$ has a unity, and $x$ is a unit of $R$, then the multiplicative inverse of $x$ is unique.
Proof: The proofs of these results are left to the reader.
As a result of this theorem, when $(R, +, \cdot)$ is a ring with unity, we shall denote the unity by $u$. Furthermore, in such a ring the multiplicative inverse of each unit $x$ will be denoted by $x^{-1}$. Also, one may now restate the definition of a field as a commutative ring $F$ with unity, such that for all $x \in F$, $x \neq z \Rightarrow x^{-1} \in F$.
With this notion to assist us, we examine some further properties and relations between
fields and integral domains.
THEOREM 14.6 Let $(R, +, \cdot)$ be a commutative ring with unity. Then $R$ is an integral domain if and only if, for all $a, b, c \in R$ where $a \neq z$, $ab = ac \Rightarrow b = c$. (Hence, a commutative ring with unity that satisfies the cancellation law of multiplication is an integral domain.)
Proof: If $R$ is an integral domain and $x, y \in R$, then $xy = z \Rightarrow x = z$ or $y = z$. Now if $ab = ac$, then $ab - ac = a(b - c) = z$, and because $a \neq z$, it follows that $b - c = z$, that is, $b = c$. Conversely, if $R$ is commutative with unity and $R$ satisfies multiplicative cancellation, then let $a, b \in R$ with $ab = z$. If $a = z$, we are finished. If not, as $az = z$, we can write $ab = az$ and conclude that $b = z$. So there are no proper divisors of zero and $R$ is an integral domain.
Before going on, let us realize that the cancellation law of multiplication does not imply the existence of multiplicative inverses. The integral domain $(\mathbf{Z}, +, \cdot)$ satisfies multiplicative cancellation, but it contains only two elements (namely, 1 and $-1$) that are units. Hence, an integral domain need not be a field. But what about a field? Is it necessarily an integral domain?
THEOREM 14.7 If $(F, +, \cdot)$ is a field, then it is an integral domain.
Proof: Let $a, b \in F$ with $ab = z$. If $a = z$, we are finished. If not, $a$ has a multiplicative inverse $a^{-1}$ because $F$ is a field. Then
$ab = z \Rightarrow a^{-1}(ab) = a^{-1}z \Rightarrow (a^{-1}a)b = a^{-1}z \Rightarrow ub = z \Rightarrow b = z$.
Hence $F$ has no proper divisors of zero and is an integral domain.
In Chapter 5 we found that functions $f: A \to A$ could be one-to-one (or onto) without
being onto (or one-to-one). However, if A were finite, such a function f was one-to-one if
and only if it was onto. (See Theorem 5.11.) The same situation occurs with finite integral
domains. An integral domain need not be a field, but when it is finite we find that this
structure is a field.
THEOREM 14.8 A finite integral domain $(D, +, \cdot)$ is a field.
Proof: Since $D$ is finite, we may list the elements of $D$ as $\{d_1, d_2, \ldots, d_n\}$. For $d \in D$, where $d \neq z$, we have $dD = \{dd_1, dd_2, \ldots, dd_n\} \subseteq D$ because $D$ is closed under multiplication. Now $|D| = n$ and $dD \subseteq D$, so if we could show that $dD$ contains $n$ elements, we would have $dD = D$. If $|dD| < n$, then $dd_i = dd_j$ for some $1 \leq i < j \leq n$. But since $D$ is an integral domain and $d \neq z$, we have $d_i = d_j$, when they are supposed to be distinct. So $dD = D$ and for some $1 \leq k \leq n$, $dd_k = u$, the unity of $D$. Then $dd_k = u \Rightarrow d$ is a unit of $D$, and since $d$ was chosen arbitrarily, it follows that $(D, +, \cdot)$ is a field.
From the proof of Theorem 14.8 we also realize that when we are dealing with the non-
zero elements of a finite field, the multiplication table for these elements is such that each
element of the field occurs exactly once in each of the rows and columns.
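The counting argument in the proof of Theorem 14.8 can be watched in action on a small case. The sketch below assumes the integers modulo 7 as a sample finite integral domain (that ring is only constructed in Section 14.3, so its use here is an assumption made for illustration), and `multiples` is a hypothetical helper name.

```python
# Sketch of the argument in the proof of Theorem 14.8.
# Assumption: {0, ..., 6} with arithmetic mod 7 is the finite integral
# domain D; for each nonzero d we list dD and locate the unity 1 in it.

def multiples(d, n):
    """Return [d*x mod n for x = 0, ..., n-1], i.e. the elements of dD."""
    return [(d * x) % n for x in range(n)]

n = 7
for d in range(1, n):
    dD = multiples(d, n)
    # dD has n distinct elements, so dD = D and the unity 1 must occur.
    assert len(set(dD)) == n
    inverse = dD.index(1)      # the x with d*x = 1 in D
    print(d, "has multiplicative inverse", inverse)
```

For a modulus that is not prime the argument breaks down: `multiples(2, 6)` contains only three distinct values, so 1 never appears and 2 has no inverse.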
In the next section we shall look at finite fields that are useful in discrete mathematics.
Before closing this section, however, let us examine some special subsets of a ring.
When we were dealing with finite state machines in Chapter 6, we saw instances where
subsets of the set of internal states gave rise to machines on their own (when the next state
and output functions of the original machines were suitably restricted). These were called
submachines. Since closed binary operations are special kinds of functions, we encounter
a similar idea in the following definition.
Definition 14.5 For a ring $(R, +, \cdot)$, a nonempty subset $S$ of $R$ is called a subring of $R$ if $(S, +, \cdot)$, that is, $S$ under the addition and multiplication of $R$, restricted to $S$, is a ring.
EXAMPLE 14.7 For every ring $R$, the subsets $\{z\}$ and $R$ are always subrings of $R$.
EXAMPLE 14.8
a) The set of all even integers is a subring of $(\mathbf{Z}, +, \cdot)$. In fact, for each $n \in \mathbf{Z}^+$, $n\mathbf{Z} = \{nx \mid x \in \mathbf{Z}\}$ is a subring of $(\mathbf{Z}, +, \cdot)$.
b) $(\mathbf{Z}, +, \cdot)$ is a subring of $(\mathbf{Q}, +, \cdot)$, which is a subring of $(\mathbf{R}, +, \cdot)$, which is a subring of $(\mathbf{C}, +, \cdot)$.
EXAMPLE 14.9 In Example 14.6, the subsets $S = \{s, w\}$ and $T = \{s, v, x\}$ are subrings of $R$.
The next result characterizes those subsets of a ring that are subrings.
THEOREM 14.9 Given a ring $(R, +, \cdot)$, a nonempty subset $S$ of $R$ is a subring of $R$ if and only if
1) for all $a, b \in S$, we have $a + b, ab \in S$ (that is, $S$ is closed under the binary operations of addition and multiplication defined on $R$), and
2) for all $a \in S$, we have $-a \in S$.
Proof: If $(S, +, \cdot)$ is a subring of $R$, then in its own right it satisfies all the conditions of a ring. Hence it satisfies conditions 1 and 2 of the theorem. Conversely, let $S$ be a nonempty subset of $R$ that satisfies conditions 1 and 2. Conditions (a), (b), (e), and (f) of the definition of a ring are inherited by the elements of $S$, because they are also elements of $R$. Thus, all we need to verify here is that $S$ has an additive identity. Now $S \neq \emptyset$, so there is an element $a \in S$, and by condition 2, $-a \in S$. Then by condition 1, $z = a + (-a) \in S$.
EXAMPLE 14.10 Consider the ring $(\mathbf{Z}, \oplus, \odot)$ that we examined in Example 14.3 and Exercise 5 of Section 14.1. Here we have $x \oplus y = x + y - 1$ and $x \odot y = x + y - xy$. Now consider the subset $S = \{\ldots, -5, -3, -1, 1, 3, 5, \ldots\}$ of all odd integers. Since, for example, 3 and 5 are in $S$ but the ordinary sum $3 + 5 = 8 \notin S$, this set $S$ is not a subring of $(\mathbf{Z}, +, \cdot)$. However, $3 \oplus 5 = 3 + 5 - 1 = 7 \in S$. In fact, for all $a, b \in S$ we have $a \oplus b = a + b - 1$, where $a + b$ is even, and $a + b - 1$ is odd, so $a \oplus b \in S$. Also, $a \odot b = a + b - ab$, where $a + b$ is even and $ab$ is odd, so $a \odot b \in S$. Finally, $-a$ [the additive inverse of $a$ in the ring $(\mathbf{Z}, \oplus, \odot)$] is equal to $2 - a$, which is odd whenever $a$ is odd. Consequently, if $a \in S$ then $-a \in S$, and it follows from Theorem 14.9 that $S$ is a subring of $(\mathbf{Z}, \oplus, \odot)$.
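The closure claims in Example 14.10 can be spot-checked on a finite window of odd integers. This Python sketch is an illustration only (a finite check does not replace the parity argument above); the function names `oplus`, `odot`, and `neg` are hypothetical, and the zero element of this ring is taken to be 1, since $x \oplus 1 = x$.

```python
# Sketch: spot-check the two conditions of Theorem 14.9 for the odd
# integers inside the ring (Z, (+), (.)) of Example 14.10.

def oplus(x, y):
    """Ring addition: x (+) y = x + y - 1."""
    return x + y - 1

def odot(x, y):
    """Ring multiplication: x (.) y = x + y - xy."""
    return x + y - x * y

def neg(x):
    """Additive inverse in this ring: 2 - x, since x (+) (2 - x) = 1."""
    return 2 - x

def is_odd(x):
    return x % 2 != 0

odds = range(-9, 10, 2)          # -9, -7, ..., 7, 9
closed = all(is_odd(oplus(a, b)) and is_odd(odot(a, b))
             for a in odds for b in odds)
inverses_ok = all(is_odd(neg(a)) and oplus(a, neg(a)) == 1 for a in odds)
print(closed, inverses_ok)
```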
Note that $(\mathbf{Z}^+, +, \cdot)$ satisfies condition 1 in Theorem 14.9, but not condition 2, so it is not a subring of $(\mathbf{Z}, +, \cdot)$.
The result in Theorem 14.9 can also be given as follows.
THEOREM 14.10 For any ring $(R, +, \cdot)$, if $\emptyset \neq S \subseteq R$,
a) then $(S, +, \cdot)$ is a subring of $R$ if and only if for all $a, b \in S$, we have $a - b \in S$ and $ab \in S$;
b) and if $S$ is finite, then $(S, +, \cdot)$ is a subring of $R$ if and only if for all $a, b \in S$, we have $a + b, ab \in S$. (Once again, additional help comes from a finiteness condition.)
Proof: These proofs we leave for the reader.
The next example demonstrates how one might use the first part of the preceding theorem.
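EXAMPLE 14.11 Let us consider the ring $R = M_2(\mathbf{Z})$ and the subset
$$S = \left\{ \begin{bmatrix} x & x+y \\ x+y & x \end{bmatrix} \;\middle|\; x, y \in \mathbf{Z} \right\}$$
of $R$. When $x = y = 0$ it follows that $\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \in S$, and $S \neq \emptyset$. So now we examine any two elements of $S$, namely, two matrices of the form
$$\begin{bmatrix} x & x+y \\ x+y & x \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} v & v+w \\ v+w & v \end{bmatrix},$$
where $x, y, v, w \in \mathbf{Z}$. We find that
$$\begin{bmatrix} x & x+y \\ x+y & x \end{bmatrix} - \begin{bmatrix} v & v+w \\ v+w & v \end{bmatrix} = \begin{bmatrix} x-v & (x-v)+(y-w) \\ (x-v)+(y-w) & x-v \end{bmatrix},$$
so $S$ is closed under subtraction. Turning to multiplication we have
$$\begin{aligned}
\begin{bmatrix} x & x+y \\ x+y & x \end{bmatrix} \begin{bmatrix} v & v+w \\ v+w & v \end{bmatrix}
&= \begin{bmatrix} xv + (x+y)(v+w) & x(v+w) + (x+y)v \\ (x+y)v + x(v+w) & (x+y)(v+w) + xv \end{bmatrix} \\
&= \begin{bmatrix} xv + xv + yv + xw + yw & xv + xw + xv + yv \\ xv + yv + xv + xw & xv + yv + xw + yw + xv \end{bmatrix} \\
&= \begin{bmatrix} xv + xv + yv + xw + yw & (xv + xv + yv + xw + yw) + (-yw) \\ (xv + xv + yv + xw + yw) + (-yw) & xv + xv + yv + xw + yw \end{bmatrix},
\end{aligned}$$
so $S$ is also closed under multiplication.
Appealing now to part (a) of Theorem 14.10, one finds that $S$ is a subring of $R$.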
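As a hedged numerical companion to Example 14.11, the following Python sketch tests closure under subtraction and multiplication for all parameter values in a small range. Matrices are represented as plain nested lists, and the helper names `make`, `in_S`, `sub`, and `mul` are hypothetical; a finite test illustrates the algebra above but is not a substitute for it.

```python
# Sketch: check Theorem 14.10(a) for matrices [[x, x+y], [x+y, x]].

def make(x, y):
    """Build the element of S determined by the integers x and y."""
    return [[x, x + y], [x + y, x]]

def in_S(m):
    """A 2x2 integer matrix lies in S iff its diagonal entries agree and
    its off-diagonal entries agree (then x = m[0][0], y = m[0][1] - x)."""
    return m[0][0] == m[1][1] and m[0][1] == m[1][0]

def sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

vals = range(-3, 4)
pairs = [(make(x, y), make(v, w))
         for x in vals for y in vals for v in vals for w in vals]
print(all(in_S(sub(a, b)) and in_S(mul(a, b)) for a, b in pairs))
```

For instance, with $x = 1$, $y = 2$, $v = 3$, $w = 4$ the product has diagonal entry $2xv + yv + xw + yw = 24$ and off-diagonal entry $24 - yw = 16$, as the last matrix in the example predicts.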
We shall now single out an important type of subring.
Definition 14.6 A nonempty subset $I$ of a ring $R$ is called an ideal of $R$ if for all $a, b \in I$ and all $r \in R$, we have (a) $a - b \in I$ and (b) $ar, ra \in I$.
An ideal is a subring, but the converse does not necessarily hold: $(\mathbf{Z}, +, \cdot)$ is a subring of $(\mathbf{Q}, +, \cdot)$ but not an ideal because, for example, $(1/2)9 \notin \mathbf{Z}$ for $(1/2) \in \mathbf{Q}$, $9 \in \mathbf{Z}$. Meanwhile, all the subrings in Example 14.8(a) are ideals of $(\mathbf{Z}, +, \cdot)$.
Looking back to Example 14.10 we see that if $a \in S$, $x \in \mathbf{Z}$, then $a \odot x = a + x - ax$ $(= x \odot a)$, and if $x$ is even (because the case for $x$ odd has already been covered within Example 14.10), then $a + x$ is odd and $ax$ is even, making $a + x - ax$ odd. Consequently, for all $a \in S$ and all $x \in \mathbf{Z}$, $a \odot x$ and $x \odot a$ are in $S$, so $S$ is an ideal of the ring $(\mathbf{Z}, \oplus, \odot)$.
EXERCISES 14.2
1. Complete the proofs of Theorems 14.2, 14.4, 14.5, and 14.10.
2. If $a$, $b$, and $c$ are any elements in a ring $(R, +, \cdot)$, prove that (a) $a(b - c) = ab - (ac) = ab - ac$ and (b) $(b - c)a = ba - (ca) = ba - ca$.
3. a) If $R$ is a ring with unity and $a$, $b$ are units of $R$, prove that $ab$ is a unit of $R$ and that $(ab)^{-1} = b^{-1}a^{-1}$.
b) For the ring $M_2(\mathbf{Z})$, find $A^{-1}$, $B^{-1}$, $(AB)^{-1}$, $(BA)^{-1}$, and $B^{-1}A^{-1}$ if $A$ and $B$ are the matrices [entries not legible in the source].
4. Prove that a unit in a ring $R$ cannot be a proper divisor of zero.
5. If $a$ is a unit in ring $R$, prove that $-a$ is also a unit in $R$.
6. a) Verify that the subsets $S = \{s, w\}$ and $T = \{s, v, x\}$ are subrings of the ring $R$ in Example 14.6. (The binary operations for the elements of $S$, $T$ are those given in Table 14.3.)
b) Are the subrings in part (a) ideals of $R$?
7. Let $S$ and $T$ be subrings of a ring $R$. Prove that $S \cap T$ is a subring of $R$.
8. Let $R = M_2(\mathbf{Z})$ and let $S$ be the subset of $R$ where $S = \{$[matrix not legible in the source] $\mid x, y \in \mathbf{Z}\}$. Prove that $S$ is a subring of $R$.
9. Let $(R, +, \cdot)$ be a ring. If $S$, $T_1$, and $T_2$ are subrings of $R$, and $S \subseteq T_1 \cup T_2$, prove that $S \subseteq T_1$ or $S \subseteq T_2$.
10. a) Let $(R, +, \cdot)$ be a finite commutative ring with unity $u$. If $r \in R$ and $r$ is not the zero element of $R$, prove that $r$ is either a unit or a proper divisor of zero.
b) Does the result in part (a) remain valid when $R$ is infinite?
11. a) For $R = M_2(\mathbf{Z})$, prove that $S = \left\{ \begin{bmatrix} a & 0 \\ 0 & 0 \end{bmatrix} \;\middle|\; a \in \mathbf{Z} \right\}$ is a subring of $R$.
b) What is the unity of $R$?
c) Does $S$ have a unity?
d) Does $S$ have any properties that $R$ does not have?
e) Is $S$ an ideal of $R$?
12. Let $S$ and $T$ be the following subsets of the ring $R = M_2(\mathbf{Z})$: $S = \{$[matrix not legible in the source] $\mid a, b, c \in \mathbf{Z}\}$, $T = \{$[matrix not legible in the source] $\mid a, b, c, d \in \mathbf{Z}\}$.
a) Verify that $S$ is a subring of $R$. Is it an ideal?
b) Verify that $T$ is a subring of $R$. Is it an ideal?
13. Let $(R, +, \cdot)$ be a commutative ring, and let $z$ denote the zero element of $R$. For a fixed element $a \in R$, define $N(a) = \{r \in R \mid ra = z\}$. Prove that $N(a)$ is an ideal of $R$.
14. Let $R$ be a commutative ring with unity $u$, and let $I$ be an ideal of $R$. (a) If $u \in I$, prove that $I = R$. (b) If $I$ contains a unit of $R$, prove that $I = R$.
15. If $R$ is a field, how many ideals does $R$ have?
16. Let $(R, +, \cdot)$ be the (finite) commutative ring with unity given by Tables 14.6(a) and (b).
Table 14.6
(a)
+ | z u a b
z | z u a b
u | u z b a
a | a b z u
b | b a u z
(b)
· | z u a b
z | z z z z
u | z u a b
a | z a b u
b | z b u a
a) Verify that $R$ is a field.
b) Find a subring of $R$ that is not an ideal.
c) Let $x$ and $y$ be unknowns. Solve the following system of linear equations in $R$: $bx + y = u$; $x + by = z$.
17. Let $R$ be a commutative ring with unity $u$.
a) For any (fixed) $a \in R$, prove that $aR = \{ar \mid r \in R\}$ is an ideal of $R$.
b) If the only ideals of $R$ are $\{z\}$ and $R$, prove that $R$ is a field.
18. Let $(S, +, \cdot)$ and $(T, +', \cdot')$ be two rings. For $R = S \times T$, define addition "$\oplus$" and multiplication "$\odot$" by
$(s_1, t_1) \oplus (s_2, t_2) = (s_1 + s_2,\ t_1 +' t_2)$,
$(s_1, t_1) \odot (s_2, t_2) = (s_1 \cdot s_2,\ t_1 \cdot' t_2)$.
a) Prove that under these closed binary operations, $R$ is a ring.
b) If both $S$ and $T$ are commutative, prove that $R$ is commutative.
c) If $S$ has unity $u_S$ and $T$ has unity $u_T$, what is the unity of $R$?
d) If $S$ and $T$ are fields, is $R$ also a field?
19. Let $(R, +, \cdot)$ be a ring with unity $u$, and $|R| = 8$. On $R^4 = R \times R \times R \times R$, define $+$ and $\cdot$ as suggested by Exercise 18. In the ring $R^4$, (a) how many elements have exactly two nonzero components? (b) how many elements have all nonzero components? (c) is there a unity? (d) how many units are there if $R$ has four units?
20. Let $(R, +, \cdot)$ be a ring, with $a \in R$. Define $0a = z$, $1a = a$, and $(n + 1)a = na + a$, for all $n \in \mathbf{Z}^+$. (Here we are multiplying elements of $R$ by elements of $\mathbf{Z}$, so we have yet another operation that is different from the multiplications in either of $\mathbf{Z}$ or $R$.) For $n > 0$, we define $(-n)a = n(-a)$, so, for example, $(-3)a = 3(-a) = 2(-a) + (-a) = [(-a) + (-a)] + (-a) = [-(a + a)] + (-a) = -[(a + a) + a] = -[2a + a] = -(3a)$.
For all $a, b \in R$, and all $m, n \in \mathbf{Z}$, prove that
a) $ma + na = (m + n)a$
b) $m(na) = (mn)a$
c) $n(a + b) = na + nb$
d) $n(ab) = (na)b = a(nb)$
e) $(ma)(nb) = (mn)(ab) = (na)(mb)$
21. a) For ring $(R, +, \cdot)$ and each $a \in R$, we define $a^1 = a$, and $a^{n+1} = a^n a$, for all $n \in \mathbf{Z}^+$. Prove that for all $m, n \in \mathbf{Z}^+$, $(a^m)(a^n) = a^{m+n}$ and $(a^m)^n = a^{mn}$.
b) Can you suggest how we might define $a^0$ or $a^{-n}$, $n \in \mathbf{Z}^+$, including any necessary conditions that $R$ must satisfy for these definitions to make sense?
14.3
The Integers Modulo n
Enough abstraction for a while! We shall now concentrate on the construction and use of
special finite rings and fields.
Definition 14.7 Let $n \in \mathbf{Z}^+$, $n > 1$. For $a, b \in \mathbf{Z}$, we say that $a$ is congruent to $b$ modulo $n$, and we write $a \equiv b \pmod{n}$, if $n \mid (a - b)$, or, equivalently, $a = b + kn$ for some $k \in \mathbf{Z}$.
EXAMPLE 14.12
i) We find that $17 \equiv 2 \pmod{5}$, since $17 - 2 = 15 = 3(5)$, or $17 = 2 + 3(5)$.
ii) As $-7 + 49 = -7 - (-49) = 42 = 7(6)$ [or, $-7 = -49 + 7(6)$], we have $-7 \equiv -49 \pmod{6}$.
iii) Since $11 - (-5) = 16 = 2(8)$ [or, $11 = -5 + 2(8)$], it follows that $11 \equiv -5 \pmod{8}$.
Before we examine our first theorem, let us make three observations about this new concept of congruence modulo $n$. Here, as above, we have $a, b, n \in \mathbf{Z}$, with $n > 1$.
i) Using the division algorithm, we can write $a = q_1 n + r_1$ and $b = q_2 n + r_2$, with $0 \leq r_1 < n$, $0 \leq r_2 < n$. So $a - b = (q_1 - q_2)n + (r_1 - r_2)$. Then, if $a \equiv b \pmod{n}$, it follows that $n \mid (a - b)$, and, consequently, $n \mid (r_1 - r_2)$. But with $0 \leq |r_1 - r_2| < n$, we now find that $r_1 = r_2$.
Hence, if $a \equiv b \pmod{n}$, then $a$, $b$ have the same remainder upon division by $n$.
ii) The converse of the result in (i) is also true. That is, if $a = q_1 n + r_1$ and $b = q_2 n + r_2$, with $r_1 = r_2$, then $a - b = (q_1 - q_2)n$ and $a \equiv b \pmod{n}$.
iii) Although $a = b \Rightarrow a \equiv b \pmod{n}$, we cannot expect $a \equiv b \pmod{n} \Rightarrow a = b$. However, if $a \equiv b \pmod{n}$ and $a, b \in \{0, 1, 2, \ldots, n - 1\}$, then $a = b$.
THEOREM 14.11 Congruence modulo n is an equivalence relation on Z.
Proof: The proof is left for the reader.
Since an equivalence relation on a set induces a partition of that set, for $n \geq 2$, congruence modulo $n$ partitions $\mathbf{Z}$ into the $n$ equivalence classes
$[0] = \{\ldots, -2n, -n, 0, n, 2n, \ldots\} = \{0 + nx \mid x \in \mathbf{Z}\}$
$[1] = \{\ldots, -2n + 1, -n + 1, 1, n + 1, 2n + 1, \ldots\} = \{1 + nx \mid x \in \mathbf{Z}\}$
$[2] = \{\ldots, -2n + 2, -n + 2, 2, n + 2, 2n + 2, \ldots\} = \{2 + nx \mid x \in \mathbf{Z}\}$
$\vdots$
$[n - 1] = \{\ldots, -n - 1, -1, n - 1, 2n - 1, 3n - 1, \ldots\} = \{(n - 1) + nx \mid x \in \mathbf{Z}\}$.
For all $t \in \mathbf{Z}$, by the division algorithm (of Section 4.3) we can write $t = qn + r$, where $0 \leq r < n$, so $t \in [r]$, or $[t] = [r]$. We use the notation $\mathbf{Z}_n$ to denote $\{[0], [1], [2], \ldots, [n - 1]\}$. (When there is no danger of ambiguity, we often replace $[a]$ by $a$ and write $\mathbf{Z}_n = \{0, 1, 2, \ldots, n - 1\}$.) Our objective now is to define closed binary operations of addition and multiplication on the set $\mathbf{Z}_n$ of equivalence classes so that we obtain a ring.
For $[a], [b] \in \mathbf{Z}_n$, define $+$ and $\cdot$ by
$[a] + [b] = [a + b]$ and $[a] \cdot [b] = [a][b] = [ab]$.
For example, if $n = 7$, then $[2] + [6] = [2 + 6] = [8] = [1]$, and $[2][6] = [12] = [5]$.
Before these definitions are so readily accepted, we must investigate whether or not these (closed binary) operations are well-defined in the sense that if $[a] = [c]$, $[b] = [d]$, then $[a] + [b] = [c] + [d]$ and $[a][b] = [c][d]$. Since $[a] = [c]$ can occur with $a \neq c$, do the results of our addition and multiplication depend on which representatives are chosen from the equivalence classes? We shall prove that the results of the two operations are independent of the choice of class representatives and that the operations are very definitely well-defined.
First, we observe that $[a] = [c] \Rightarrow a = c + sn$, for some $s \in \mathbf{Z}$, and $[b] = [d] \Rightarrow b = d + tn$, for some $t \in \mathbf{Z}$. Hence
$a + b = (c + sn) + (d + tn) = c + d + (s + t)n$,
so $(a + b) \equiv (c + d) \pmod{n}$ and $[a + b] = [c + d]$. Also,
$ab = (c + sn)(d + tn) = cd + (sd + ct + stn)n$
and $ab \equiv cd \pmod{n}$, or $[ab] = [cd]$.
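The independence from class representatives can also be observed numerically. The sketch below reuses $n = 7$ with the classes $[2]$ and $[6]$ and tries several other representatives $c = a + sn$, $d = b + tn$; the helper name `same_class` is hypothetical, and the check is an illustration of the algebra above rather than a replacement for it.

```python
# Sketch: [a + b] and [ab] do not depend on the chosen representatives.
n = 7

def same_class(x, y, n):
    """True if x and y are congruent modulo n, i.e. n divides x - y."""
    return (x - y) % n == 0

a, b = 2, 6
checks = []
for s in range(-3, 4):
    for t in range(-3, 4):
        c, d = a + s * n, b + t * n      # other members of [a] and [b]
        checks.append(same_class(a + b, c + d, n)
                      and same_class(a * b, c * d, n))
print(all(checks))
```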
This result now leads us to the following.
THEOREM 14.12 For $n \in \mathbf{Z}^+$, $n > 1$, under the closed binary operations defined above, $\mathbf{Z}_n$ is a commutative ring with unity $[1]$ (and additive identity $[0]$).
Proof: The proof is left to the reader. Verification of the ring properties follows from the definitions of addition and multiplication in $\mathbf{Z}_n$ and from the corresponding properties for the ring $(\mathbf{Z}, +, \cdot)$.
Before stating any further results, let us examine two particular examples, $\mathbf{Z}_5$ and $\mathbf{Z}_6$. In Tables 14.7(a) and (b) and 14.8(a) and (b), we simplify $[a]$ by writing $a$.
Table 14.7 ($\mathbf{Z}_5$)
(a)
+ | 0 1 2 3 4
0 | 0 1 2 3 4
1 | 1 2 3 4 0
2 | 2 3 4 0 1
3 | 3 4 0 1 2
4 | 4 0 1 2 3
(b)
· | 0 1 2 3 4
0 | 0 0 0 0 0
1 | 0 1 2 3 4
2 | 0 2 4 1 3
3 | 0 3 1 4 2
4 | 0 4 3 2 1
In $\mathbf{Z}_5$ every nonzero element has a multiplicative inverse, so $\mathbf{Z}_5$ is a field. For $\mathbf{Z}_6$, however, 1 and 5 are the only units and 2, 3, 4 are proper divisors of zero. Meanwhile, in $\mathbf{Z}_9$, $3 \cdot 3 = 3 \cdot 6 = 0$, so 3 and 6 are proper divisors of zero. Consequently, for $\mathbf{Z}_n$, $n \geq 2$, to be a field, we need more than just an odd modulus.
Table 14.8 ($\mathbf{Z}_6$)
(a)
+ | 0 1 2 3 4 5
0 | 0 1 2 3 4 5
1 | 1 2 3 4 5 0
2 | 2 3 4 5 0 1
3 | 3 4 5 0 1 2
4 | 4 5 0 1 2 3
5 | 5 0 1 2 3 4
(b)
· | 0 1 2 3 4 5
0 | 0 0 0 0 0 0
1 | 0 1 2 3 4 5
2 | 0 2 4 0 2 4
3 | 0 3 0 3 0 3
4 | 0 4 2 0 4 2
5 | 0 5 4 3 2 1
THEOREM 14.13 $\mathbf{Z}_n$ is a field if and only if $n$ is a prime.
Proof: Let $n$ be a prime, and suppose that $0 < a < n$. Then $\gcd(a, n) = 1$, so as we learned in Section 4.4 there are integers $s$, $t$ with $as + tn = 1$. Thus $as \equiv 1 \pmod{n}$, or $[a][s] = [1]$, and $[a]$ is a unit of $\mathbf{Z}_n$, which is consequently a field.
Conversely, if $n$ is not a prime, then $n = n_1 n_2$, where $1 < n_1, n_2 < n$. So $[n_1] \neq [0]$ and $[n_2] \neq [0]$ but $[n_1][n_2] = [n_1 n_2] = [0]$, and $\mathbf{Z}_n$ is not even an integral domain, so it cannot be a field.
In $\mathbf{Z}_6$, $[5]$ is a unit and $[3]$ is a zero divisor. We seek a way to recognize when $[a]$ is a unit in $\mathbf{Z}_n$, for $n$ composite.
THEOREM 14.14 In $\mathbf{Z}_n$, $[a]$ is a unit if and only if $\gcd(a, n) = 1$.
Proof: If $\gcd(a, n) = 1$, the result follows as in the proof of Theorem 14.13. For the converse, let $[a] \in \mathbf{Z}_n$ and $[a]^{-1} = [s]$. Then $[as] = [a][s] = [1]$, so $as \equiv 1 \pmod{n}$ and $as = 1 + tn$, for some $t \in \mathbf{Z}$. But $1 = as + n(-t) \Rightarrow \gcd(a, n) = 1$.
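Theorem 14.14 lends itself to a brute-force comparison for small moduli. In the following Python sketch the function names `units` and `coprime_residues` are hypothetical; the first finds units by searching for an inverse directly, the second applies the gcd criterion, and the two lists are compared.

```python
from math import gcd

def units(n):
    """Elements a of {1, ..., n-1} for which some s gives a*s = 1 (mod n)."""
    return [a for a in range(1, n)
            if any((a * s) % n == 1 for s in range(1, n))]

def coprime_residues(n):
    """Elements a of {1, ..., n-1} with gcd(a, n) = 1."""
    return [a for a in range(1, n) if gcd(a, n) == 1]

for n in (5, 6, 9, 12):
    print(n, units(n), units(n) == coprime_residues(n))
```

The exhaustive search is quadratic in $n$ and is meant only to illustrate the statement on small cases.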
EXAMPLE 14.13 Find $[25]^{-1}$ in $\mathbf{Z}_{72}$.
Since $\gcd(25, 72) = 1$, the Euclidean algorithm leads us to
$72 = 2(25) + 22$, $0 < 22 < 25$
$25 = 1(22) + 3$, $0 < 3 < 22$
$22 = 7(3) + 1$, $0 < 1 < 3$.
As 1 is the last nonzero remainder, we have
$1 = 22 - 7(3) = 22 - 7[25 - 22] = (-7)(25) + (8)(22)$
$= (-7)(25) + 8[72 - 2(25)] = 8(72) - 23(25)$.
But
$1 = 8(72) - 23(25) \Rightarrow 1 \equiv (-23)(25) \equiv (-23 + 72)(25) \pmod{72}$,
so $[1] = [49][25]$ and $[25]^{-1} = [49]$ in $\mathbf{Z}_{72}$.
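The back-substitution in Example 14.13 is an instance of the extended Euclidean algorithm. The Python sketch below is one standard iterative formulation, not necessarily the procedure of Section 4.4; the names `extended_gcd` and `mod_inverse` are hypothetical.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and a*s + b*t = g."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t

def mod_inverse(a, n):
    """Return the inverse of a modulo n; raise if gcd(a, n) != 1."""
    g, s, _ = extended_gcd(a, n)
    if g != 1:
        raise ValueError(f"{a} has no inverse modulo {n}")
    return s % n

print(extended_gcd(25, 72))   # (1, -23, 8): 25(-23) + 72(8) = 1
print(mod_inverse(25, 72))    # 49, matching Example 14.13
```

In Python 3.8 and later the built-in call `pow(25, -1, 72)` returns the same inverse.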
In addition, from this result we are now able to solve the following linear congruences
for x:
1) If $25x \equiv 1 \pmod{72}$, then $x \equiv 49 \pmod{72}$.
2) If $25x \equiv 3 \pmod{72}$, then $x \equiv 49 \cdot 3 \pmod{72} \equiv 3 \pmod{72}$.
Now $[25]$ is a unit in $\mathbf{Z}_{72}$, but is there any way of knowing how many units this ring has? From Theorem 14.14, if $1 \leq a < 72$, then $[a]^{-1}$ exists if and only if $\gcd(a, 72) = 1$. Consequently, the number of units in $\mathbf{Z}_{72}$ is the number of integers $a$ such that $1 \leq a < 72$ and $\gcd(a, 72) = 1$. Using Euler's phi function (Example 8.8), we find that this is
$\phi(72) = \phi(2^3 3^2) = (72)[1 - (1/2)][1 - (1/3)] = (72)(1/2)(2/3) = 24$.
In general, for any $n \in \mathbf{Z}^+$, $n > 1$, there are $\phi(n)$ units and $n - 1 - \phi(n)$ proper divisors of zero in $\mathbf{Z}_n$.
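The counts of units and proper divisors of zero can be confirmed by direct enumeration. The sketch below computes Euler's phi function by counting (rather than by the product formula) and counts zero divisors by exhaustive search; both function names are hypothetical and the approach is intended only for small $n$.

```python
from math import gcd

def phi(n):
    """Euler's phi function: count 1 <= a <= n with gcd(a, n) = 1."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def count_zero_divisors(n):
    """Count nonzero a in Z_n with a*b = 0 (mod n) for some nonzero b."""
    return sum(1 for a in range(1, n)
               if any((a * b) % n == 0 for b in range(1, n)))

n = 72
print(phi(n), count_zero_divisors(n), n - 1 - phi(n))
```

For $n = 72$ this gives 24 units and $71 - 24 = 47$ proper divisors of zero.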
Before we continue with some examples where congruence plays a role, we want to look
back at the binary operation mod that was introduced earlier in Examples 4.36 and 10.8. In
those examples we considered $x, y \in \mathbf{Z}^+$ and defined $x \bmod y$ as the remainder obtained when we divide $x$ by $y$. At this point we shall extend this concept to include the case where $x \leq 0$. Hence, for $x \in \mathbf{Z}$ and $y \in \mathbf{Z}^+$, $x \bmod y$ is the remainder that results upon division of $x$ by $y$.
But, now, how is mod related to the mod of Definition 14.7? Here we find that if $a, b, n \in \mathbf{Z}$, with $n > 1$, then $a \equiv b \pmod{n}$ if and only if $a \bmod n = b \bmod n$. (This follows from the observations we made prior to Theorem 14.11.)
And now the time has arrived for some additional examples.
EXAMPLE 14.14 Randomly generated numbers arise in many applications. In particular, they are often used for the computer simulation of experiments that are too expensive, too dangerous, or just plain impossible to conduct in the real world.
The idea of using a computer to generate random numbers was first developed by John
von Neumann (1903-1957) in 1946. However, although these numbers may appear to be
random, they are not — hence the title pseudorandom numbers.
Proposed in 1949 by Derrick H. Lehmer (1905-1991), the most commonly used technique for generating such pseudorandom numbers employs the notion of congruence. For the linear congruential generator, one starts with the four integers: the multiplier $a$, the increment $c$, the modulus $m$, and the seed $x_0$, where
$2 \leq a < m$, $0 \leq c < m$, and $0 \leq x_0 < m$.
These nonnegative integers are used to generate a sequence of pseudorandom numbers, $x_1, x_2, x_3, \ldots$, recursively, by
$x_{n+1} = (a x_n + c) \bmod m$.
So $0 \leq x_{n+1} < m$, for $n \geq 0$. For example, if $a = 3$, $c = 2$, $m = 11$, and $x_0 = 1$, then
$x_1 = (a x_0 + c) \bmod m = [3(1) + 2] \bmod 11 = 5$, so $x_1 = 5$.
Similarly, $x_2 = (a x_1 + c) \bmod m = [3(5) + 2] \bmod 11 = 17 \bmod 11 = 6$, so $x_2 = 6$.
Continuing in this manner, one finds that $x_3 = 9$, $x_4 = 7$, and $x_5 = 1$, the seed. Consequently, this linear congruential generator produces five distinct integers before repeating. The sequence of pseudorandom numbers thus obtained is $1, 5, 6, 9, 7, 1, 5, 6, \ldots$.
With $a = 3$, $c = 5$, $m = 12$, and $x_0 = 6$, we first learn that $x_1 = [3(6) + 5] \bmod 12 = 11$, so $x_1 = 11$. Next, $x_2 = [3(11) + 5] \bmod 12 = 38 \bmod 12 = 2$, so $x_2 = 2$. Further computation yields $x_3 = 11$. This time the linear congruential generator yields only three distinct integers before repeating. The sequence of pseudorandom numbers generated here is $6, 11, 2, 11, 2, 11, 2, \ldots$, where the seed is not repeated.
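In practice large values for $a$ and $m$ are used, especially for critical simulations. For $a = 16{,}807$ $(= 7^5)$, $c = 0$, $m = 2{,}147{,}483{,}647$ $(= 2^{31} - 1$, a prime$)$, and $x_0 = 1$, one obtains a sequence of 2,147,483,646 $(= m - 1)$ distinct pseudorandom numbers before a repeated integer appears. (With $c = 0$ and $m$ prime the value 0 never occurs when the seed is nonzero, so at most $m - 1$ distinct values are possible.)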
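The recursion $x_{n+1} = (a x_n + c) \bmod m$ translates directly into code. The following Python sketch reproduces the two small sequences of this example; the names `lcg` and `first_terms` are hypothetical, and the generator is a teaching illustration rather than something suitable for serious simulation.

```python
def lcg(a, c, m, seed):
    """Yield x0, x1, x2, ... where x_{n+1} = (a*x_n + c) mod m."""
    x = seed
    while True:
        yield x
        x = (a * x + c) % m

def first_terms(a, c, m, seed, count):
    """Return the first `count` terms of the sequence, seed included."""
    gen = lcg(a, c, m, seed)
    return [next(gen) for _ in range(count)]

print(first_terms(3, 2, 11, 1, 8))   # [1, 5, 6, 9, 7, 1, 5, 6]
print(first_terms(3, 5, 12, 6, 7))   # [6, 11, 2, 11, 2, 11, 2]
```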
EXAMPLE 14.15
a) Whether it's youngsters using decoder rings or military leaders sending battle plans to troops, throughout history, various people have wanted to keep certain information unintelligible, should it fall into the wrong hands.
As early as the first century B.C., the Roman general Gaius Julius Caesar (100 B.C.-44 B.C.) used a cipher shift to make the contents of certain messages understandable only for those he intended the messages to reach. To describe this early form of cryptosystem, often termed the Caesar cipher, we shall make certain conventions to simplify the presentation. First, we shall write the original message, the plaintext, using only lowercase letters, with no punctuation or spaces. Then to encrypt the plaintext, each lowercase letter, from a to w, is shifted to the letter three places forward in the alphabet, and the last three letters (namely, x, y, and z) are shifted to the first three letters, respectively. We use the uppercase letters for the resulting ciphertext. Consequently, a is encrypted as D, b as E, c as F, ..., j as M, ..., m as P, ..., y as B, and z as C.
If Caesar wanted to inform a senator in Rome of a recent victory, he might have
sent the message “I came, I saw, I conquered.” Encryption of this message takes place
as follows:
Plaintext   i c a m e i s a w i c o n q u e r e d
Ciphertext  L F D P H L V D Z L F R Q T X H U H G
Upon receiving the ciphertext, as long as this senator knows the size and direction
of the shift, he can reverse the process. Decryption then results by replacing each
uppercase letter, from D to Z, in the ciphertext by the lowercase letter three places
back in the alphabet, and A by x, B by y, and C by z. After decrypting, one then
inserts the appropriate spaces and punctuation in the plaintext. (Note that by removing
spaces in the plaintext, the resulting absence of spaces in the ciphertext helps make
the message more unintelligible. If one does not know the size and direction of the
shift for decryption, the presence of spaces may suggest certain information about the
structure of the original message.)
b) The idea of Caesar’s cipher can be generalized and modeled mathematically by using
the concept of congruence. Start by assigning each of the 26 letters of the plaintext a
nonnegative integer as shown:
a  b  c  d  ...  k   l   m   n  ...  w   x   y   z
0  1  2  3  ...  10  11  12  13 ...  22  23  24  25
The 26 letters for the ciphertext are assigned the same integers — that is, A is assigned
0, B is assigned 1, ..., Y is assigned 24, and Z is assigned 25.
Now select a nonnegative integer $\kappa$, where $0 \leq \kappa \leq 25$. For instance, Caesar chose $\kappa = 3$. This integer $\kappa$ is called the key and helps us define the encrypting function $E: \mathbf{Z}_{26} \to \mathbf{Z}_{26}$ as follows. Given a letter of the plaintext, let $\theta$ be the nonnegative integer to which it corresponds. Then $E(\theta) = (\theta + \kappa) \bmod 26$ and this result determines the corresponding ciphertext letter for the plaintext letter assigned the nonnegative integer $\theta$. To decrypt we apply the inverse function $D: \mathbf{Z}_{26} \to \mathbf{Z}_{26}$ where we write $D(\theta) = (\theta - \kappa) \bmod 26$. Replacing each nonnegative integer with its corresponding plaintext letter, one captures the plaintext version of the original message.
If we do not know the key, a trial-and-error approach can be used. There are 26 possibilities, one for each of the 26 possible values of $\kappa$. A more efficient method of attack takes into account the most frequently occurring letters in the alphabet and the most frequently occurring letters in the ciphertext. In the English language, the letter e occurs most often, with t, a, o, and i the next four most frequently occurring letters.
Now if a parent receives the ciphertext ZLUKTVYLTVULF from a college student, and does not know the key, what can the parent do? Since the most frequently occurring letter in the ciphertext is L, the parent can correspond e with L under the encryption. This suggests that $E: \mathbf{Z}_{26} \to \mathbf{Z}_{26}$ be defined by $E(\theta) = (\theta + 7) \bmod 26$, since L is seven places after e in the alphabet. So here the key, $\kappa$, is 7 and the decryption function is $D: \mathbf{Z}_{26} \to \mathbf{Z}_{26}$ with $D(\theta) = (\theta - 7) \bmod 26$. Decoding the ciphertext message received by the parent can be analyzed as follows:
(1)   Z   L   U   K   T   V   Y   L   T   V   U   L   F
(2)  25  11  20  10  19  21  24  11  19  21  20  11   5
(3)  18   4  13   3  12  14  17   4  12  14  13   4  24
(4)   s   e   n   d   m   o   r   e   m   o   n   e   y
Here (1) provides the given (encrypted) ciphertext. In (2) each ciphertext letter is re-
placed by the nonnegative integer assigned to it. Upon applying the decryption function
D, the results in (2) provide the assignments in (3). Replacing each nonnegative integer
in (3) by its corresponding plaintext letter yields the original message
“Send more money.”
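Parts (a) and (b) of this example can be sketched in a few lines of Python. The function names below are hypothetical, letters are mapped to 0 through 25 as in the table of part (b), and `guess_key` implements only the single heuristic used by the parent (the most frequent ciphertext letter is assumed to stand for e), which can easily fail on short or atypical messages.

```python
import string
from collections import Counter

def shift_encrypt(plaintext, key):
    """Encrypt lowercase letters to uppercase with E(t) = (t + key) mod 26."""
    return "".join(chr((ord(ch) - ord("a") + key) % 26 + ord("A"))
                   for ch in plaintext if ch in string.ascii_lowercase)

def shift_decrypt(ciphertext, key):
    """Decrypt uppercase letters to lowercase with D(t) = (t - key) mod 26."""
    return "".join(chr((ord(ch) - ord("A") - key) % 26 + ord("a"))
                   for ch in ciphertext if ch in string.ascii_uppercase)

def guess_key(ciphertext):
    """Guess the key by assuming the most frequent letter encrypts 'e' (4)."""
    most_common = Counter(ciphertext).most_common(1)[0][0]
    return (ord(most_common) - ord("A") - 4) % 26

print(shift_encrypt("icameisawiconquered", 3))   # Caesar's key, kappa = 3
cipher = "ZLUKTVYLTVULF"
key = guess_key(cipher)
print(key, shift_decrypt(cipher, key))
```

When the heuristic fails, trying all 26 keys with `shift_decrypt` remains a workable fallback.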
c) The security of the shift cipher in part (b) can be slightly enhanced by means of the affine cipher. The letters of the plaintext and ciphertext are assigned nonnegative integers, as in part (b). Here, however, the encryption function $E$ is given by $E(\theta) = (\alpha\theta + \kappa) \bmod 26$, where $0 \leq \alpha, \kappa \leq 25$, and $\gcd(\alpha, 26) = 1$.
If $\theta_1, \theta_2 \in \mathbf{Z}_{26}$, then $E(\theta_1) = E(\theta_2) \Rightarrow (\alpha\theta_1 + \kappa) \bmod 26 = (\alpha\theta_2 + \kappa) \bmod 26 \Rightarrow \alpha\theta_1 \bmod 26 = \alpha\theta_2 \bmod 26 \Rightarrow \theta_1 = \theta_2$, by Theorem 14.14. So $E$ is one-to-one. Further, $E$ is also onto and invertible, by Theorem 5.11, because $\mathbf{Z}_{26}$ is finite.
Let us consider a specific example. Suppose $\alpha = 11$ and $\kappa = 7$. Then the encryption of the plaintext letter g proceeds as follows:
i) g is assigned the nonnegative integer 6;
ii) applying $E$, we have $E(6) = (11 \cdot 6 + 7) \bmod 26 = 73 \bmod 26 = 21$; and
iii) the nonnegative integer 21 determines the ciphertext letter V.
[So using this affine cipher, where $E(\theta) = (11\theta + 7) \bmod 26$, the plaintext letter g is encrypted as the ciphertext letter V.]
Now suppose we have the following ciphertext for a message encrypted by an affine
cipher:
QYYFGCULBLKYZVOSTCOY PURGCULYZYWKYOSTCOYL
With no knowledge of $\alpha$ or $\kappa$, one might have to examine as many as $[\phi(26)](26) = [26(1 - \frac{1}{2})(1 - \frac{1}{13})](26) = [26(\frac{1}{2})(\frac{12}{13})](26) = (12)(26) = 312$ cases for the key $\alpha$, $\kappa$.
However, let's say that by some means (perhaps by considering the frequencies of occurrence for the letters in the plaintext and ciphertext) we deduce two correspondences. Specifically, we know that e and Y correspond, as do t and R. In addition, the nonnegative integers 4 and 19 are the replacements for the plaintext letters e and t, respectively, while 24 and 17 are the respective replacements for Y and R, in the ciphertext, so the encryption function $E$ is determined as follows:
1) The correspondence of e(4) and Y(24) tells us that $E(4) = (4\alpha + \kappa) \bmod 26 = 24$.
2) The correspondence of t(19) and R(17) tells us that $E(19) = (19\alpha + \kappa) \bmod 26 = 17$.
Consequently, $E(19) - E(4) = [(19\alpha + \kappa) - (4\alpha + \kappa)] \bmod 26 = 15\alpha \bmod 26 = (17 - 24) \bmod 26 = -7 \bmod 26 = 19$. Since $15 \cdot 7 = 105 = 1 + 104 = 1 + 4(26)$, we have $15 \cdot 7 \equiv 1 \pmod{26}$, so $15^{-1} = 7$ (in $\mathbf{Z}_{26}$). Then $15\alpha \equiv 19 \pmod{26} \Rightarrow \alpha = 15^{-1} \cdot 19 \bmod 26 = 7 \cdot 19 \bmod 26 = 133 \bmod 26 = 3$, as $133 = 3 + 5(26)$.
With $\alpha = 3$ it now follows from (1) that $\kappa = (24 - 4\alpha) \bmod 26 = (24 - 12) \bmod 26 = 12$. [Or, from (2), $\kappa = (17 - 19\alpha) \bmod 26 = (17 - 57) \bmod 26 = -40 \bmod 26 = 12$.]
Consequently, $E: \mathbf{Z}_{26} \to \mathbf{Z}_{26}$ is defined by $E(\theta) = (3\theta + 12) \bmod 26$ and the decryption function $D: \mathbf{Z}_{26} \to \mathbf{Z}_{26}$ is given by $D(\theta) = (9\theta + 22) \bmod 26$, since $E^{-1}(\theta) = 3^{-1}(\theta - 12) \bmod 26 = 9(\theta - 12) \bmod 26 = (9\theta - 108) \bmod 26 = (9\theta + 22) \bmod 26$.
This function D is used in the following to obtain the results in row 3 from the nonnega-
tive integers (that replace the ciphertext letters) in row 2.
(1) Ciphertext    Q  Y  Y  F  G  C  U  L  B  L  K  Y  Z  V  O  S  T  C  O  Y
(2)              16 24 24  5  6  2 20 11  1 11 10 24 25 21 14 18 19  2 14 24
(3)              10  4  4 15 24 14 20 17  5 17  8  4 13  3 18  2 11 14 18  4
(4) Plaintext     k  e  e  p  y  o  u  r  f  r  i  e  n  d  s  c  l  o  s  e

(1) Ciphertext    P  U  R  G  C  U  L  Y  Z  Y  W  K  Y  O  S  T  C  O  Y  L
(2)              15 20 17  6  2 20 11 24 25 24 22 10 24 14 18 19  2 14 24 11
(3)               1 20 19 24 14 20 17  4 13  4 12  8  4 18  2 11 14 18  4 17
(4) Plaintext     b  u  t  y  o  u  r  e  n  e  m  i  e  s  c  l  o  s  e  r
Here, for example, the ciphertext letter Q is replaced by the nonnegative integer 16.
Applying the decryption function D to 16 we have $D(16) = (9 \cdot 16 + 22) \bmod 26 = 166 \bmod 26 = 10$, and 10 is the nonnegative integer that corresponds to the plaintext
letter k.
The decrypted message now reveals the sage advice given by Don Vito Corleone (of
Mario Puzo’s The Godfather) to his youngest son, Michael — namely, “Keep your friends
close but your enemies closer.”
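The key recovery and decryption in this example can be sketched in Python. This is only an illustrative sketch: the function names `recover_affine_key` and `affine_decrypt` are hypothetical, letters are assumed to be numbered a = 0 through z = 25 as in the text, and the built-in `pow(x, -1, m)` (Python 3.8 or later) is used to find inverses in $\mathbf{Z}_{26}$.

```python
def recover_affine_key(p1, c1, p2, c2, m=26):
    """Solve c = (alpha*p + kappa) mod m from two known plaintext/ciphertext pairs.

    Assumes gcd(p2 - p1, m) = 1 so that (p2 - p1) is a unit in Z_m.
    """
    alpha = ((c2 - c1) * pow(p2 - p1, -1, m)) % m
    kappa = (c1 - alpha * p1) % m
    return alpha, kappa


def affine_decrypt(ciphertext, alpha, kappa, m=26):
    """Apply D(theta) = alpha^{-1} * (theta - kappa) mod m to uppercase ciphertext."""
    alpha_inv = pow(alpha, -1, m)
    plain = []
    for ch in ciphertext:
        theta = ord(ch) - ord("A")            # ciphertext letter -> integer
        plain.append(chr((alpha_inv * (theta - kappa)) % m + ord("a")))
    return "".join(plain)


# e (4) <-> Y (24) and t (19) <-> R (17), as deduced in the example
alpha, kappa = recover_affine_key(4, 24, 19, 17)
message = affine_decrypt("QYYFGCULBLKYZVOSTCOYPURGCULYZYWKYOSTCOYL", alpha, kappa)
print(alpha, kappa)   # 3 12
print(message)        # keepyourfriendsclosebutyourenemiescloser
```

The two known correspondences give two congruences in the two unknowns, so subtracting them eliminates $\kappa$ exactly as in the hand computation.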
The security of each of the cryptosystems in Example 14.15 depends on the key [$\kappa = 3$ in part (a), $\kappa$ in part (b), and $\alpha$, $\kappa$ in part (c)]. For such private key cryptosystems, the
two people wishing to use the system need to securely exchange the key. Should any
unauthorized person discover the key, then that person could readily encrypt or decrypt
messages.
Our next example deals with modular exponentiation.
EXAMPLE 14.16 In the study of cryptology* one often needs to perform modular exponentiation to compute a result such as $b^e \bmod n$, where b, e, and n are large integers. To demonstrate this on a somewhat smaller scale, let us determine $5^{143} \bmod 222$. We realize that it is rather inefficient to actually compute $5^{143}$ (a very large integer) and then find the remainder upon dividing the result by 222. A more efficient approach starts with the binary representation for the exponent, here 143. With
$143 = 1(128) + 0(64) + 0(32) + 0(16) + 1(8) + 1(4) + 1(2) + 1(1)$
$= 1(2^7) + 0(2^6) + 0(2^5) + 0(2^4) + 1(2^3) + 1(2^2) + 1(2^1) + 1(2^0)$
$= (10001111)_2$,
we compute $5^{143} \bmod 222$ by using the binary representation (of 143) in reverse order, that is, going from the right to the left. The pseudocode procedure in Fig. 14.1 provides the necessary steps for this computation. Here the input is an integer b, the positive integer n (the modulus), and the binary representation $(a_m a_{m-1} \cdots a_2 a_1 a_0)_2$ for the exponent e, another positive integer. The output x equals $b^e \bmod n$.
procedure ModularExponentiation(b: integer;
        n, e = (a_m a_{m-1} ... a_2 a_1 a_0)_2: positive integers)
begin
  x := 1
  power := b mod n
  for i := 0 to m do
    begin
      if a_i = 1 then x := (x * power) mod n
      power := (power * power) mod n
    end
end
Figure 14.1
For our example, $b = 5$, $e = 143 = (10001111)_2 = (a_7 a_6 a_5 \cdots a_2 a_1 a_0)_2$ [so $m = 7$], and $n = 222$. The results in Table 14.9 show us the steps that are followed in the execution of the for loop. This is after the initial assignments are made: x is 1 and power is $b \bmod n$, that is, $5 \bmod 222 = 5$.
Following the execution of this procedure, the last entry in the column for x tells us that $5^{143} \bmod 222$ is 89.
*For more on cryptology (and related topics), the reader should find the references by T. H. Barr [3], P. Garrett
[6], and W. Trappe and L. C. Washington [13] of interest.
Table 14.9
i | a_i | x | power
0 | 1 | $1 \cdot 5 = 5$ | $5^2\ (= 25) \bmod 222 = 25$
1 | 1 | $5 \cdot 25 \bmod 222 = 125$ | $25^2\ (= 625) \bmod 222 = 181$
2 | 1 | $125 \cdot 181 \bmod 222 = 203$ | $181^2\ (= 32761) \bmod 222 = 127$
3 | 1 | $203 \cdot 127 \bmod 222 = 29$ | $127^2\ (= 16129) \bmod 222 = 145$
4 | 0 | 29 | $145^2\ (= 21025) \bmod 222 = 157$
5 | 0 | 29 | $157^2\ (= 24649) \bmod 222 = 7$
6 | 0 | 29 | $7^2\ (= 49) \bmod 222 = 49$
7 | 1 | $29 \cdot 49 \bmod 222 = 89$ | $49^2\ (= 2401) \bmod 222 = 181$
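The procedure of Fig. 14.1 translates almost line for line into Python. The following is a minimal sketch, not a definitive implementation: the name `modular_exponentiation` is chosen here for illustration, and instead of receiving the bits $a_0, a_1, \ldots, a_m$ as input it reads them off the exponent one at a time, from right to left, with `e & 1` and `e >>= 1`.

```python
def modular_exponentiation(b, e, n):
    """Compute b**e mod n by repeated squaring, scanning the bits of e right to left."""
    x = 1
    power = b % n
    while e > 0:
        if e & 1:                      # current bit a_i is 1
            x = (x * power) % n
        power = (power * power) % n    # power now holds b**(2**(i+1)) mod n
        e >>= 1                        # move on to the next bit
    return x


print(modular_exponentiation(5, 143, 222))   # 89, the last entry for x in Table 14.9
```

Python's built-in three-argument `pow(5, 143, 222)` performs the same computation and can serve as a check.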
The next example provides an application of modular congruence in information
retrieval.
EXAMPLE 14.17 When searching a table of records stored in a computer, each record is assigned a memory
location or address in the computer’s memory. The record itself is often made up of fields
(this has nothing to do with ring structures). For instance, a college registrar keeps a record
on each student, with the record containing information on the student’s social security
number, name, and major, for a total of three fields.
In searching for a particular student’s record, we can use his or her social security number
as the key to the record because it uniquely identifies that record. As a result, we develop a
function from the set of keys to the set of addresses in the table.
If the college is small enough, we may find that the first four digits of the social security
number are enough for identification. We develop a hashing (or scattering) function h from
the set of keys (still social security numbers) to the set of addresses, determined now by the
first four digits of the key. For example, h(081-37-6495) identifies the record at the address associated with 0813. In this way we can store the table using at most 10,000 addresses.
All is well as long as h is one-to-one. Should a second student have social security number 081-39-0207, then h would no longer uniquely identify a student's record. When this happens, a collision is said to occur. Since increasing the size of the stored table often results in
more unused storage, we must balance the cost of this storage against the cost of handling
such collisions. Techniques for resolving collisions have been devised. They depend on the
data structures (such as vectors or linear linked lists) that are used to store the records.
Different kinds of hashing functions that have been developed include the following.
a) The division method: Here we restrict the number of addresses we want to use to a fixed integer n. For any key k (a positive integer), we define $h(k) = r$, where $r = k \bmod n$; that is, $r \equiv k \pmod n$ and $0 \le r < n$.
b) Often implemented is the folding method, where the key is split into parts and the parts are added together to give h(key). For example, h(081-37-6495) = 081 + 37 + 6495 = 6613 utilizes folding, and if we want only three-digit addresses, suppressing the first digit 6, we can have h(081-37-6495) = 613.
The importance of choosing a pertinent hashing function cannot be emphasized enough
as we try to improve efficiency in terms of greater speed and less unused storage.
Using the modular concept, we can develop a hashing function h, using the same keys
as above, where
$h(x_1 x_2 x_3\text{-}x_4 x_5\text{-}x_6 x_7 x_8 x_9) = y_1 y_2 y_3$,
with
$y_1 = (x_1 + x_2 + x_3) \bmod 5$
$y_2 = (x_4 + x_5) \bmod 3$
$y_3 = (x_6 + x_7 + x_8 + x_9) \bmod 7$.
Here, for example, h(081-37-6495) = 413.
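The hashing functions of this example can be sketched as follows. The function names (`h_first_four`, `h_fold`, `h_modular`) and the choice to pass keys as strings of the form "ddd-dd-dddd" are assumptions made here for illustration; the text does not prescribe any particular representation.

```python
def key_digits(ssn):
    """Return the nine digits x1, ..., x9 of a key written as 'ddd-dd-dddd'."""
    return [int(ch) for ch in ssn if ch.isdigit()]


def h_first_four(ssn):
    """Address given by the first four digits of the key."""
    return "".join(str(d) for d in key_digits(ssn)[:4])


def h_fold(ssn):
    """Folding: add the three parts, then keep only the last three digits."""
    a, b, c = ssn.split("-")
    return (int(a) + int(b) + int(c)) % 1000


def h_modular(ssn):
    """The modular hashing function h(x1 x2 x3 - x4 x5 - x6 x7 x8 x9) = y1 y2 y3."""
    x = key_digits(ssn)
    y1 = sum(x[0:3]) % 5
    y2 = sum(x[3:5]) % 3
    y3 = sum(x[5:9]) % 7
    return f"{y1}{y2}{y3}"


print(h_first_four("081-37-6495"), h_first_four("081-39-0207"))  # both 0813: a collision
print(h_fold("081-37-6495"))      # 613
print(h_modular("081-37-6495"))   # 413
```

Taking the sum modulo 1000 in `h_fold` is one way to express "suppressing the first digit" of 6613.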
Our last example for this section provides one more encounter with the Catalan numbers
(of Sections 1.5 and 10.5).
EXAMPLE 14.18 In how many ways can we select three elements a, b, c from {0, 1, 2, 3}, if repetitions are allowed and we want $a + b + c \equiv 0 \pmod 4$? The selections are listed in column 1 of Table 14.10. (Here each selection sums to 0, 4, or 8, and order is not relevant. For instance, a = 0, b = 1, c = 3 is considered the same selection as a = 1, b = 0, c = 3.) We see that there are five such selections and we recall that $5 = \frac{1}{4}\binom{6}{3}$, the third Catalan number.
Furthermore, by adding 1 to each entry of the selection 0, 0, 0 (in row 1 and column 1) we obtain the selection 1, 1, 1 (in row 1 and column 2). Likewise, the selection 2, 3, 1 (in row 2 and column 3) arises by adding 2 to each entry of the selection 0, 1, 3 (in row 2 and column 1) and reducing each sum modulo 4. Similar computations provide the other 13 selections in columns 2, 3, 4.
Table 14.10
Sum Is 0 (mod 4) | Sum Is 3 (mod 4) | Sum Is 2 (mod 4) | Sum Is 1 (mod 4)
0, 0, 0 | 1, 1, 1 | 2, 2, 2 | 3, 3, 3
0, 1, 3 | 1, 2, 0 | 2, 3, 1 | 3, 0, 2
0, 2, 2 | 1, 3, 3 | 2, 0, 0 | 3, 1, 1
1, 1, 2 | 2, 2, 3 | 3, 3, 0 | 0, 0, 1
2, 3, 3 | 3, 0, 0 | 0, 1, 1 | 1, 2, 2
To generalize this result, we count the number of selections $x_1, x_2, \ldots, x_n$ from $\{0, 1, 2, 3, \ldots, n\}$, where repetitions are allowed and $x_1 + x_2 + \cdots + x_n \equiv 0 \pmod{n+1}$. From Section 1.4 we know there are $\binom{(n+1)+n-1}{n} = \binom{2n}{n}$ ways to select n objects from n + 1 distinct objects, with repetitions allowed. Let $Sel_n$ denote the set of these $\binom{2n}{n}$ selections. (The 20 selections in Table 14.10 illustrate $Sel_3$.) Define the relation $\mathcal{R}$ on $Sel_n$ by $s_1 \mathcal{R} s_2$, if the sum of the entries in selection $s_1$ is the same, modulo n + 1, as the sum of the entries in selection $s_2$. Then $\mathcal{R}$ is an equivalence relation, so $Sel_n$ can be partitioned into n + 1 equivalence classes (one for each of the selection sums 0, 1, 2, ..., n, taken modulo n + 1). [Note: We get all n + 1 possible selection sums, for if $0 \le k_1 \le n$, $0 \le k_2 \le n$, and $nk_1 \equiv nk_2 \pmod{n+1}$, then $k_1 \equiv k_2 \pmod{n+1}$. This is due to Theorem 14.14 since $\gcd(n, n+1) = 1$. With $k_1, k_2 \in \{0, 1, \ldots, n\}$ it then follows that $k_1 = k_2$.]
For $0 \le s \le n$, let $Sel_n^s$ denote the selections that sum to s, modulo n + 1. When $1 \le s \le n$, write $s = nk$ (for $k = n^{-1}s$). Define $f: Sel_n^0 \to Sel_n^s$ as follows. For $\{x_1, x_2, \ldots, x_n\} \in Sel_n^0$, $f(\{x_1, x_2, \ldots, x_n\}) = \{x_1 + k, x_2 + k, \ldots, x_n + k\}$, where $x_i + k$ is reduced modulo n + 1. Now consider $\{y_1, y_2, \ldots, y_n\} \in Sel_n^s$ and define $g: Sel_n^s \to Sel_n^0$ by $g(\{y_1, y_2, \ldots, y_n\}) = \{y_1 + (n+1-k), y_2 + (n+1-k), \ldots, y_n + (n+1-k)\}$. One finds that $g = f^{-1}$ so $|Sel_n^0| = |Sel_n^1| = \cdots = |Sel_n^n|$. Consequently, each equivalence class has the same size, namely, $\frac{1}{n+1}\binom{2n}{n}$, the nth Catalan number.
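For small n the claim that every equivalence class has $\frac{1}{n+1}\binom{2n}{n}$ members can be checked by brute force. The sketch below is a numerical check under that assumption of small n, not a proof; the name `class_sizes` is hypothetical, and selections with repetition are enumerated with `itertools.combinations_with_replacement`.

```python
from itertools import combinations_with_replacement
from math import comb


def class_sizes(n):
    """Count the selections of size n from {0, ..., n} by their sum modulo n + 1."""
    sizes = [0] * (n + 1)
    for selection in combinations_with_replacement(range(n + 1), n):
        sizes[sum(selection) % (n + 1)] += 1
    return sizes


print(class_sizes(3))   # [5, 5, 5, 5], matching the four columns of Table 14.10

for n in range(1, 7):
    catalan = comb(2 * n, n) // (n + 1)
    # every one of the n + 1 classes should have the nth Catalan number of members
    assert class_sizes(n) == [catalan] * (n + 1)
```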
EXERCISES 14.3
1. a) Determine whether each of the following pairs of integers is congruent modulo 8.
i) 62, 118 ii) $-43$, $-237$ iii) $-90$, 230
b) Determine whether each of the following pairs of integers is congruent modulo 9.
i) 76, 243 ii) $-137$, 700 iii) $-56$, $-1199$
2. For each of the following determine the value(s) of the integer n > 1 for which the given congruence is true.
a) $28 \equiv 6 \pmod n$ b) $68 \equiv 37 \pmod n$
c) $301 \equiv 233 \pmod n$ d) $49 \equiv 2 \pmod n$
3. List four elements in each of the following equivalence classes.
a) [1] in $\mathbf{Z}_7$ b) [2] in $\mathbf{Z}_{11}$ c) [10] in $\mathbf{Z}_{17}$
4. Prove that if $a, b, c, n \in \mathbf{Z}$ with $a, n > 0$, and $b \equiv c \pmod n$, then $ab \equiv ac \pmod{an}$.
5. Let $a, b, m, n \in \mathbf{Z}$ with $m, n > 0$. Prove that if $a \equiv b \pmod n$ and $m \mid n$, then $a \equiv b \pmod m$.
6. Let $m, n \in \mathbf{Z}^+$ with $\gcd(m, n) = 1$ and let $a, b \in \mathbf{Z}$. Prove that $a \equiv b \pmod m$ and $a \equiv b \pmod n$ if and only if $a \equiv b \pmod{mn}$.
7. Provide a counterexample to show that the result in the preceding exercise is false if $\gcd(m, n) > 1$.
8. Prove that for all integers n exactly one of n, 2n − 1, and 2n + 1 is divisible by 3.
9. If $n \in \mathbf{Z}^+$ and $n \ge 2$, prove that
$\sum_{i=1}^{n-1} i \equiv \begin{cases} 0 \pmod n, & n \text{ odd} \\ \frac{n}{2} \pmod n, & n \text{ even.} \end{cases}$
10. Complete the proofs of Theorems 14.11 and 14.12.
11. Define relation $\mathcal{R}$ on $\mathbf{Z}^+$ by $a \mathcal{R} b$, if $\tau(a) = \tau(b)$, where $\tau(a)$ = the number of positive (integer) divisors of a. For example, $2 \mathcal{R} 3$ and $4 \mathcal{R} 25$ but $5 \not\mathcal{R}\, 9$.
a) Verify that $\mathcal{R}$ is an equivalence relation on $\mathbf{Z}^+$.
b) For the equivalence classes [a] and [b] induced by $\mathcal{R}$, define operations of addition and multiplication by [a] + [b] = [a + b] and [a][b] = [ab]. Are these operations well-defined [that is, does $a \mathcal{R} c$, $b \mathcal{R} d \Rightarrow (a + b) \mathcal{R} (c + d)$, $(ab) \mathcal{R} (cd)$]?
12. Find the multiplicative inverse of each element in $\mathbf{Z}_{11}$, $\mathbf{Z}_{13}$, and $\mathbf{Z}_{17}$.
13. Find $[a]^{-1}$ in $\mathbf{Z}_{1009}$ for (a) a = 17, (b) a = 100, and (c) a = 777.
14. a) Find all subrings of $\mathbf{Z}_{12}$, $\mathbf{Z}_{18}$, and $\mathbf{Z}_{24}$.
b) Construct the Hasse diagram for each of these collections of subrings, where the partial order arises from set inclusion. Compare these diagrams with those for the set of positive divisors of n (n = 12, 18, 24), where the partial order now comes from the divisibility relation.
c) Find the formula for the number of subrings in $\mathbf{Z}_n$, n > 1.
15. How many units and how many (proper) zero divisors are there in (a) $\mathbf{Z}_{17}$? (b) $\mathbf{Z}_{117}$? (c) $\mathbf{Z}_{1117}$?
16. Prove that in any list of n consecutive integers, one of the integers is divisible by n.
17. If three distinct integers are randomly selected from the set {1, 2, 3, ..., 1000}, what is the probability that their sum is divisible by 3?
18. a) For $c, d, n, m \in \mathbf{Z}$, with n > 1 and m > 0, prove that if $c \equiv d \pmod n$, then $mc \equiv md \pmod n$ and $c^m \equiv d^m \pmod n$.
b) If $x_n x_{n-1} \cdots x_1 x_0 = x_n \cdot 10^n + \cdots + x_1 \cdot 10 + x_0$ denotes an (n + 1)-digit integer, then prove that
$x_n x_{n-1} \cdots x_1 x_0 \equiv x_n + x_{n-1} + \cdots + x_1 + x_0 \pmod 9$.
19. a) Prove that for all $n \in \mathbf{N}$, $10^n \equiv (-1)^n \pmod{11}$.
b) Consider the result for mod 9 in part (b) of Exercise 18. State and prove a comparable result for mod 11.
20. For p a prime determine all elements $a \in \mathbf{Z}_p$ where $a^2 = a$.
21. For $a, b, n \in \mathbf{Z}^+$ and n > 1, prove that $a \equiv b \pmod n \Rightarrow \gcd(a, n) = \gcd(b, n)$.
22. a) Show that for all $[a] \in \mathbf{Z}_7$, if $[a] \ne [0]$, then $[a]^6 = [1]$.
b) Let $n \in \mathbf{Z}^+$ with $\gcd(n, 7) = 1$. Prove that $7 \mid (n^6 - 1)$.
23. Use the Caesar cipher to encrypt the plaintext: "All Gaul is divided into three parts."
24. The ciphertext FTQIMKIQIQDQ was encrypted using the encryption function $E: \mathbf{Z}_{26} \to \mathbf{Z}_{26}$ where $E(\theta) = (\theta + \kappa) \bmod 26$. Considering the frequencies of occurrence for the letters in the ciphertext, determine (a) the key $\kappa$ for this cipher shift; (b) the decryption function D; and (c) the original (plaintext) message.
25. Determine the total number of affine ciphers for an alphabet of (a) 24 letters; (b) 25 letters; (c) 27 letters; and (d) 30 letters.
26. The ciphertext
RWJWQTOOMYHKUXGOEMYP
was encrypted with an affine cipher. Given that the plaintext letters e, t are encrypted as the ciphertext letters W, X, respectively, determine (a) the encryption function E; (b) the decryption function D; and (c) the original (plaintext) message.
27. (a) How many distinct terms does the linear congruential generator with a = 5, c = 3, m = 19, and $x_0 = 10$ produce? (b) What is the sequence of pseudorandom numbers generated?
28. Given the modulus m and the two seeds $x_0$, $x_1$, with $0 \le x_0, x_1 < m$, a sequence of pseudorandom numbers can be generated recursively from $x_n = (x_{n-1} + x_{n-2}) \bmod m$, $n \ge 2$. This generator is called the Fibonacci generator.
Find the first ten pseudorandom numbers generated when m = 37 and $x_0 = 1$, $x_1 = 28$.
29. Let $x_{n+1} = (ax_n + c) \bmod m$, where $2 \le a < m$, $0 \le c < m$, $0 \le x_0 < m$, $0 \le x_{n+1} < m$, and $n \ge 0$. Prove that
$x_n = (a^n x_0 + c[(a^n - 1)/(a - 1)]) \bmod m$, $0 \le x_n < m$.
30. Consider the linear congruential generator with a = 7, c = 4, and m = 9. If $x_4 = 1$, determine the seed $x_0$.
31. Prove that the sum of the cubes of three consecutive integers is divisible by 9.
32. Determine the last digit in 3°.
33. For $m, n, r \in \mathbf{Z}^+$, let p(m, n, r) count the number of partitions of m into at most n (positive) summands each no larger than r. Evaluate $\sum_{k=0}^{n} p(k(n+1), n, n)$, $n \in \mathbf{Z}^+$.
34. Given a ring $(R, +, \cdot)$, an element $r \in R$ is called idempotent when $r^2 = r$. If $n \in \mathbf{Z}^+$ with $n \ge 2$, prove that if $k \in \mathbf{Z}_n$ and k is idempotent, then n − k + 1 is idempotent.
35. For the hashing function at the end of Example 14.17, find (a) h(123-04-2275); (b) a social security number n such that h(n) = 413, thus causing a collision with the number 081-37-6495 of the example.
36. Write a computer program (or develop an algorithm) that implements the hashing function of Exercise 35.
37. The parking lot for a local restaurant has 41 parking spaces, numbered consecutively from 0 to 40. Upon driving into this lot, a patron is assigned a parking space by the parking attendant who uses the hashing function $h(k) = k \bmod 41$, where k is the integer obtained from the last three digits on the patron's license plate. Further, to avoid a collision (where an occupied space might be assigned), when such a situation arises, the patron is directed to park in the next (consecutive) available space, where 0 is assumed to follow 40.
a) Suppose that eight automobiles arrive as the restaurant opens. If the last three digits in the license plates for these eight patrons (in their order of arrival) are
206, 807, 137, 444, 617, 330, 465, 905,
respectively, which spaces are assigned to the drivers of these eight automobiles by the parking attendant?
b) Following the arrival of the eight patrons in part (a), and before any of the eight could leave, a ninth patron arrives with a license plate where the last three digits are 00x. If this patron is assigned to space 5, what is (are) the possible value(s) of x?
38. Solve the following linear congruences for x.
a) $3x \equiv 7 \pmod{31}$ b) $5x \equiv 8 \pmod{37}$
c) $6x \equiv 97 \pmod{125}$
14.4
Ring Homomorphisms and Isomorphisms
In this final section we shall examine functions (between rings) that obey special properties
which depend on the closed binary operations in the rings.
EXAMPLE 14.19 Consider the rings $(\mathbf{Z}, +, \cdot)$ and $(\mathbf{Z}_6, +, \cdot)$, where addition and multiplication in $\mathbf{Z}_6$ are as defined in Section 14.3.
Define $f: \mathbf{Z} \to \mathbf{Z}_6$ by $f(x) = [x]$. For example, $f(1) = [1] = [7] = f(7)$ and $f(2) = f(8) = f(2 + 6k) = [2]$, for all $k \in \mathbf{Z}$. (So f is onto though not one-to-one.)
For $2, 3 \in \mathbf{Z}$, $f(2) = [2]$, $f(3) = [3]$ and we have $f(2 + 3) = f(5) = [5] = [2] + [3] = f(2) + f(3)$, and $f(2 \cdot 3) = f(6) = [0] = [2][3] = f(2) \cdot f(3)$.
In fact, for all $x, y \in \mathbf{Z}$,
$f(x + y) = [x + y] = [x] + [y] = f(x) + f(y)$,
where the sum $x + y$ is addition in $\mathbf{Z}$ and the sum $[x] + [y]$ is addition in $\mathbf{Z}_6$, and
$f(x \cdot y) = [xy] = [x][y] = f(x) \cdot f(y)$,
where the product $x \cdot y$ is multiplication in $\mathbf{Z}$ and the product $[x][y]$ is multiplication in $\mathbf{Z}_6$.
This example suggests the following definition.
Definition 14.8 Let $(R, +, \cdot)$ and $(S, \oplus, \odot)$ be rings. A function $f: R \to S$ is called a ring homomorphism if for all $a, b \in R$,
a) $f(a + b) = f(a) \oplus f(b)$, and
b) $f(a \cdot b) = f(a) \odot f(b)$.
When the function f is onto we say that S is a homomorphic image of R.
This function is said to preserve the ring operations for the following reasons: Consider $f(a + b) = f(a) \oplus f(b)$. Adding a, b in R first and then finding the image (under f) in S of this sum, we get the same result as when we first determine the images (under f) in S of a, b, and then add these images in S. (Hence we have the function operation and the
additive operations commuting with each other.) Similar remarks can be made about the
multiplicative operations in the rings.
For the rings $\mathbf{Z}_4$ and $\mathbf{Z}_8$, define the function $f: \mathbf{Z}_4 \to \mathbf{Z}_8$ by $f([a]) = [a]^2$ $(= [a^2])$. Then for all $[a], [b] \in \mathbf{Z}_4$, we have
$f([a][b]) = f([ab]) = [ab]^2 = ([a][b])^2 = [a]^2[b]^2 = f([a]) f([b])$,
where the product $[a][b]$ on the left is multiplication in $\mathbf{Z}_4$ and the product $f([a]) f([b])$ on the right is multiplication in $\mathbf{Z}_8$.
Consequently, this function f preserves the multiplicative operations in the rings. However, for $[1], [2] \in \mathbf{Z}_4$, we find that $f([1] + [2]) = f([3]) = [3]^2 = [1]$, while $f([1]) + f([2]) = [1]^2 + [2]^2 = [1] + [4] = [5]$ ($\ne [1]$ in $\mathbf{Z}_8$). So f does not preserve the additive operations in the rings; hence, f is not a ring homomorphism.
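Whether a function between small rings preserves an operation can be tested exhaustively. The sketch below is illustrative only: the helper `preserves` and the variable names are hypothetical, residue classes are represented by their least nonnegative representatives, and for the infinite ring $\mathbf{Z}$ only a finite sample of integers is checked (which illustrates, but does not prove, the homomorphism property of Example 14.19).

```python
def preserves(f, domain_op, codomain_op, elements):
    """True if f(a op b) == f(a) op' f(b) for every pair a, b drawn from elements."""
    return all(f(domain_op(a, b)) == codomain_op(f(a), f(b))
               for a in elements for b in elements)


# Example 14.19: f: Z -> Z_6, f(x) = [x], checked on a finite sample of Z
def to_z6(x):
    return x % 6

sample = range(-30, 31)
z6_add_ok = preserves(to_z6, lambda a, b: a + b, lambda u, v: (u + v) % 6, sample)
z6_mul_ok = preserves(to_z6, lambda a, b: a * b, lambda u, v: (u * v) % 6, sample)

# f: Z_4 -> Z_8, f([a]) = [a]^2
def square(a):
    return (a * a) % 8

z4 = range(4)
square_mul_ok = preserves(square, lambda a, b: (a * b) % 4, lambda u, v: (u * v) % 8, z4)
square_add_ok = preserves(square, lambda a, b: (a + b) % 4, lambda u, v: (u + v) % 8, z4)

print(z6_add_ok, z6_mul_ok)          # True True
print(square_mul_ok, square_add_ok)  # True False
```

The failing additive case found by the search includes the pair [1], [2] used in the text.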
The function $g: \mathbf{Z}_4 \to \mathbf{Z}_8$, defined by $g([a]) = 2[a]$, preserves the additive operations, but not the multiplicative operations, in the rings.
Definition 14.9 Let $f: (R, +, \cdot) \to (S, \oplus, \odot)$ be a ring homomorphism. If f is one-to-one and onto, then f is called a ring isomorphism and we say that R and S are isomorphic rings.
We can think of isomorphic rings arising when the “same” ring is dealt with in two dif-
ferent languages. The function f then provides a dictionary for unambiguously translating
from one language into the other.
The terms "homomorphism" and "isomorphism" come from the Greek, where morphe refers to shape or structure, homo means similar, and iso means identical or same. Hence homomorphic rings (that is, rings where one is a homomorphic image of the other) may be thought of as similar in structure, while isomorphic rings are (abstractly) replicas of the same structure.
In Definition 11.13 we defined the concept of graph isomorphism. There we called the undirected graphs $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$ isomorphic when we could find a function $f: V_1 \to V_2$ such that
a) f is one-to-one and onto, and
b) $\{a, b\} \in E_1$ if and only if $\{f(a), f(b)\} \in E_2$.
In light of our statements about ring isomorphisms, another way to think about condition (b) here is in terms of the function f preserving the structures of the undirected graphs $G_1$ and $G_2$. When $|V_1| = |V_2|$, it is not difficult to find a function $f: V_1 \to V_2$ that is one-to-one and onto. However, for a given set V of vertices, what determines the structure of an undirected graph $G = (V, E)$ is its set of edges (where the vertex adjacencies are defined). Therefore a one-to-one correspondence $f: V_1 \to V_2$ is a graph isomorphism when it preserves the structures of $G_1$ and $G_2$ by preserving these vertex adjacencies.
EXAMPLE 14.20 For the ring R in Example 14.5 and the ring $\mathbf{Z}_5$, the function $f: R \to \mathbf{Z}_5$ given by
$f(a) = [0]$, $f(b) = [1]$, $f(c) = [2]$, $f(d) = [3]$, $f(e) = [4]$
provides us with a ring isomorphism.
For example, $f(c + d) = f(a) = [0] = [2] + [3] = f(c) + f(d)$, while $f(be) = f(e) = [4] = [1][4] = f(b) f(e)$. (In the absence of other methods and theorems, there are 25 such equalities that must be verified for the preservation of each of the binary operations.)
Inasmuch as there are 5! = 120 one-to-one functions from R onto $\mathbf{Z}_5$, is there any assistance we can call upon in attempting to determine when one of these functions is an
isomorphism? Suggested by Example 14.20, the following theorem provides ways of at
least starting to determine when functions between rings can be homomorphisms and iso-
morphisms. [Parts (c) and (d) of this theorem rely on the results of Exercises 20 and 21 in
Section 14.2.]
THEOREM 14.15 If $f: (R, +, \cdot) \to (S, \oplus, \odot)$ is a ring homomorphism, then
a) $f(z_R) = z_S$, where $z_R$, $z_S$ are the zero elements of R, S, respectively;
b) $f(-a) = -f(a)$, for all $a \in R$;
c) $f(na) = nf(a)$, for all $a \in R$, $n \in \mathbf{Z}$;
d) $f(a^n) = [f(a)]^n$, for all $a \in R$, $n \in \mathbf{Z}^+$; and
e) if A is a subring of R, it follows that f(A) is a subring of S.
Proof:
a) $z_S \oplus f(z_R) = f(z_R) = f(z_R + z_R) = f(z_R) \oplus f(z_R)$. (Why?) So by the cancellation law of addition in S, we have $f(z_R) = z_S$.
b) $z_S = f(z_R) = f(a + (-a)) = f(a) \oplus f(-a)$. Since additive inverses in S are unique and $f(-a)$ is an additive inverse of $f(a)$, it follows that $f(-a) = -f(a)$.
c) If n = 0, then $f(na) = f(z_R) = z_S = nf(a)$. The result is also true for n = 1, so we assume the truth for n = k ($\ge 1$). Proceeding by mathematical induction, we examine the case where n = k + 1. By the results of Exercise 20 of Section 14.2, we get $f((k + 1)a) = f(ka + a) = f(ka) \oplus f(a) = kf(a) \oplus f(a)$ (Why?) $= (k + 1)(f(a))$ (Why?). (Note: There are three different kinds of addition here.)
When n > 0, $f(-na) = -nf(a)$. This follows from our prior proof by induction, part (b) of this proof, and part (b) of Theorem 14.1, because $f(-na) \oplus f(na) = f(n(-a)) \oplus f(na) = nf(-a) \oplus nf(a) = n[f(-a) \oplus f(a)] = n[-f(a) \oplus f(a)] = nz_S = z_S$. Hence the result follows for all $n \in \mathbf{Z}$.
d) We leave this result for the reader to prove.
e) Since $A \ne \emptyset$, $f(A) \ne \emptyset$. If $x, y \in f(A)$, then $x = f(a)$, $y = f(b)$ for some $a, b \in A$. Then $x \oplus y = f(a) \oplus f(b) = f(a + b)$, and $x \odot y = f(a) \odot f(b) = f(ab)$, with $a + b, ab \in A$ (Why?), so $x \oplus y, x \odot y \in f(A)$. Also, if $x \in f(A)$ then $x = f(a)$ for some $a \in A$. So we have $f(-a) = -f(a) = -x$, and because $-a \in A$ (Why?), we have $-x \in f(A)$. Therefore f(A) is a subring of S.
When the homomorphism is onto, we obtain the following theorem.
THEOREM 14.16 If $f: (R, +, \cdot) \to (S, \oplus, \odot)$ is a ring homomorphism from R onto S, where |S| > 1, then
a) if R has unity $u_R$, then $f(u_R)$ is the unity of S;
b) if R has unity $u_R$ and a is a unit in R, then f(a) is a unit in S and $f(a^{-1}) = [f(a)]^{-1}$;
c) if R is commutative, then S is commutative; and
d) if I is an ideal of R, then f(I) is an ideal of S.
Proof: We shall prove part (d) and leave the other parts to the reader. Since I is a subring of R, it follows that f(I) is a subring of S by part (e) of Theorem 14.15. To verify that f(I) is an ideal, let $x \in f(I)$ and $s \in S$. Then $x = f(a)$ and $s = f(r)$, for some $a \in I$, $r \in R$. So $s \odot x = f(r) \odot f(a) = f(ra)$, with $ra \in I$, and we have $s \odot x \in f(I)$. Similarly, $x \odot s \in f(I)$, so f(I) is an ideal of S.
These theorems reinforce the way in which homomorphisms and isomorphisms preserve
structure. But can we find any use for these functions, aside from using them to prove more
theorems? To help answer this, we start by considering the following example.
EXAMPLE 14.21 Extending the idea developed in Exercise 18 of Section 14.2, let R be the ring $\mathbf{Z}_2 \times \mathbf{Z}_3 \times \mathbf{Z}_5$. Then $|R| = |\mathbf{Z}_2| \cdot |\mathbf{Z}_3| \cdot |\mathbf{Z}_5| = 30$, and the operations of addition and multiplication are defined in R as follows:
For all $(a_1, a_2, a_3), (b_1, b_2, b_3) \in R$ where $a_1, b_1 \in \mathbf{Z}_2$, $a_2, b_2 \in \mathbf{Z}_3$, and $a_3, b_3 \in \mathbf{Z}_5$,
$(a_1, a_2, a_3) + (b_1, b_2, b_3) = (a_1 + b_1, a_2 + b_2, a_3 + b_3)$,
where the addition on the left is addition in R and the three additions on the right are addition in $\mathbf{Z}_2$, in $\mathbf{Z}_3$, and in $\mathbf{Z}_5$, respectively, and
$(a_1, a_2, a_3) \cdot (b_1, b_2, b_3) = (a_1 \cdot b_1, a_2 \cdot b_2, a_3 \cdot b_3)$,
where the multiplication on the left is multiplication in R and the three multiplications on the right are multiplication in $\mathbf{Z}_2$, in $\mathbf{Z}_3$, and in $\mathbf{Z}_5$, respectively.
Define the function $f: \mathbf{Z}_{30} \to R$ by $f(x) = (x_1, x_2, x_3)$, where
$x_1 = x \bmod 2$
$x_2 = x \bmod 3$
$x_3 = x \bmod 5$.
In other words, $x_1$, $x_2$, and $x_3$ are the remainders that result when x is divided by 2, 3, and 5, respectively.
The results in Table 14.11 show that f is a function that is one-to-one and onto.
Table 14.11
x (in Z_30) | f(x) (in R) | x (in Z_30) | f(x) (in R) | x (in Z_30) | f(x) (in R)
0 | (0, 0, 0) | 10 | (0, 1, 0) | 20 | (0, 2, 0)
1 | (1, 1, 1) | 11 | (1, 2, 1) | 21 | (1, 0, 1)
2 | (0, 2, 2) | 12 | (0, 0, 2) | 22 | (0, 1, 2)
3 | (1, 0, 3) | 13 | (1, 1, 3) | 23 | (1, 2, 3)
4 | (0, 1, 4) | 14 | (0, 2, 4) | 24 | (0, 0, 4)
5 | (1, 2, 0) | 15 | (1, 0, 0) | 25 | (1, 1, 0)
6 | (0, 0, 1) | 16 | (0, 1, 1) | 26 | (0, 2, 1)
7 | (1, 1, 2) | 17 | (1, 2, 2) | 27 | (1, 0, 2)
8 | (0, 2, 3) | 18 | (0, 0, 3) | 28 | (0, 1, 3)
9 | (1, 0, 4) | 19 | (1, 1, 4) | 29 | (1, 2, 4)
To verify that f is an isomorphism, let $x, y \in \mathbf{Z}_{30}$. Then
$f(x + y) = ((x + y) \bmod 2, (x + y) \bmod 3, (x + y) \bmod 5)$
$= (x \bmod 2, x \bmod 3, x \bmod 5) + (y \bmod 2, y \bmod 3, y \bmod 5)$
$= f(x) + f(y)$,
and
$f(xy) = (xy \bmod 2, xy \bmod 3, xy \bmod 5)$
$= (x \bmod 2, x \bmod 3, x \bmod 5) \cdot (y \bmod 2, y \bmod 3, y \bmod 5)$
$= f(x) f(y)$,
so f is an isomorphism.
In examining Table 14.11 we find, for example, that
1) $f(0) = (0, 0, 0)$, where 0 is the zero element of $\mathbf{Z}_{30}$ and (0, 0, 0) is the zero element of $\mathbf{Z}_2 \times \mathbf{Z}_3 \times \mathbf{Z}_5$.
2) $f(2 + 4) = f(6) = (0, 0, 1) = (0, 2, 2) + (0, 1, 4) = f(2) + f(4)$.
3) The element 21 is the additive inverse of 9 in $\mathbf{Z}_{30}$, whereas $f(21) = (1, 0, 1)$ is the additive inverse of $(1, 0, 4) = f(9)$ in $\mathbf{Z}_2 \times \mathbf{Z}_3 \times \mathbf{Z}_5$.
4) {0, 5, 10, 15, 20, 25} is a subring of $\mathbf{Z}_{30}$ with {(0, 0, 0) (= f(0)), (1, 2, 0) (= f(5)), (0, 1, 0) (= f(10)), (1, 0, 0) (= f(15)), (0, 2, 0) (= f(20)), (1, 1, 0) (= f(25))} the corresponding subring in $\mathbf{Z}_2 \times \mathbf{Z}_3 \times \mathbf{Z}_5$.
But what else can we do with this isomorphism between $\mathbf{Z}_{30}$ and $\mathbf{Z}_2 \times \mathbf{Z}_3 \times \mathbf{Z}_5$? Suppose, for example, that we need to calculate $28 \cdot 17$ in $\mathbf{Z}_{30}$. We can transfer the problem to $\mathbf{Z}_2 \times \mathbf{Z}_3 \times \mathbf{Z}_5$ and compute $f(28) \cdot f(17) = (0, 1, 3) \cdot (1, 2, 2)$, where the moduli 2, 3, and 5 are smaller than 30 and easier to work with. Since $(0, 1, 3) \cdot (1, 2, 2) = (0 \cdot 1, 1 \cdot 2, 3 \cdot 2) = (0, 2, 1)$ and $f^{-1}(0, 2, 1) = 26$, it follows that $28 \cdot 17$ (in $\mathbf{Z}_{30}$) is 26.
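The transfer of a computation from $\mathbf{Z}_{30}$ to $\mathbf{Z}_2 \times \mathbf{Z}_3 \times \mathbf{Z}_5$ and back can be sketched in Python. The names `MODULI`, `f`, `F_INVERSE`, and `multiply_via_components` are hypothetical, and the dictionary `F_INVERSE` plays the role of reading Table 14.11 from right to left.

```python
MODULI = (2, 3, 5)


def f(x):
    """The isomorphism f: Z_30 -> Z_2 x Z_3 x Z_5 of Example 14.21."""
    return tuple(x % m for m in MODULI)


# Table 14.11 read backward: each triple maps to its unique preimage in Z_30
F_INVERSE = {f(x): x for x in range(30)}


def multiply_via_components(x, y):
    """Multiply in Z_30 by multiplying componentwise in Z_2 x Z_3 x Z_5."""
    product = tuple((a * b) % m for a, b, m in zip(f(x), f(y), MODULI))
    return F_INVERSE[product]


print(f(28), f(17))                     # (0, 1, 3) (1, 2, 2)
print(multiply_via_components(28, 17))  # 26
print(len(F_INVERSE))                   # 30, so f is one-to-one and onto
```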
In Example 14.21 we see that if we are given an element $(x_1, x_2, x_3)$ in $\mathbf{Z}_2 \times \mathbf{Z}_3 \times \mathbf{Z}_5$, then we can use Table 14.11 to find the unique element x in $\mathbf{Z}_{30}$ so that $f(x) = (x_1, x_2, x_3)$. But what would we do if we did not have such a table, especially if we found ourselves working with larger rings, such as $\mathbf{Z}_{32736}$ and $\mathbf{Z}_{31} \times \mathbf{Z}_{32} \times \mathbf{Z}_{33}$, and the isomorphism $g: \mathbf{Z}_{32736} \to \mathbf{Z}_{31} \times \mathbf{Z}_{32} \times \mathbf{Z}_{33}$ where $g(x) = (x \bmod 31, x \bmod 32, x \bmod 33)$ for $x \in \mathbf{Z}_{32736}$? The following result provides a technique for determining the unique preimage for a given element of the codomain for such an isomorphism g.
THEOREM 14.17 The Chinese Remainder Theorem. Let $m_1, m_2, \ldots, m_k \in \mathbf{Z}^+ - \{1\}$ with $k \ge 2$, and with $\gcd(m_i, m_j) = 1$ for all $1 \le i < j \le k$. Then the system of k congruences
$x \equiv a_1 \pmod{m_1}$
$x \equiv a_2 \pmod{m_2}$
$\vdots$
$x \equiv a_k \pmod{m_k}$
has a simultaneous solution. Further, any two such solutions of the system are congruent modulo $m_1 m_2 \cdots m_k$.
Proof: We start by showing how to construct a simultaneous solution of the system of k congruences.
Let $m = m_1 m_2 \cdots m_k$ and, for $1 \le j \le k$, let $M_j = m/m_j$. [So, for example, $M_1 = m_2 m_3 m_4 \cdots m_k$ and $M_2 = m_1 m_3 m_4 \cdots m_k$.] We find that for all $1 \le j \le k$, $\gcd(m_j, M_j) = 1$. If not, then for some (fixed) j, with $1 \le j \le k$, there exists a prime p such that $p \mid m_j$ and $p \mid M_j$. But from Lemma 4.3 it follows that if $p \mid M_j$ then $p \mid m_i$ for some $1 \le i \le k$, where $i \ne j$. Consequently, we find that $p \mid m_i$ and $p \mid m_j$ for $i \ne j$, and this contradicts $\gcd(m_i, m_j) = 1$.
For each $1 \le j \le k$, $\gcd(m_j, M_j) = 1$. Consequently, from Theorem 14.14 we know that $M_j$ is a unit in $\mathbf{Z}_{m_j}$. So there exists $x_j \in \mathbf{Z}_{m_j}$ such that $M_j x_j \equiv 1 \pmod{m_j}$. Now consider the sum
$x = a_1 M_1 x_1 + a_2 M_2 x_2 + \cdots + a_k M_k x_k$.
We claim that x is a simultaneous solution of the system of k congruences. Note that for $1 \le j \le k$ and $1 \le i \le k$, if $i \ne j$ then $M_i \equiv 0 \pmod{m_j}$ because $m_j \mid M_i$. Hence $M_i x_i \equiv 0 \pmod{m_j}$. Since $M_j x_j \equiv 1 \pmod{m_j}$ we find that
$x \equiv a_j M_j x_j \equiv a_j \pmod{m_j}$,
for each $1 \le j \le k$.
Now suppose that x, y are both simultaneous solutions of the system of k congruences. Then $x \equiv y \pmod{m_j}$ for all $1 \le j \le k$. Consider the prime factorization of $m = m_1 m_2 \cdots m_k$. Let p be a prime such that $p^t \mid m$ but $p^{t+1} \nmid m$, for some $t \in \mathbf{Z}^+$. Since $\gcd(m_i, m_j) = 1$ for all $1 \le i < j \le k$, it follows that $p^t \mid m_j$ for one (and only one) modulus $m_j$. Consequently, we see that $p^t \mid (x - y)$, and so it follows from the Fundamental Theorem of Arithmetic that $m \mid (x - y)$, or $x \equiv y \pmod m$.
Now let us see how one can apply the Chinese Remainder Theorem.
EXAMPLE 14.22 In Marjorie's fourth-grade arithmetic class, three students (namely, Megan, Avery, and Elizabeth) enjoy doing long-division problems (without a calculator). So Marjorie selects a positive integer n and asks for the remainder upon division by three different divisors. Upon dividing n by 31 Megan learns that the remainder is 14. Avery divides n by 32 and finds the remainder is 16. Meanwhile, Elizabeth obtains the remainder of 18 when she divides n by 33. What is the smallest value of n that Marjorie could have selected?
Here we seek a simultaneous solution for the three congruences
$x \equiv 14 \pmod{31}$, $x \equiv 16 \pmod{32}$, $x \equiv 18 \pmod{33}$.
So $a_1 = 14$, $a_2 = 16$, $a_3 = 18$, $m_1 = 31$, $m_2 = 32$, $m_3 = 33$, and $m = m_1 m_2 m_3 = 32736$. Further, $M_1 = m/m_1 = 1056$, $M_2 = m/m_2 = 1023$, and $M_3 = m/m_3 = 992$. Using the Euclidean algorithm (when necessary), as in Example 14.13, we learn that
$[x_1] = [M_1]^{-1} = [1056]^{-1} = [34(31) + 2]^{-1} = [2]^{-1} = [16]$ in $\mathbf{Z}_{m_1} = \mathbf{Z}_{31}$,
$[x_2] = [M_2]^{-1} = [1023]^{-1} = [31(32) + 31]^{-1} = [31]^{-1} = [31]$ in $\mathbf{Z}_{m_2} = \mathbf{Z}_{32}$, and
$[x_3] = [M_3]^{-1} = [992]^{-1} = [30(33) + 2]^{-1} = [2]^{-1} = [17]$ in $\mathbf{Z}_{m_3} = \mathbf{Z}_{33}$.
Hence,
$x \equiv (14)(1056)(16) + (16)(1023)(31) + (18)(992)(17) \pmod{32736}$
$\equiv 1047504 \pmod{32736}$
$\equiv 31(32736) + 32688 \pmod{32736}$
$\equiv 32688 \pmod{32736}$.
So the (smallest) positive integer n that Marjorie could have selected is 32688.
(As a check we find that 32688 = 1054(31) + 14 = 1021(32) + 16 = 990(33) + 18, so x satisfies the given system of three congruences and is the smallest positive integer that does so.)
Now if we look back at the isomorphism $g: \mathbf{Z}_{32736} \to \mathbf{Z}_{31} \times \mathbf{Z}_{32} \times \mathbf{Z}_{33}$ (that we mentioned prior to stating the Chinese Remainder Theorem) we see that for the codomain element (14, 16, 18) in $\mathbf{Z}_{31} \times \mathbf{Z}_{32} \times \mathbf{Z}_{33}$, the element 32688 in the domain $\mathbf{Z}_{32736}$ is the (unique) preimage. That is, $g(32688) = (14, 16, 18)$ and for any other integer y, if $g(y) = (14, 16, 18)$, then $y \equiv 32688 \pmod{32736}$, so 32688 is the only solution in {0, 1, 2, 3, ..., 32735}.
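The construction used in the proof of the Chinese Remainder Theorem, and carried out by hand in this example, can be sketched as a short Python function. The name `crt` is hypothetical, the moduli are assumed to be pairwise relatively prime, and the inverses $x_j$ are obtained with `pow(M_j, -1, m_j)` (Python 3.8 or later) rather than by the Euclidean algorithm.

```python
from math import prod


def crt(residues, moduli):
    """Return the solution x in {0, ..., m-1} of x = a_j (mod m_j) for all j.

    Assumes the moduli are pairwise relatively prime.
    """
    m = prod(moduli)
    x = 0
    for a_j, m_j in zip(residues, moduli):
        M_j = m // m_j                 # product of all the other moduli
        x_j = pow(M_j, -1, m_j)        # inverse of M_j in Z_{m_j}
        x += a_j * M_j * x_j
    return x % m


n = crt([14, 16, 18], [31, 32, 33])
print(n)                       # 32688
print(n % 31, n % 32, n % 33)  # 14 16 18
```

Applied to the moduli 2, 3, 5, the same function recovers preimages under the isomorphism f of Example 14.21; for instance, `crt([0, 2, 1], [2, 3, 5])` gives 26.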
The isomorphisms f (of Example 14.21) and g (of Example 14.22) are special cases ofa
more general result’ that we shall now state. Ifn = njn2---ny, wheren; > 1foralll <i <
k and ged(n;, nj) = 1 for all 1 <i <j <k, then the rings Z, and Z,, X Zn, X+** X Zy,
are isomorphic. In particular, we know from the Fundamental Theorem of Arithmetic that
foreachn € Z* — {1}, we can factor nas pj’ p;’- > + p;', where pi, 2, -.-. p; aret distinct
primes, f > 1, ande), e2,..., e, € Z*. It then follows that the rings Z, and Z,,, X Zm, X
-++ X Zin, are isomorphic form, = py’, m2 = py, ..., mM, = py.
As a result of this isomorphism, arithmetic involving large integers (that exceed the word size of a given computer) can be performed using the smaller different moduli. Further, the computation for these smaller moduli can be carried out in parallel, thus reducing computation time. [For more on the Chinese Remainder Theorem in conjunction with applications of residue arithmetic in computers, we direct the interested reader to pages 146-149 of the text by K. H. Rosen [12], pages 344-359 of the text by J. P. Tremblay and R. Manohar [14], as well as the text by D. E. Knuth [8].]
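To make the idea of residue arithmetic concrete, here is a small Python sketch. It is only an illustration under stated assumptions: the moduli 31, 32, 33 are borrowed from the preceding example, and the helper names `to_residues`, `add`, `mul`, and `from_residues` are hypothetical, not taken from the text or from any library. Each integer is replaced by its tuple of residues, the arithmetic is done componentwise (each component could be handled by a separate processor), and the Chinese Remainder Theorem recovers the answer, provided the true result is smaller than the product of the moduli.

```python
from math import prod

MODULI = (31, 32, 33)        # pairwise relatively prime; product is 32736

def to_residues(x):
    """Image of x under the isomorphism Z_32736 -> Z_31 x Z_32 x Z_33."""
    return tuple(x % m for m in MODULI)

def add(r, s):
    """Componentwise addition of residue tuples."""
    return tuple((a + b) % m for a, b, m in zip(r, s, MODULI))

def mul(r, s):
    """Componentwise multiplication of residue tuples."""
    return tuple((a * b) % m for a, b, m in zip(r, s, MODULI))

def from_residues(r):
    """Preimage in {0, ..., 32735}, computed as in the Chinese Remainder Theorem."""
    n = prod(MODULI)
    return sum(a * (n // m) * pow(n // m, -1, m) for a, m in zip(r, MODULI)) % n

a, b = 123, 217
total = from_residues(add(to_residues(a), to_residues(b)))
product_ab = from_residues(mul(to_residues(a), to_residues(b)))
print(total, product_ab)
```

The recovered values agree with ordinary integer arithmetic here because both a + b and ab are below 32736; larger results would only be determined modulo 32736.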
EXERCISES 14.4
1. If $R$ is the ring of Example 14.6, construct an isomorphism $f\colon R \to \mathbf{Z}_6$.
2. Complete the proofs of Theorems 14.15 and 14.16.
3. If $R$, $S$, and $T$ are rings and $f\colon R \to S$, $g\colon S \to T$ are ring homomorphisms, prove that the composite function $g \circ f\colon R \to T$ is a ring homomorphism.
4. If $S = \{\,[\ldots] \mid a \in \mathbf{R}\,\}$, then $S$ is a ring under matrix addition and multiplication. Prove that $\mathbf{R}$ is isomorphic to $S$.
5. a) Let $(R, +, \cdot)$ and $(S, \oplus, \odot)$ be rings with zero elements $z_R$ and $z_S$, respectively. If $f\colon R \to S$ is a ring homomorphism, let $K = \{a \in R \mid f(a) = z_S\}$. Prove that $K$ is an ideal of $R$. ($K$ is called the kernel of the homomorphism $f$.)
   b) Find the kernel of the homomorphism in Example 14.19.
   c) Let $f$, $(R, +, \cdot)$, and $(S, \oplus, \odot)$ be as in part (a). Prove that $f$ is one-to-one if and only if the kernel of $f$ is $\{z_R\}$.
6. Use the information in Table 14.11 to compute each of the following in $\mathbf{Z}_{30}$.
   a) (13)(23) + 18
   b) (11)(21) - 20
   c) (13 + 19)(27)
   d) (13)(29) + (24)(8)
7. a) Construct a table (as in Example 14.21) for the isomorphism $f\colon \mathbf{Z}_{20} \to \mathbf{Z}_4 \times \mathbf{Z}_5$.
   b) Use the table from part (a) to compute the following in $\mathbf{Z}_{20}$.
      i) (17)(19) + (12)(14)
      ii) (18)(11) - (9)(15)
8. Let $n, r, s \in \mathbf{Z}^+$ with $n, r, s \ge 2$, $n = rs$, and $\gcd(r, s) = 1$. If $f\colon \mathbf{Z}_n \to \mathbf{Z}_r \times \mathbf{Z}_s$ is a ring isomorphism with $f(a) = (1, 0)$ and $f(b) = (0, 1)$, prove that if $(m, t) \in \mathbf{Z}_r \times \mathbf{Z}_s$, then $f^{-1}(m, t) = ma + tb \pmod{n}$.
9. a) How many units are there in the ring $\mathbf{Z}_8$?
   b) How many units are there in the ring $\mathbf{Z}_2 \times \mathbf{Z}_2 \times \mathbf{Z}_2$?
   c) Are $\mathbf{Z}_8$ and $\mathbf{Z}_2 \times \mathbf{Z}_2 \times \mathbf{Z}_2$ isomorphic rings?
10. a) How many units are there in $\mathbf{Z}_{15}$? How many in $\mathbf{Z}_3 \times \mathbf{Z}_5$?
    b) Are $\mathbf{Z}_{15}$ and $\mathbf{Z}_3 \times \mathbf{Z}_5$ isomorphic?
11. Are $\mathbf{Z}_4$ and the ring in Example 14.4 isomorphic?
12. If $f\colon R \to S$ is a ring homomorphism and $J$ is an ideal of $S$, prove that $f^{-1}(J) = \{a \in R \mid f(a) \in J\}$ is an ideal of $R$.
13. Find a simultaneous solution for the system of two congruences:
    $x \equiv 5 \pmod{8}$
    $x \equiv 73 \pmod{81}$.
14. A band of 17 pirates captures a treasure chest full of (identical) gold coins. When the coins are divided up into equal numbers, three coins remain. One pirate accuses the distributor of miscounting and kills him in a duel. As a result, the second time the coins are distributed, in equal numbers, among the 16 surviving pirates, there are 10 coins remaining. An argument erupts and leads to gun play, resulting in the demise of another pirate. Now when the coins are divided up, in 15 equal piles, there are no remaining coins. What is the smallest number of coins that could have been in the chest?
15. Find a simultaneous solution for the system of four congruences:
    $x \equiv 1 \pmod{2}$
    $x \equiv 2 \pmod{3}$
    $x \equiv 3 \pmod{5}$
    $x \equiv 5 \pmod{7}$.
†In some textbooks this result is referred to as the Chinese Remainder Theorem.
14.5 Summary and Historical Review
Emphasizing structure induced by two closed binary operations, this chapter has introduced
us to the mathematical system called a ring. Throughout the development of mathematics,
the ring of integers has played a key role. In the branch of mathematics called number
theory, we examine the basic properties of $(\mathbf{Z}, +, \cdot)$, as well as the finite rings $(\mathbf{Z}_n, +, \cdot)$.
The matrix rings provide familiar examples of noncommutative rings.
Pierre de Fermat (1601-1665) Sophie Germain (1776-1831)
This chapter contains the development of an abstract theory. On the basis of the definition
of a ring, we established principles of elementary algebra that we have been using since
our early encounters with arithmetic, signed numbers, and the manipulation of unknowns.
The reader may have found some of the proofs tedious, as we justified all the steps in the
derivations. Faced with the challenge of trying to prove a result in abstract mathematics,
one should follow the advice given by the Roman rhetorician Marcus Fabius Quintilianus
(first century A.D.), when he said, “One should not aim at being possible to understand (or
follow), but at being impossible to be misunderstood.”
A famous problem in number theory, known as Fermat’s Last Theorem, claims that
the equation $x^n + y^n = z^n$, $n \in \mathbf{Z}^+$, $n > 1$, has no solutions in $\mathbf{Z}^+$ when $n > 2$. In 1637
the French mathematician Pierre de Fermat (1601—1665) wrote that he had proved this
result but that the proof was too long to be included in the margin of his manuscript.
Many renowned mathematicians of the eighteenth and nineteenth centuries tried to prove
this result— among them Leonhard Euler (1707-1783), Peter Gustav Lejeune Dirichlet
(1805-1859), Carl Friedrich Gauss (1777-1855), Sophie Germain (1776-1831), Adrien-
Marie Legendre (1752-1833), Niels Henrik Abel (1802-1829), Gabriel Lamé (1795-1870),
and Leopold Kronecker (1823-1891). Although unsuccessful, attempts to resolve Fermat’s
claim did result in new mathematical ideas and theories. The twentieth century also produced
scholars who expended tremendous efforts on this problem. One such scholar was born in
Cambridge, England, in 1953. There, at the age of 10, he went to the public library in his
town and looked into a book on mathematics. As he read about Fermat’s Last Theorem,
it seemed so simple — and he wanted to prove it. In the 1970s Andrew Wiles went to
Cambridge University, and after he finished his degree, he became a research student there,
working in number theory —in an area called Iwasawa theory. For at this time Fermat's
Last Theorem was not in fashion. When Wiles completed his doctorate, he moved to the
United States, to a position at Princeton University. In the 1980s his enthusiasm for his
childhood dream was rekindled and he spent close to seven years working alone— locked
up in his attic office. He finally confided in his colleague Nick Katz — in January 1993. Then
in June 1993 Professor Wiles returned to Cambridge to deliver a series of three lectures
at a number-theory conference. The last lecture ended in grand applause, accompanied by
flashing cameras and reporters’ questions. It appeared that he had solved Fermat’s Last
Theorem. Unfortunately, when his 200-page write-up was peer-reviewed, by experts such
as Nick Katz, problems started to arise, and a hole in the proof caused everything to collapse
like a house of cards. The fall of 1993 found Wiles back at Princeton — now crestfallen,
angry, and humiliated. But then, after renewed effort, on September 19, 1994, he took one
last look at his proposed proof. The next morning he wrote up a new proof, as everything
fell into place. This time no one could find any flaws. The May 1995 issue of the journal
Annals of Mathematics contains the original Cambridge paper by Andrew Wiles and the
correction by Wiles and his friend and former student Richard Taylor. At last Fermat’s Last
Theorem was laid to rest. (Although Wiles gets much of the praise, other mathematicians
deserve accolades as well — among them, Kenneth Ribet, Barry Mazur, Goro Shimura,
Yutaka Taniyama, Gerhard Frey, Matthias Flach, and Richard Taylor.) For more on the
history and development of the proof of this famous theorem, the reader is directed to the
very readable account given by A. D. Aczel [1].
Andrew John Wiles (1953- )
AP/Wide World Photos
In trying to prove Fermat’s Last Theorem, the German mathematician Ernst Kummer
(1810-1893) developed the foundations for the concept of the ideal. This concept was later
formulated, named, and utilized by his countryman Richard Dedekind (1831—1916) in his
research on what are now called Dedekind domains. Use of the term “ring,” however, seems
to be attributable to the German mathematician David Hilbert (1862-1943).
Ring homomorphisms and their interplay with ideals were extensively developed by
the German mathematician Emmy Noether (1882-1935). This great genius received little
remuneration, financial or otherwise, from the governing bodies of her native land because
of the sexual bias that was prevalent in the universities at that time. Emmy Noether’s talents
were nonetheless recognized by her colleagues, and she was eulogized in the New York
Times on May 3, 1935, by Albert Einstein (1879-1955), who acknowledged the influence
and importance of her work for the development of relativity theory. In addition to enduring
sexual bias, as a Jew she was forced to flee her homeland in 1933, when the Nazis came to
power. She spent the last two years of her life guiding young mathematicians in the United
States. For more on the life of this fascinating person, examine the biography by A. Dick
[4] and the article by C. Kimberling [7].
The special rings called fields arise in the rational, real, and complex number systems.
But we also saw some interesting finite fields. These structures will be examined again
in Chapter 17 in connection with combinatorial designs. The field theory developed by
the French genius Evariste Galois (1811-1832) answered questions about the solutions
of polynomial equations of degree > 4. These questions had baffled mathematicians for
centuries, and his ideas, now known as Galois theory, still comprise one of the most ele-
gant mathematical theories ever developed. More on Galois theory appears in the text by
O. Zariski and P. Samuel [16].
Emmy Noether (1882-1935)
For supplemental reading on ring theory at the introductory level, the interested reader
should examine Chapters 12-18 of J. A. Gallian [5], Chapter 6 of V. H. Larney [9], and
Chapters 6, 7, and 12 of N. H. McCoy and T. R. Berger [10]. A somewhat more advanced
coverage can be found in Chapter 4 of the text by E. A. Walker [15].
The development of modular congruence, along with many related ideas, we owe pri-
marily to Carl Friedrich Gauss. Problems involving systems of congruences date back to the
late first century where they appear in the work of the Greek mathematician Nicomachus
of Gerasa. Systems of two congruences can also be found in the writings of the seventh-century
mathematician Brahmagupta (born in 598 in northwestern India). However, it was
not until 1247 that we find the publication of a general method for solving systems of linear
congruences. In his Shushu jiuzhang (Mathematical Treatise in Nine Sections), the method
now called the Chinese Remainder Theorem is presented by the Chinese mathematician
Qin Jiushao (c. 1202-1261). Born in the province of Sichuan during the time of Genghis
Khan, this remarkable mathematical talent was also an accomplished architect, musician,
and poet, as well as being quite the sportsman in archery, fencing, and horsemanship.
More on the solution of congruences and the Chinese Remainder Theorem can be found in
the texts by I. Niven, H. S. Zuckerman, and H. L. Montgomery [11] and K. H. Rosen [12].
As mentioned earlier (in the footnote in Example 14.16), more on the history, develop-
ment, and applications of cryptology can be found in the texts by T. H. Barr [3], P. Garrett
[6], and W. Trappe and L. C. Washington [13].
Finally, the topic of hashing, or scattering, can be further investigated in Chapter 2 of
J. P. Tremblay and R. Manohar [14]. Chapter 4 of A. V. Aho, J. E. Hopcroft, and J. D.
Ullman [2] includes a discussion on the efficiency of hashing functions and a probabilistic
investigation of the collision problem that arises for these functions.
REFERENCES
1. Aczel, Amir D. Fermat's Last Theorem: Unlocking the Secret of an Ancient Mathematical Problem. New York: Four Walls Eight Windows, 1996.
2. Aho, Alfred V., Hopcroft, John E., and Ullman, Jeffrey D. Data Structures and Algorithms. Reading, Mass.: Addison-Wesley, 1983.
3. Barr, Thomas H. Invitation to Cryptology. Upper Saddle River, N.J.: Prentice-Hall, 2002.
4. Dick, Auguste. Emmy Noether (1882-1935), trans. Heidi Blocher. Boston: Birkhäuser-Boston, 1981.
5. Gallian, Joseph A. Contemporary Abstract Algebra, 5th ed. Boston: Houghton Mifflin, 2002.
6. Garrett, Paul. Making, Breaking Codes: An Introduction to Cryptology. Upper Saddle River, N.J.: Prentice-Hall, 2001.
7. Kimberling, Clark. "Emmy Noether, Greatest Woman Mathematician." Mathematics Teacher (March 1982): pp. 246-249.
8. Knuth, Donald Ervin. The Art of Computer Programming, 3rd ed., Volume 2, Semi-Numerical Algorithms. Reading, Mass.: Addison-Wesley, 1997.
9. Larney, Violet Hachmeister. Abstract Algebra: A First Course. Boston: Prindle, Weber & Schmidt, 1975.
10. McCoy, Neal H., and Berger, Thomas R. Algebra: Groups, Rings and Other Topics. Boston: Allyn and Bacon, 1977.
11. Niven, Ivan, Zuckerman, Herbert S., and Montgomery, Hugh L. An Introduction to the Theory of Numbers, 5th ed. New York: Wiley, 1991.
12. Rosen, Kenneth H. Elementary Number Theory, 4th ed. Reading, Mass.: Addison-Wesley, 1999.
13. Trappe, Wade, and Washington, Lawrence C. Introduction to Cryptography with Coding Theory. Upper Saddle River, N.J.: Prentice-Hall, 2002.
14. Tremblay, Jean-Paul, and Manohar, R. Discrete Mathematical Structures with Applications to Computer Science. New York: McGraw-Hill, 1975.
15. Walker, Elbert A. Introduction to Abstract Algebra. New York: Random House/Birkhäuser, 1987.
16. Zariski, Oscar, and Samuel, Pierre. Commutative Algebra, Vol. 1. Princeton, N.J.: Van Nostrand, 1958.
SUPPLEMENTARY EXERCISES
1. Determine whether each of the following statements is true or false. For each false statement give a counterexample.
   a) If $(R, +, \cdot)$ is a ring, and $\emptyset \ne S \subseteq R$ with $S$ closed under $+$ and $\cdot$, then $S$ is a subring of $R$.
   b) If $(R, +, \cdot)$ is a ring with unity, and $S$ is a subring of $R$, then $S$ has a unity.
   c) If $(R, +, \cdot)$ is a ring with unity $u_R$, and $S$ is a subring of $R$ with unity $u_S$, then $u_R = u_S$.
   d) Every field is an integral domain.
   e) Every subring of a field is a field.
   f) A field can have only two subrings.
   g) Every finite field has a prime number of elements.
   h) The field $(\mathbf{Q}, +, \cdot)$ has an infinite number of subrings.
2. Prove that a ring $R$ is commutative if and only if $(a + b)^2 = a^2 + 2ab + b^2$, for all $a, b \in R$.
3. A ring $R$ is called Boolean if $a^2 = a$ for all $a \in R$. If $R$ is Boolean, prove that (a) $a + a = 2a = z$, for all $a \in R$; and (b) $R$ is commutative.
4. With $\mathbf{C}$ the field of complex numbers and $S$ the ring of $2 \times 2$ real matrices of the form $\begin{bmatrix} a & b \\ -b & a \end{bmatrix}$, define $f\colon \mathbf{C} \to S$ by $f(a + bi) = \begin{bmatrix} a & b \\ -b & a \end{bmatrix}$, for $a + bi \in \mathbf{C}$. Prove that $f$ is a ring isomorphism.
5. If $(R, +, \cdot)$ is a ring, prove that $C = \{r \in R \mid ar = ra, \text{ for all } a \in R\}$ is a subring of $R$. (The subring $C$ is called the center of $R$.)
6. Given a finite field $F$, let $M_2(F)$ denote the set of all $2 \times 2$ matrices with entries from $F$. As in Example 14.2, $(M_2(F), +, \cdot)$ becomes a noncommutative ring with unity.
   a) Determine the number of elements in $M_2(F)$ if $F$ is
      i) $\mathbf{Z}_2$   ii) $\mathbf{Z}_3$   iii) $\mathbf{Z}_p$, $p$ a prime
   b) As in Exercise 13 of Section 14.1, $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \in M_2(\mathbf{Z}_p)$ is a unit if and only if $ad - bc \ne z$. This occurs if the first row of $A$ does not contain all zeros (that is, $z$'s) and the second row is not a multiple (by an element of $\mathbf{Z}_p$) of the first. Use this observation to determine the number of units in
      i) $M_2(\mathbf{Z}_2)$   ii) $M_2(\mathbf{Z}_3)$   iii) $M_2(\mathbf{Z}_p)$, $p$ a prime
7. Given an integral domain $(D, +, \cdot)$ with zero element $z$, let $a, b \in D$ with $ab \ne z$. (a) If $a^3 = b^3$ and $a^5 = b^5$, prove that $a = b$. (b) Let $m, n \in \mathbf{Z}^+$ with $\gcd(m, n) = 1$. If $a^m = b^m$ and $a^n = b^n$, prove that $a = b$.
8. Let $A = \mathbf{R}^+$. Define $\oplus$ and $\odot$ on $A$ by $a \oplus b = ab$, the ordinary product of $a$, $b$; and $a \odot b = a^{\log_2 b}$.
   a) Verify that $(A, \oplus, \odot)$ is a commutative ring with unity.
   b) Is this ring an integral domain or field?
9. Let $R$ be a ring with ideals $A$ and $B$. Define $A + B = \{a + b \mid a \in A, b \in B\}$. Prove that $A + B$ is an ideal of $R$. (For any ring $R$, the ideals of $R$ form a poset under set inclusion. If $A$ and $B$ are ideals of $R$, with $\mathrm{glb}\{A, B\} = A \cap B$ and $\mathrm{lub}\{A, B\} = A + B$, the poset is a lattice.)
10. a) If $p$ is a prime, prove that $p$ divides $\binom{p}{k}$, for all $0 < k < p$.
    b) If $a, b \in \mathbf{Z}$, prove that $(a + b)^p \equiv a^p + b^p \pmod{p}$.
11. Given $n$ positive integers $x_1, x_2, \ldots, x_n$, not necessarily distinct, prove that either $n \mid (x_1 + x_2 + \cdots + x_i)$, for some $1 \le i \le n$, or there exist $1 \le i < j \le n$ such that $n \mid (x_{i+1} + \cdots + x_{j-1} + x_j)$.
12. Consider the ring $(\mathbf{Z}^3, \oplus, \odot)$ where addition and multiplication are defined by $(a, b, c) \oplus (d, e, f) = (a + d, b + e, c + f)$ and $(a, b, c) \odot (d, e, f) = (ad, be, cf)$. (Here, for example, $a + d$ and $ad$ are computed by using the standard binary operations of addition and multiplication in $\mathbf{Z}$.) Let $S$ be the subset of $\mathbf{Z}^3$ where $S = \{(a, b, c) \mid a = b + c\}$. Prove that $S$ is not a subring of $(\mathbf{Z}^3, \oplus, \odot)$.
13. a) In how many ways can one select two positive integers $m$, $n$, not necessarily distinct, so that $1 \le m \le 100$, $1 \le n \le 100$ and the last digit of $7^m + 3^n$ is 8?
    b) Answer part (a) for the case where $1 \le m \le 125$, $1 \le n \le 125$.
    c) If one randomly selects $m$, $n$ [as in part (a)], what is the probability that 2 is now the last digit of $7^m + 3^n$?
14. Let $n \in \mathbf{Z}^+$ with $n > 1$.
    a) If $n = 2k$ where $k$ is an odd integer, prove that $k^2 \equiv k \pmod{n}$.
    b) If $n = 4k$ for some $k \in \mathbf{Z}^+$, prove that $(2k)^2 \equiv 0 \pmod{n}$.
    c) Prove that
       \[
       \sum_{i=1}^{n-1} i^3 \equiv
       \begin{cases}
       \dfrac{n}{2} \pmod{n}, & \text{for } n \text{ even with } \dfrac{n}{2} \text{ odd}, \\
       0 \pmod{n}, & \text{otherwise.}
       \end{cases}
       \]
15. Suppose that $a, b, c \in \mathbf{Z}$ and $5 \mid (a^2 + b^2 + c^2)$. Prove that $5 \mid a$ or $5 \mid b$ or $5 \mid c$.
16. Write a computer program (or develop an algorithm) that reverses the order of the digits in a given positive integer. For example, the input 1374 should result in the output 4731.
17. Suppose that $a, b, k \in \mathbf{Z}^+$ with $a - b = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}$, for $p_1, p_2, \ldots, p_k$ prime and $e_1, e_2, \ldots, e_k \in \mathbf{Z}^+$. For how many values of $n$ ($> 1$) is $a \equiv b \pmod{n}$ true?
18. As the co-chairs of the Homecoming Parade Committee, Jerina and Noor must organize the freshmen for a pregame presentation. When they arrange these students in rows of 8, there are three students remaining. When rows of 11 are tried, four students remain. Finally, rows of 15 leave five students remaining. So the co-chairs use the rows of 15 and place the remaining five students at the center (in positions 6-10) of the first row. What is the smallest number of freshmen Jerina and Noor are trying to organize?
15 Boolean Algebra and Switching Functions
Again we encounter an algebraic system in which the structure depends primarily on two
closed binary operations. Unlike the situation for rings, in dealing with Boolean algebras
we shall stress applications more than the abstract nature of the system. Nonetheless, we
shall carefully examine the structure of a Boolean algebra, and in our study we shall find
results that are quite different from those for rings. Among other things, a finite Boolean
algebra must have $2^n$ elements, for some $n \in \mathbf{Z}^+$. Yet we know of at least one ring for each
$m \in \mathbf{Z}^+$, $m > 1$, namely the ring $(\mathbf{Z}_m, +, \cdot)$.
In 1854 the English mathematician George Boole published his monumental work
An Investigation of the Laws of Thought. Within this work Boole created a system of
mathematical logic that he developed in terms of what is now called a Boolean algebra.
In 1938 Claude Elwood Shannon developed the algebra of switching functions and
showed how its structure was related to the ideas established by Boole. As a result of this
work, an example of abstract mathematics in the nineteenth century became an applied
mathematical discipline in the twentieth century.
15.1 Switching Functions: Disjunctive and Conjunctive Normal Forms
An electric switch can be turned on (allowing the flow of current) or off (preventing the flow
of current). Similarly, in a transistor, current is either passing (conducting) or not passing
(nonconducting). These are two examples of two-state devices. (In Section 2.2 we saw how
the electric switch was related to the two-valued logic.)
In order to investigate such two-state devices, we abstract these notions of “true” and
“false,” “on” and “off,” as follows.
Let B = {0, 1}. We define addition, multiplication, and complements for the elements
of B by
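a) $+$: $0 + 0 = 0$; $0 + 1 = 1 + 0 = 1 + 1 = 1$.
b) $\cdot$: $0 \cdot 0 = 0 = 1 \cdot 0 = 0 \cdot 1$; $1 \cdot 1 = 1$.
c) $\overline{\phantom{x}}$: $\bar{0} = 1$; $\bar{1} = 0$.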
A variable x is called a Boolean variable if x takes on only values in B. Consequently,
$x + x = x$ and $x^2 = x \cdot x = xx = x$ for every Boolean variable $x$.
If $x$, $y$ are Boolean variables, then
1) $x + y = 0$ if and only if $x = y = 0$, and
2) $xy = 1$ if and only if $x = y = 1$.
If $n \in \mathbf{Z}^+$, $B^n = \{(b_1, b_2, \ldots, b_n) \mid b_i \in \{0, 1\},\ 1 \le i \le n\}$. A function $f\colon B^n \to B$ is called a Boolean, or switching, function of $n$ variables. The $n$ variables are emphasized by writing $f(x_1, x_2, \ldots, x_n)$, where each $x_i$, for $1 \le i \le n$, is a Boolean variable.
EXAMPLE 15.1
Let $f\colon B^3 \to B$, where $f(x, y, z) = xy + z$.† (We write $xy$ for $x \cdot y$.) This Boolean function is determined by evaluating $f$ for each of the eight possible assignments to the variables $x$, $y$, $z$, as Table 15.1 demonstrates.

Table 15.1
x  y  z  xy  f(x, y, z) = xy + z
0  0  0  0   0
0  0  1  0   1
0  1  0  0   0
0  1  1  0   1
1  0  0  0   0
1  0  1  0   1
1  1  0  1   1
1  1  1  1   1
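A table such as Table 15.1 can be generated by enumerating the eight assignments. The short Python sketch below is an illustration only; it models the Boolean sum and product on B = {0, 1} with the bitwise operators `|` and `&`, which is an implementation choice and not notation from the text.

```python
from itertools import product

def f(x, y, z):
    """f(x, y, z) = xy + z, with | as Boolean sum and & as Boolean product."""
    return (x & y) | z

# One row per assignment, in the order 000, 001, ..., 111.
rows = [(x, y, z, x & y, f(x, y, z)) for x, y, z in product((0, 1), repeat=3)]

print("x y z xy f")
for row in rows:
    print(*row)
```

Because multiplication is performed before addition, the parentheses in `(x & y) | z` mirror the reading of xy + z as (xy) + z.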
Definition 15.1 For $n \in \mathbf{Z}^+$, $n \ge 2$, let $f, g\colon B^n \to B$ be two Boolean functions of the $n$ Boolean variables $x_1, x_2, \ldots, x_n$. We say that $f$ and $g$ are equal and write $f = g$ if the columns for $f$, $g$ [in their respective (function) tables] are exactly the same. [The tables show that $f(b_1, b_2, \ldots, b_n) = g(b_1, b_2, \ldots, b_n)$ for each of the $2^n$ possible assignments of either 0 or 1 to each of the $n$ Boolean variables $x_1, x_2, \ldots, x_n$.]
Definition 15.2 If $f\colon B^n \to B$, then the complement of $f$, denoted $\bar{f}$, is the Boolean function defined on $B^n$ by
\[ \bar{f}(b_1, b_2, \ldots, b_n) = \overline{f(b_1, b_2, \ldots, b_n)}. \]
If $g\colon B^n \to B$, we define $f + g$, $f \cdot g\colon B^n \to B$, the sum and product of $f$, $g$, respectively, by
\[ (f + g)(b_1, b_2, \ldots, b_n) = f(b_1, b_2, \ldots, b_n) + g(b_1, b_2, \ldots, b_n) \]
and
\[ (f \cdot g)(b_1, b_2, \ldots, b_n) = f(b_1, b_2, \ldots, b_n) \cdot g(b_1, b_2, \ldots, b_n). \]
†When dealing with Boolean variables multiplication is performed before addition. Hence $xy + z$ represents $(xy) + z$, not $x(y + z)$.
Ten laws—important consequences of these definitions—are summarized in
Table 15.2.
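Table 15.2
 1)  $\overline{\overline{f}} = f$            $\overline{\overline{x}} = x$            Law of the Double Complement
 2)  $\overline{f + g} = \bar{f}\,\bar{g}$    $\overline{x + y} = \bar{x}\,\bar{y}$    DeMorgan's Laws
     $\overline{fg} = \bar{f} + \bar{g}$      $\overline{xy} = \bar{x} + \bar{y}$
 3)  $f + g = g + f$                          $x + y = y + x$                          Commutative Laws
     $fg = gf$                                $xy = yx$
 4)  $f + (g + h) = (f + g) + h$              $x + (y + z) = (x + y) + z$              Associative Laws
     $f(gh) = (fg)h$                          $x(yz) = (xy)z$
 5)  $f + gh = (f + g)(f + h)$                $x + yz = (x + y)(x + z)$                Distributive Laws
     $f(g + h) = fg + fh$                     $x(y + z) = xy + xz$
 6)  $f + f = f$                              $x + x = x$                              Idempotent Laws
     $ff = f$                                 $xx = x$
 7)  $f + \mathbf{0} = f$                     $x + 0 = x$                              Identity Laws
     $f \cdot \mathbf{1} = f$                 $x \cdot 1 = x$
 8)  $f + \bar{f} = \mathbf{1}$               $x + \bar{x} = 1$                        Inverse Laws
     $f\bar{f} = \mathbf{0}$                  $x\bar{x} = 0$
 9)  $f + \mathbf{1} = \mathbf{1}$            $x + 1 = 1$                              Dominance Laws
     $f \cdot \mathbf{0} = \mathbf{0}$        $x \cdot 0 = 0$
10)  $f + fg = f$                             $x + xy = x$                             Absorption Laws
     $f(f + g) = f$                           $x(x + y) = x$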
As with the laws of logic (in Chapter 2) and the laws of set theory (in Chapter 3), the properties shown in Table 15.2 are satisfied by all Boolean functions $f, g, h\colon B^n \to B$ and by all Boolean variables $x$, $y$, $z$. (We write $fg$ for $f \cdot g$.)
The symbol $\mathbf{0}$ denotes the constant Boolean function whose value is always 0, and $\mathbf{1}$ is the function whose only value is 1. (Note: $\mathbf{0}, \mathbf{1} \notin B$.)
Once again the idea of duality appears in properties 2-10. If $s$ stands for a theorem about the equality of Boolean functions, then $s^d$, the dual of $s$, is obtained by replacing in $s$ all occurrences of $+$ ($\cdot$) by $\cdot$ ($+$) and all occurrences of $\mathbf{0}$ ($\mathbf{1}$) by $\mathbf{1}$ ($\mathbf{0}$). By the principle of duality (which we shall examine in Section 15.4) the statement $s^d$ is also a theorem. The same is true for a theorem dealing with the equality of Boolean variables, except here it is the Boolean values 0 and 1 that are replaced, not the constant functions $\mathbf{0}$ and $\mathbf{1}$.
The principle of duality is handy for establishing property 5 of Table 15.2 for Boolean
functions and Boolean variables.
EXAMPLE 15.2
The Distributive Law of $+$ over $\cdot$. The last two columns of Table 15.3 show that $f + gh = (f + g)(f + h)$. We also see that $x + yz = (x + y)(x + z)$ is a special case of this property for the situation where $f, g, h\colon B^3 \to B$, with $f(x, y, z) = x$, $g(x, y, z) = y$, and $h(x, y, z) = z$. Hence no additional tables are needed to establish this property for Boolean variables.

Table 15.3
f  g  h  gh  f + g  f + h  f + gh  (f + g)(f + h)
0  0  0  0   0      0      0       0
0  0  1  0   0      1      0       0
0  1  0  0   1      0      0       0
0  1  1  1   1      1      1       1
1  0  0  0   1      1      1       1
1  0  1  0   1      1      1       1
1  1  0  0   1      1      1       1
1  1  1  1   1      1      1       1
By the principle of duality, we obtain f(g +h) = fg + fh.
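Since equality of Boolean functions is decided value by value, both distributive laws can be confirmed by checking the eight triples of values that f, g, h may take at a given point, which is exactly what Table 15.3 records for the first law. The Python sketch below is an illustration under that reading; the variable names are arbitrary, and `|` and `&` stand in for the Boolean sum and product.

```python
from itertools import product

B = (0, 1)

# Columns "f + gh" and "(f + g)(f + h)" of the table, one entry per row 000..111.
left_column = [f | (g & h) for f, g, h in product(B, repeat=3)]
right_column = [(f | g) & (f | h) for f, g, h in product(B, repeat=3)]
plus_over_times = left_column == right_column

# The dual statement f(g + h) = fg + fh, obtained by swapping + and the product.
times_over_plus = all(
    (f & (g | h)) == ((f & g) | (f & h)) for f, g, h in product(B, repeat=3)
)

print(left_column, plus_over_times, times_over_plus)
```

The second check is redundant once the principle of duality is accepted; it is included only to show the dual law holding on the same eight rows.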
EXAMPLE 15.3
a) To establish the first absorption property for Boolean variables, instead of relying on table construction we argue as follows:

                                  Reasons
   $x + xy = x1 + xy$             Identity Law
   $\phantom{x + xy} = x(1 + y)$  Distributive Law of $\cdot$ over $+$
   $\phantom{x + xy} = x1$        Dominance Law (and Commutative Law of $+$)
   $\phantom{x + xy} = x$         Identity Law

This result indicates that some of our laws can be derived from others. The question
then is which properties we must establish with tables so that we can derive the other
properties as we did here. We shall consider this later in Section 15.4 when we study
the structure of a Boolean algebra.
In the meantime, let us demonstrate how the results of Table 15.2 can be used to
simplify another Boolean expression.
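b) Simplify the expression $wx + \overline{\bar{x}z} + (y + \bar{z})$, where $w$, $x$, $y$, and $z$ are Boolean variables.

                                                                               Reasons
   $wx + \overline{\bar{x}z} + (y + \bar{z})$
      $= wx + (\bar{\bar{x}} + \bar{z}) + (y + \bar{z})$                       DeMorgan's Law
      $= wx + (x + \bar{z}) + (y + \bar{z})$                                   Law of the Double Complement
      $= [(wx + x) + \bar{z}] + (y + \bar{z})$                                 Associative Law of $+$
      $= (x + \bar{z}) + (y + \bar{z})$                                        Absorption Law (and the Commutative Laws of $+$ and $\cdot$)
      $= x + (\bar{z} + \bar{z}) + y$                                          Commutative and Associative Laws of $+$
      $= x + \bar{z} + y$                                                      Idempotent Law of $+$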
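An algebraic simplification of this kind can be double-checked by comparing the two expressions on all 16 assignments. The sketch below assumes the expression in part (b) reads wx + (complement of (x̄z)) + (y + z̄) and that the simplified form is x + z̄ + y; the helper `comp` and the function names are hypothetical, introduced only for this check.

```python
from itertools import product

def comp(v):
    """Complement on B = {0, 1}."""
    return 1 - v

def original(w, x, y, z):
    # wx + complement(x-bar z) + (y + z-bar)
    return (w & x) | comp(comp(x) & z) | (y | comp(z))

def simplified(w, x, y, z):
    # x + z-bar + y
    return x | comp(z) | y

agree = all(original(*bits) == simplified(*bits)
            for bits in product((0, 1), repeat=4))
print(agree)
```

Agreement on every assignment means the two expressions define the same Boolean function, in the sense of Definition 15.1; it does not by itself show which laws justify each step.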
Up to this point we have repeated for Boolean functions what we did in Chapter 2 for
statements. When given a Boolean function (in algebraic terms), we construct its table
of values. Now we consider the reverse process: Given a table of values, we shall find a
Boolean function (described in algebraic terms) for which it is the correct table.
EXAMPLE 15.4
Given three Boolean variables $x$, $y$, $z$, find formulas for functions $f, g, h\colon B^3 \to B$ for the columns specified in Table 15.4.
For the column under $f$ we want a result that has the value 1 only in the case where $x = y = 0$ and $z = 1$. The function $f(x, y, z) = \bar{x}\bar{y}z$ is one such function. In the same way, $g(x, y, z) = x\bar{y}\bar{z}$ yields the value 1 for $x = 1$, $y = z = 0$, and is 0 in all other cases. As each of $f$ and $g$ has the value 1 in only one case and these cases are distinct from each other, their sum $f + g$ has the value 1 in exactly these two cases. So $h(x, y, z) = f(x, y, z) + g(x, y, z) = \bar{x}\bar{y}z + x\bar{y}\bar{z}$ has the column of values given under $h$.

Table 15.4
x  y  z  f  g  h
0  0  0  0  0  0
0  0  1  1  0  1
0  1  0  0  0  0
0  1  1  0  0  0
1  0  0  0  1  1
1  0  1  0  0  0
1  1  0  0  0  0
1  1  1  0  0  0
This example leads us to the following definition.
Definition 15.3 For all $n \in \mathbf{Z}^+$, if $f$ is a Boolean function on the $n$ variables $x_1, x_2, \ldots, x_n$, we call
a) each term $x_i$ or its complement $\bar{x}_i$, for $1 \le i \le n$, a literal;
b) a term of the form $y_1 y_2 \cdots y_n$, where each $y_i = x_i$ or $\bar{x}_i$, for $1 \le i \le n$, a fundamental conjunction; and
c) a representation of $f$ as a sum of fundamental conjunctions a disjunctive normal form (d.n.f.) of $f$.
Although no formal proof is given here, the following examples suggest that each $f\colon B^n \to B$, $f \ne \mathbf{0}$, has a unique (up to the order of fundamental conjunctions) representation as a d.n.f.
EXAMPLE 15.5
Find the d.n.f. for $f\colon B^3 \to B$, where $f(x, y, z) = xy + \bar{x}z$.
From Table 15.5, we see that the column for $f$ contains four 1's. They indicate the four fundamental conjunctions needed in the d.n.f. of $f$, so $f(x, y, z) = \bar{x}\bar{y}z + \bar{x}yz + xy\bar{z} + xyz$.
Another way to solve this problem is to take each product term appearing in $f$ (namely, $xy$ and $\bar{x}z$) and somehow involve whichever variables are missing. Using the properties of these variables, we have $xy + \bar{x}z = xy(z + \bar{z}) + \bar{x}(y + \bar{y})z$ (Why?) $= xyz + xy\bar{z} + \bar{x}yz + \bar{x}\bar{y}z$.

Table 15.5
x  y  z  xy  x̄z  f
0  0  0  0   0   0
0  0  1  0   1   1
0  1  0  0   0   0
0  1  1  0   1   1
1  0  0  0   0   0
1  0  1  0   0   0
1  1  0  1   0   1
1  1  1  1   0   1
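The first method of Example 15.5 (read off one fundamental conjunction for each 1 in the column for f) is easy to mechanize. The following Python sketch is illustrative only: the function names `minterm_rows` and `conjunction` are hypothetical, a trailing apostrophe is used as a plain-text stand-in for the overbar, and the function is taken to be f(x, y, z) = xy + x̄z.

```python
from itertools import product

def minterm_rows(func, n):
    """Binary labels (0 .. 2**n - 1) of the rows where func has the value 1."""
    return [i for i, bits in enumerate(product((0, 1), repeat=n)) if func(*bits)]

def conjunction(index, names):
    """Fundamental conjunction that is 1 exactly on the row with this label."""
    n = len(names)
    bits = [(index >> (n - 1 - k)) & 1 for k in range(n)]
    # A variable appears complemented (x') when the row assigns it 0.
    return "".join(v if b else v + "'" for v, b in zip(names, bits))

def f(x, y, z):
    return (x & y) | ((1 - x) & z)      # xy + x'z

labels = minterm_rows(f, 3)
dnf = " + ".join(conjunction(i, "xyz") for i in labels)
print(labels, dnf)
```

The labels produced are the row labels 000, 001, ..., 111 read as binary numbers, with the alphabetically first variable as the most significant bit.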
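EXAMPLE 15.6
Find the d.n.f. for $g(w, x, y, z) = wx\bar{y} + wy\bar{z} + xy$.
We examine each term, as follows:
a) $wx\bar{y} = wx\bar{y}(z + \bar{z}) = wx\bar{y}z + wx\bar{y}\bar{z}$
b) $wy\bar{z} = w(x + \bar{x})y\bar{z} = wxy\bar{z} + w\bar{x}y\bar{z}$
c) $xy = (w + \bar{w})xy(z + \bar{z}) = wxyz + wxy\bar{z} + \bar{w}xyz + \bar{w}xy\bar{z}$
It follows from the idempotent property of $+$ that the d.n.f. of $g$ is
\[ g(w, x, y, z) = wx\bar{y}z + wx\bar{y}\bar{z} + wxy\bar{z} + w\bar{x}y\bar{z} + wxyz + \bar{w}xyz + \bar{w}xy\bar{z}. \]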
Consider the first three columns in Table 15.6. If we agree to list the Boolean variables in alphabetical order, we see that the values for $x$, $y$, $z$ in any row determine a binary label. These binary labels for 0, 1, 2, ..., 7 arise for rows 1, 2, ..., 8, respectively, as shown in columns 4 and 5 of Table 15.6. [We note, for instance, that the first row has row number 1 but binary label 000 (= 0). Likewise, the seventh row, where $x = 1$, $y = 1$, $z = 0$, has row number 7 but binary label 110 (= 6).] As a result, the d.n.f. of a nonzero Boolean function can be expressed more compactly. For instance, the function $f$ in Example 15.5 can be given by $f = \sum m(1, 3, 6, 7)$, where $m$ indicates the minterms (that is, fundamental conjunctions, each here on three literals) at rows 2, 4, 7, 8, with the respective binary labels 1, 3, 6, 7. The word minterm is used here to emphasize that the fundamental conjunction has the value 1 a minimal number of times (namely, one time) without being identically 0. For example, $m(1)$ denotes the minterm for the row with binary label 001 (= 1) where $x = y = 0$ and $z = 1$; this corresponds with the fundamental conjunction $\bar{x}\bar{y}z$, which has the value 1 for exactly one assignment (where $x = y = 0$ and $z = 1$).

Table 15.6
x  y  z  Binary Label  Row Number
0  0  0  000 (= 0)     1
0  0  1  001 (= 1)     2
0  1  0  010 (= 2)     3
0  1  1  011 (= 3)     4
1  0  0  100 (= 4)     5
1  0  1  101 (= 5)     6
1  1  0  110 (= 6)     7
1  1  1  111 (= 7)     8

Lacking a table, we can still represent the d.n.f. of the function $g$ of Example 15.6, for instance, as a sum of minterms. For each fundamental conjunction $c_1 c_2 c_3 c_4$, where $c_1 = w$ or $\bar{w}$, $\ldots$, $c_4 = z$ or $\bar{z}$, we replace each $c_i$, $1 \le i \le 4$, by 0 if $c_i$ is a complemented variable, and by 1 otherwise. In this way the binary label associated with that fundamental conjunction is obtained. As a sum of minterms, we find that $g = \sum m(6, 7, 10, 12, 13, 14, 15)$.
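The replacement rule just described (0 for a complemented variable, 1 otherwise) can be sketched in a few lines of Python. The string encoding is an assumption made for this illustration: each fundamental conjunction is written with the variables in alphabetical order and a trailing apostrophe marking a complement, and the seven conjunctions listed are those of the d.n.f. read as $wx\bar{y}z + wx\bar{y}\bar{z} + wxy\bar{z} + w\bar{x}y\bar{z} + wxyz + \bar{w}xyz + \bar{w}xy\bar{z}$.

```python
def binary_label(conjunction):
    """Minterm number of a fundamental conjunction such as "wx'yz'"."""
    bits = []
    i = 0
    while i < len(conjunction):
        # A literal is complemented when the next character is an apostrophe.
        complemented = i + 1 < len(conjunction) and conjunction[i + 1] == "'"
        bits.append("0" if complemented else "1")
        i += 2 if complemented else 1
    return int("".join(bits), 2)

terms = ["wxy'z", "wxy'z'", "wxyz'", "wx'yz'", "wxyz", "w'xyz", "w'xyz'"]
minterm_numbers = sorted(binary_label(t) for t in terms)
print(minterm_numbers)
```

For instance, "wx'yz'" becomes the label 1010, that is, minterm number 10.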
Dual to the disjunctive normal form is the conjunctive normal form, which we discuss
before closing this section.
EXAMPLE 15.7
Let $f\colon B^3 \to B$ be given by Table 15.7.

Table 15.7
x  y  z  f
0  0  0  0
0  0  1  1
0  1  0  0
0  1  1  1
1  0  0  1
1  0  1  1
1  1  0  0
1  1  1  1

A term of the form $c_1 + c_2 + c_3$, where $c_1 = x$ or $\bar{x}$, $c_2 = y$ or $\bar{y}$, and $c_3 = z$ or $\bar{z}$, is called a fundamental disjunction. The fundamental disjunction $x + y + z$ has value 1 in all cases except where the value for each of $x$, $y$, $z$ is 0. Similarly, $x + \bar{y} + z$ has value 1 except when $x = z = 0$ and $y = 1$. Since each of these fundamental disjunctions has the value 0 in only one case, and these cases do not occur simultaneously, the product $(x + y + z)(x + \bar{y} + z)$ has the value 0 in precisely the two cases just given. Continuing in this manner, we may represent the function $f$ as
\[ f = (x + y + z)(x + \bar{y} + z)(\bar{x} + \bar{y} + z) \]
and we call this the conjunctive normal form (c.n.f.) for $f$.
Since the fundamental disjunction $x + y + z$ has the value 1 a maximum number of times (without being identically 1), it is called a maxterm, especially when we use a binary row label to represent it. Using the binary labels to index the rows of the table, we may write $f = \prod M(0, 2, 6)$, a product of maxterms.
Such a representation exists for each $f \ne \mathbf{1}$, and it is unique up to the order of the fundamental disjunctions (or maxterms).
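The construction of a c.n.f. from the 0's of a table can be sketched as follows. This is an illustration only: the column is taken to be 0, 1, 0, 1, 1, 1, 0, 1 for rows 000 through 111, the helper names are hypothetical, and an apostrophe again stands for the overbar. Note that in a maxterm a variable is complemented exactly when the row assigns it the value 1, the reverse of the rule for minterms.

```python
from itertools import product

NAMES = "xyz"
column = [0, 1, 0, 1, 1, 1, 0, 1]       # values of f on rows 000, 001, ..., 111

def bits_of(index, n):
    """Binary label of a row as a list of n bits, leftmost variable first."""
    return [(index >> (n - 1 - k)) & 1 for k in range(n)]

def disjunction(index, names):
    """Fundamental disjunction that is 0 only on the row with this label."""
    bits = bits_of(index, len(names))
    return "(" + " + ".join(v + "'" if b else v for v, b in zip(names, bits)) + ")"

def maxterm_value(index, row):
    """Value of that fundamental disjunction on one assignment."""
    return int(any((1 - r) if b else r for r, b in zip(row, bits_of(index, len(row)))))

zeros = [i for i, value in enumerate(column) if value == 0]
cnf = "".join(disjunction(i, NAMES) for i in zeros)

# Multiply the maxterms back together (min acts as the Boolean product).
rebuilt = [min(maxterm_value(i, row) for i in zeros)
           for row in product((0, 1), repeat=len(NAMES))]
print(zeros, cnf, rebuilt == column)
```

Rebuilding the column from the product of maxterms is a simple consistency check that the three factors have the value 0 on exactly the intended rows.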
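EXAMPLE 15.8
Let $g\colon B^4 \to B$, where $g(w, x, y, z) = (w + x + y)(x + \bar{y} + z)(w + \bar{y})$. To obtain the c.n.f. for $g$, we rewrite each disjunction in the product as follows:
a) $w + x + y = w + x + y + 0 = w + x + y + z\bar{z} = (w + x + y + z)(w + x + y + \bar{z})$
b) $x + \bar{y} + z = w\bar{w} + x + \bar{y} + z = (w + x + \bar{y} + z)(\bar{w} + x + \bar{y} + z)$
c) $w + \bar{y} = w + x\bar{x} + \bar{y} = (w + x + \bar{y})(w + \bar{x} + \bar{y})$
   $= (w + x + \bar{y} + z\bar{z})(w + \bar{x} + \bar{y} + z\bar{z})$
   $= (w + x + \bar{y} + z)(w + x + \bar{y} + \bar{z})(w + \bar{x} + \bar{y} + z)(w + \bar{x} + \bar{y} + \bar{z})$
Consequently, using the idempotent law of $\cdot$, we have
\[ g(w, x, y, z) = (w + x + y + z)(w + x + y + \bar{z})(w + x + \bar{y} + z)(\bar{w} + x + \bar{y} + z)(w + x + \bar{y} + \bar{z})(w + \bar{x} + \bar{y} + z)(w + \bar{x} + \bar{y} + \bar{z}). \]
To obtain $g$ as a product of maxterms, we associate with each fundamental disjunction $d_1 + d_2 + d_3 + d_4$ the binary number $b_1 b_2 b_3 b_4$, where $b_1 = 0$ if $d_1 = w$; $b_1 = 1$ if $d_1 = \bar{w}$; $\ldots$; $b_4 = 0$ if $d_4 = z$; $b_4 = 1$ if $d_4 = \bar{z}$. As a result, $g = \prod M(0, 1, 2, 3, 6, 7, 10)$.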
Our last example in this section reviews what we have learned about the ways to represent
a nonconstant Boolean function f (that is, $f \neq 0$ and $f \neq 1$).
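EXAMPLE 15.9
If $h(w, x, y, z) = wx + \bar{w}y + \bar{x}yz$, then we may rewrite each summand in h as follows:
i) $wx = wx(y + \bar{y})(z + \bar{z}) = wxyz + wxy\bar{z} + wx\bar{y}z + wx\bar{y}\bar{z}$
ii) $\bar{w}y = \bar{w}(x + \bar{x})y(z + \bar{z}) = \bar{w}xyz + \bar{w}xy\bar{z} + \bar{w}\bar{x}yz + \bar{w}\bar{x}y\bar{z}$
iii) $\bar{x}yz = (w + \bar{w})\bar{x}yz = w\bar{x}yz + \bar{w}\bar{x}yz$
Using the idempotent law of +, we find that the d.n.f. for h is
$$wxyz + wxy\bar{z} + wx\bar{y}z + wx\bar{y}\bar{z} + \bar{w}xyz + \bar{w}xy\bar{z} + \bar{w}\bar{x}yz + \bar{w}\bar{x}y\bar{z} + w\bar{x}yz.$$
Considering each fundamental conjunction in the d.n.f. for h, we obtain the following binary
labels and minterm numbers:
$wxyz$: 1111 (= 15)                 $wx\bar{y}\bar{z}$: 1100 (= 12)     $\bar{w}\bar{x}yz$: 0011 (= 3)
$wxy\bar{z}$: 1110 (= 14)           $\bar{w}xyz$: 0111 (= 7)            $\bar{w}\bar{x}y\bar{z}$: 0010 (= 2)
$wx\bar{y}z$: 1101 (= 13)           $\bar{w}xy\bar{z}$: 0110 (= 6)      $w\bar{x}yz$: 1011 (= 11)
So we may write $h = \sum m(2, 3, 6, 7, 11, 12, 13, 14, 15)$. And from this representation
using minterms we have $h = \prod M(0, 1, 4, 5, 8, 9, 10)$, a product of maxterms.
Finally, we take the binary label for each maxterm and determine its corresponding
fundamental disjunction:
0 = 0000: $w + x + y + z$                   8 = 1000: $\bar{w} + x + y + z$
1 = 0001: $w + x + y + \bar{z}$             9 = 1001: $\bar{w} + x + y + \bar{z}$
4 = 0100: $w + \bar{x} + y + z$             10 = 1010: $\bar{w} + x + \bar{y} + z$
5 = 0101: $w + \bar{x} + y + \bar{z}$
This tells us that the c.n.f. for h is
$$(w + x + y + z)(w + x + y + \bar{z})(w + \bar{x} + y + z)(w + \bar{x} + y + \bar{z})(\bar{w} + x + y + z)(\bar{w} + x + y + \bar{z})(\bar{w} + x + \bar{y} + z).$$
Hence,
$$wxyz + wxy\bar{z} + wx\bar{y}z + wx\bar{y}\bar{z} + \bar{w}xyz + \bar{w}xy\bar{z} + \bar{w}\bar{x}yz + \bar{w}\bar{x}y\bar{z} + w\bar{x}yz$$
$$= \sum m(2, 3, 6, 7, 11, 12, 13, 14, 15) = \prod M(0, 1, 4, 5, 8, 9, 10)$$
$$= (w + x + y + z)(w + x + y + \bar{z})(w + \bar{x} + y + z)(w + \bar{x} + y + \bar{z})(\bar{w} + x + y + z)(\bar{w} + x + y + \bar{z})(\bar{w} + x + \bar{y} + z).$$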
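The minterm and maxterm lists of Example 15.9 can be checked mechanically by evaluating h on all 16 rows of its function table. The sketch below is one way to do this in Python; the function name `h` and the variables `rows`, `minterms`, and `maxterms` are chosen for this illustration only.

```python
from itertools import product

def h(w, x, y, z):
    """h = wx + w'y + x'yz, where ' denotes the complement (Example 15.9)."""
    return (w and x) or ((not w) and y) or ((not x) and y and z)

# product() lists the rows as 0000, 0001, ..., 1111, so the position of a row
# in this list equals its binary label.
rows = list(product([0, 1], repeat=4))
minterms = [label for label, row in enumerate(rows) if h(*row)]
maxterms = [label for label, row in enumerate(rows) if not h(*row)]

print(minterms)  # [2, 3, 6, 7, 11, 12, 13, 14, 15]
print(maxterms)  # [0, 1, 4, 5, 8, 9, 10]
```

Every row label lands in exactly one of the two lists, which is why the maxterm indices are precisely the labels missing from the sum of minterms.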
ments of values for w and y that will result in the value 1 for
EXERCISES 15.1 the expression.
1. Find the value of each of the following Boolean expressions a) x+xy+w b) xy +w
if the values of the Boolean variables w, x, y, and z are 1, 1, 0, c) xy +xw d) xy+w
and 0, respectively. 3. a) How many rows are needed to construct the (function)
a) xy+xy b) w+xy ec) wx + ¥t+ yz table for a Boolean function ofn variables?
d) (wx +yZ)+wy+(wt+y)@+y) b) How many different Boolean functions of n variables
2. Let w, x, and y be Boolean variables where the value of x are there?
is 1. For each of the following Boolean expressions, determine, 4, a) Find the fundamental conjunction made up from the
if possible, the value of the expression. If you cannot determine variables w, x, y, Z, or their complements, where the value
the value of the expression, then find the number of assign- of the conjunction is 1 precisely when
i} w=x=0,y=z=1. 11. Simplify the following Boolean expressions,
i) w=O0,x =1,y=1,z7=0. a)xy+(x+y)zZ+y
iii) w=O,x =y=z=1.
iv) w=x=y=z=0. byx+y+@+y+z)
b) Answer part (a) this time for fundamental disjunctions,
c) yz twx+z2+{wz(xry + wz)]
instead of fundamental conjunctions, where the value of 12. Find the values of the Boolean variables w, x, y, z that sat-
each fundamental disjunction is 0 precisely for the stated isfy the following system of simultaneous (Boolean) equations.
values of w, x, y, Z. x+xy =0 xy =Xz Xy+XZ+7w
= zw
5. Suppose that f: B’ > B is defined by 13. a) For
f, g, 4: BY > B, prove that fg + fh+gh =
SX, YZ) = (+ y) + 2). fg + fh and that fg + fe+ fet fg=1.
a) Determine the d.n.f. and c.n-f. for f. b) State the dual of each result in part (a).
b) Write f as a sum of minterms and as a product of max- 14. Let f, g: B" ~ B. Define the relation “<” on F,,, the set of
terms (utilizing binary labels). all Boolean functions of » variables, by f < g if the value of ¢g
is 1 at least whenever the value of f is 1.
6. Let g: B41 — B be defined by
a) Prove that this relation is a partial order on F,.
8(wW,X, y, 2) = (wz + XyZ)\(x + xyz).
b) Prove that fg < fandf<ft+g.
a) Find the d.n.f. and c.n.f. for g.
c) Forn = 2, draw the Hasse diagram for the 16 functions
b) Write g as a sum of minterms and as a product of max- in F,. Where are the minterms and maxterms located in the
terms (utilizing binary labels). diagram? Compare this diagram with that for the power set
7. Let Fg denote the set of all Boolean functions f: B° > B. of {a, b, c, d} partially ordered under the subset relation.
(a) What is | ¥,|? (b) How many fundamental conjunctions (dis- 15, Define the closed binary operation @ (Exclusive Or) on F,,,
junctions) are there in F;,? (c) How many minterms (maxterms)
the set of all Boolean functions on n variables, by f 6g =
are there in F,? fg+f92, where f, g: B” > B.
8. Let f: B* — B. Find the disjunctive normal form for f if a) Determine f@ f, fOf, f Ol, and f GO.
a) f-'(1) = {0101 (that is, w =0,x =1, y =0,z=1), b) Prove or disprove each of the following.
0110, 1000, 1011}.
i) feg=-0>f=8
b) f~'@) = {0000, 0001, 0010, 0100, 1000, 1001, 0110}.
ii) fO(g@h)=(f Og) eh
9. Let B” — B. If the d.n.f. of f has m fundamental conjunc- iii) fOg= fox
tions and its c.n.f. has k fundamental disjunctions, how are m, iv) fBgh=(f @g\(f Sh)
n, and & related? v) f(g @h) = fe@ fh
10. Ifx, y, and z are Boolean variables and x + y +z = xyz, vi) (f@g)=fes=feg
prove that x, y, z all have the same value. vii) f@g=f@hsaegah
15.2 Gating Networks: Minimal Sums of Products: Karnaugh Maps
The switching functions of Section 15.1 present an interesting mathematical theory. Their
importance lies in their implementation by means of logic gates (devices in a digital com-
puter that perform specified tasks in the processing of data). The electrical and mechanical
components of such gates depend on the state of the art; we shall not concern ourselves
here with questions relating to hardware.
Figure 15.1 contains the logic gates for negation (complement), conjunction, and disjunction
in parts (a), (b), and (c), respectively. Since the Boolean operations of + and $\cdot$ are
associative, we may have more than two inputs for an AND gate or an OR gate.
Figure 15.2 shows the logic, or gating, network for the expression $(w + \bar{x})(y + xz)$.
Symbols on a line to the left of a gate (or inverter) are inputs. When they are on line
segments to the right of a gate, they are outputs. We have split the input line for x, so that
x may serve as input for both an AND gate and an inverter.

[Figure 15.1: (a) Inverter, with input x and output $\bar{x}$; (b) AND gate, with inputs x, y and output xy; (c) OR gate, with inputs x, y and output x + y]
[Figure 15.2: Gating network for $(w + \bar{x})(y + xz)$]
The exercises will provide practice in drawing the logic network for a Boolean expression
and in going from the network to the expression. Meanwhile certain features of these
networks need to be emphasized.
1) An input line may be split to provide that input to more than one gate.
2) Input and output lines come together only at gates.
3) There is no doubling back; that is, the output from a gate g cannot be used as an input
for the same gate g or for any gate (directly or indirectly) leading into g.
4) We assume that the output of a gating network is an instantaneous function of the
present inputs. There is no time dependence and we attach no importance to prior
inputs, as we do with finite state machines.
With these ideas in mind, let us analyze the computer addition of binary numbers.
EXAMPLE 15.10
When we add two bits (binary digits), the result consists of a sum s and a carry c. In three of
four cases the carry is 0, so we shall concentrate on the computation of 1 + 1. Examining
parts (b) and (c) of Table 15.8, we consider the sum s and the carry c as Boolean functions
of the variables x and y. Then $c = xy$ and $s = \bar{x}y + x\bar{y} = x \oplus y = (x + y)\overline{(xy)}$. (Recall
that $\oplus$ denotes exclusive OR.)

Table 15.8
(a)                          (b)              (c)
x  y  Binary Sum             x  y  Sum        x  y  Carry
0  0  0 + 0 = 0              0  0  0          0  0  0
0  1  0 + 1 = 1              0  1  1          0  1  0
1  0  1 + 0 = 1              1  0  1          1  0  0
1  1  1 + 1 = 10             1  1  0          1  1  1
Figure 15.3 is a gating network with two outputs. It is referred to as a multiple output
network. This device, called a half-adder, implements the results in parts (b) and (c) of
Table 15.8. Using two half-adders and an OR gate, we construct the full-adder shown in
Fig. 15.4(a). If $x = x_n x_{n-1} \ldots x_2 x_1 x_0$ and $y = y_n y_{n-1} \ldots y_2 y_1 y_0$, consider the process of
adding the bits $x_i$ and $y_i$ in finding the sum x + y. Here $c_{i-1}$ is the carry from the addition
of $x_{i-1}$ and $y_{i-1}$ (and a possible carry $c_{i-2}$). The input $c_{i-1}$, together with the inputs $x_i$ and
$y_i$, produce the sum $s_i$ and the carry $c_i$ as shown in the figure. Finally, in Fig. 15.4(b) two
full-adders and a half-adder are combined to produce the sum of the two binary numbers
$x_2 x_1 x_0$ and $y_2 y_1 y_0$, whose sum is $c_2 s_2 s_1 s_0$.

[Figure 15.3: The half-adder, with inputs x, y and outputs $s = (x + y)\overline{(xy)}$ and $c = xy$]
[Figure 15.4: (a) The full-adder, built from two half-adders (H.A.) and an OR gate, with outputs $s_i = (x_i \oplus y_i) \oplus c_{i-1}$ and $c_i = x_i y_i + c_{i-1}(x_i \oplus y_i)$; (b) a half-adder and two full-adders (F.A.) combined to produce $c_2 s_2 s_1 s_0$]
The next example introduces the main theme of this section—the minimal-sum-of-
products representation of a Boolean function.
EXAMPLE 15.11
Find a gating network for the Boolean function
$$f(w, x, y, z) = \sum m(4, 5, 7, 8, 9, 11).$$
Consider the order of the variables as w, x, y, z. We can determine the d.n.f. of f
by writing each minterm number in binary notation and then finding its corresponding
fundamental conjunction. For example, (a) 5 = 0101, indicating the fundamental conjunction
$\bar{w}x\bar{y}z$; and (b) 7 = 0111, indicating $\bar{w}xyz$. Continuing in this way, we have
$$f(w, x, y, z) = \bar{w}x\bar{y}\bar{z} + \bar{w}x\bar{y}z + \bar{w}xyz + w\bar{x}\bar{y}\bar{z} + w\bar{x}\bar{y}z + w\bar{x}yz.$$
Using properties of Boolean variables, we find that
$$f = \bar{w}xz(\bar{y} + y) + w\bar{x}\bar{y}(\bar{z} + z) + \bar{w}x\bar{y}\bar{z} + w\bar{x}yz$$
$$= \bar{w}xz + w\bar{x}\bar{y} + \bar{w}x\bar{y}\bar{z} + w\bar{x}yz = \bar{w}x(z + \bar{y}\bar{z}) + w\bar{x}(\bar{y} + yz)$$
$$= \bar{w}x(z + \bar{y}) + w\bar{x}(\bar{y} + z) \text{ (Why?)} = \bar{w}x(\bar{y} + z) + w\bar{x}(\bar{y} + z),$$
so
a) $f(w, x, y, z) = \bar{w}xz + \bar{w}x\bar{y} + w\bar{x}\bar{y} + w\bar{x}z$; or
b) $f(w, x, y, z) = \bar{w}x(\bar{y} + z) + w\bar{x}(\bar{y} + z)$.
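Forms (a) and (b) of Example 15.11 can be compared with the original sum of minterms by evaluating both on every row of the function table. The Python sketch below does this; `form_a`, `form_b`, and `rows` are names introduced only for this check, and `nw`, `nx`, `ny` denote the complements of w, x, y.

```python
from itertools import product

def form_a(w, x, y, z):
    """w'xz + w'xy' + wx'y' + wx'z  (form (a))."""
    nw, nx, ny = 1 - w, 1 - x, 1 - y
    return (nw & x & z) | (nw & x & ny) | (w & nx & ny) | (w & nx & z)

def form_b(w, x, y, z):
    """w'x(y' + z) + wx'(y' + z)  (form (b))."""
    nw, nx, ny = 1 - w, 1 - x, 1 - y
    return (nw & x & (ny | z)) | (w & nx & (ny | z))

# Rows are generated in the order 0000, 0001, ..., 1111.
rows = list(product([0, 1], repeat=4))
print([label for label, row in enumerate(rows) if form_a(*row)])  # [4, 5, 7, 8, 9, 11]
print([label for label, row in enumerate(rows) if form_b(*row)])  # [4, 5, 7, 8, 9, 11]
```

Both lists reproduce the minterm numbers 4, 5, 7, 8, 9, 11, so the two forms represent the same function f.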
In Example 15.11, the result
$$f(w, x, y, z) = \bar{w}xz + \bar{w}x\bar{y} + w\bar{x}\bar{y} + w\bar{x}z$$
is often referred to as a minimal-sum-of-products representation for the function
$f(w, x, y, z) = \sum m(4, 5, 7, 8, 9, 11)$. We see that this representation is a sum of four
products — where each product is made up of three literals. When we call such a represen-
tation minimal we mean two things:
1) Any possible further modification will result in a representation that is not a sum of
such products; and
2) If f can be represented in a second way as a sum of products (of literals), then we
will have at least four product terms — each with at least three literals.
[Note: A minimal sum of products for a given Boolean function f ($\neq 0$) need not be
unique, as we shall find in Example 15.15.]
In this text our discussion of this idea will be somewhat informal. We shall not attempt to
prove that each nonzero Boolean function has such a minimal-sum-of-products representa-
tion. Instead we shall assume the existence of this representation and simply continue our
study of how to obtain such a result.
From this point on we shall consider an input of the form $\bar{w}$ as an exact input, which has
not passed through any gates, instead of regarding it as the result obtained from inputting
w and passing it through an inverter.
In Fig. 15.5(a), we have a gating network implementing the d.n.f. of the function f
in Example 15.11. Part (b) of the figure is the gating network for f as a minimal sum of
products. Figure 15.5(c) has a gating network for $f = \bar{w}x(\bar{y} + z) + w\bar{x}(\bar{y} + z)$.
The network in part (c) has only four logic gates, whereas that in part (b) has five such
devices. Consequently, we may feel that the network in part (c) is better with regard to
minimizing cost because each extra gate increases the cost of production. However, even
though there are fewer inputs and fewer gates for the implementation in part (c), some of
the inputs (namely, $\bar{y}$ and z) must pass through three levels of gating before providing the
output f. For the minimal sum of products in part (b), there are only two levels of gating. In
the study of gating networks, outputs are considered instantaneous functions of the input.
In practice, however, each level of gating adds a delay in the development of the function
f. For high-speed digital equipment we want to minimize delay, so we opt for more speed
at the price of increased manufacturing cost.
It is this need to maximize speed that makes us want to represent a Boolean function
as a minimal sum of products. In order to accomplish this for functions of not more than
six variables, we use a pictorial method called the Karnaugh map, developed in 1953 by
Maurice Karnaugh (1924 — ). Karnaugh maps always produce forms with at most two levels
of gating, and we shall find that the d.n.f. of a Boolean function is a major key behind this
technique.
In simplifying the d.n.f. of f in Example 15.11, we combined the two fundamental
conjunctions $\bar{w}x\bar{y}z$ and $\bar{w}xyz$ into the product term $\bar{w}xz$ because $\bar{w}x\bar{y}z + \bar{w}xyz =
\bar{w}xz(\bar{y} + y) = \bar{w}xz(1) = \bar{w}xz$. This indicates that if two fundamental conjunctions differ
in exactly one literal, then they can be combined into a product term with that literal missing.
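The combining rule just described can be stated as a small routine. In the Python sketch below, a product term is encoded (as an assumption of this sketch, not the text's notation) by a string with one position per variable: `1` for the variable, `0` for its complement, and `-` when the variable is missing from the term.

```python
def combine(term_a, term_b):
    """Merge two product terms that differ in exactly one literal.

    Returns the merged term with that literal missing, or None when the
    terms differ in more than one position or in a position where one of
    them already lacks the variable.
    """
    diff = [i for i, (a, b) in enumerate(zip(term_a, term_b)) if a != b]
    if len(diff) != 1 or "-" in (term_a[diff[0]], term_b[diff[0]]):
        return None
    i = diff[0]
    return term_a[:i] + "-" + term_a[i + 1:]

# 0101 and 0111 encode the two fundamental conjunctions for minterms 5 and 7.
print(combine("0101", "0111"))  # 01-1, that is, the product term w'xz
print(combine("0101", "1010"))  # None: the terms differ in four literals
```

Applied to the encodings of minterms 5 and 7 (in the variable order w, x, y, z), the routine returns the encoding of the three-literal product obtained in the text.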
[Figure 15.5: Gating networks for the function f(w, x, y, z) of Example 15.11: (a) the d.n.f.; (b) the minimal sum of products; (c) $f = \bar{w}x(\bar{y} + z) + w\bar{x}(\bar{y} + z)$, with gating Levels 1, 2, and 3 marked]
For $g: B^4 \to B$, where $g(w, x, y, z) = wx\bar{y}\bar{z} + wx\bar{y}z + wxyz + wxy\bar{z}$, each fundamental
conjunction (except the first) differs from its predecessor in exactly one literal. Here
we can simplify g as $g = wx\bar{y}(\bar{z} + z) + wxy(z + \bar{z}) = wx\bar{y} + wxy = wx(\bar{y} + y) = wx$.
We could have also written
$$g = wx(\bar{y}\bar{z} + \bar{y}z + yz + y\bar{z}) = wx(\bar{y} + y)(\bar{z} + z) = wx.$$
The key to this reduction process is the recognition of pairs (quadruples, ..., $2^n$-tuples)
of fundamental conjunctions where any two adjacent terms differ in exactly one literal. If
$h: B^4 \to B$, and the d.n.f. of h has 12 terms, can we move these terms around to recognize
the best reductions? The Karnaugh map organizes these terms for us.
We start with the case of two variables, w and x. Table 15.9 shows the Karnaugh maps
for the functions f(w, x) = wx and g(w, x) = w + x. (The 0's are suppressed in the tables
for these maps.)

Table 15.9
(a) wx                  (b) w + x
w \ x   0   1           w \ x   0   1
0                       0           1
1           1           1       1   1
In part (a), the 1 interior to the table indicates the fundamental conjunction wx. This
occurs in the row for w = 1 and the column for x = 1, the one case when wx = 1. In
part (b), there are three 1's in the table. The top 1 is for $\bar{w}x$, which has the value 1 exactly
when w = 0, x = 1. The bottom two 1's are for $w\bar{x}$ and $wx$, as we read the bottom row
from left to right.
Table 15.9(b) represents the d.n.f. $\bar{w}x + w\bar{x} + wx$. As a result of their adjacency in
the bottom row, the table indicates that $w\bar{x}$ and $wx$ differ in only one literal and can be
combined to yield w. By the idempotent law of addition (which is so crucial in working
with Karnaugh maps), we can use the same fundamental conjunction wx a second time
in this reduction process. The adjacency in the second column of the table indicates the
combining of $\bar{w}x$ and $wx$ to get x. (In the x column all possibilities for w, namely
w and $\bar{w}$, appear. This is a way to recognize x as the result for that column.) Thus
Table 15.9(b) illustrates that
$$\bar{w}x + w\bar{x} + wx = \bar{w}x + w\bar{x} + wx + wx = (w\bar{x} + wx) + (\bar{w}x + wx) = w(\bar{x} + x) + (\bar{w} + w)x = w(1) + (1)x = w + x.$$
EXAMPLE 15.12
We now consider three Boolean variables w, x, y. In Table 15.10, the first new idea we
encounter is in the column headings for x y. These are not the same as the headings we had
for the rows in the function tables. We see here, in going from left to right, that 00 differs
from 01 in exactly one place, 01 differs from 11 in exactly one place, 11 differs from 10 in
exactly one place, and, upon wrapping around, 10 differs from 00 in exactly one place.

Table 15.10
w \ xy   00   01   11   10
0         1              1
1         1         1

If $f(w, x, y) = \sum m(0, 2, 4, 7)$, then because 0 = 000 ($\bar{w}\bar{x}\bar{y}$), 2 = 010 ($\bar{w}x\bar{y}$), 4 =
100 ($w\bar{x}\bar{y}$), and 7 = 111 ($wxy$), we can represent these terms by placing 1's as shown in
Table 15.10. The 1 for $wxy$ is not adjacent to any other 1 in the table, so it is isolated; we
shall have $wxy$ as one of the summands in the minimal sum of products representing f. The
1 for $\bar{w}x\bar{y}$ (at the right end of the first row) is not isolated, for once again we consider the
table as wrapping around, making this 1 adjacent to the 1 for $\bar{w}\bar{x}\bar{y}$ (at the left end of the first
row). These combine (under addition) to give us $\bar{w}\bar{x}\bar{y} + \bar{w}x\bar{y} = \bar{w}\bar{y}(\bar{x} + x) = \bar{w}\bar{y}(1) =
\bar{w}\bar{y}$. Finally, the 1's in the column for x = y = 0 indicate a reduction of $\bar{w}\bar{x}\bar{y} + w\bar{x}\bar{y}$ to
$(\bar{w} + w)\bar{x}\bar{y} = (1)\bar{x}\bar{y} = \bar{x}\bar{y}$. Hence, as a minimal sum of products, $f = wxy + \bar{w}\bar{y} + \bar{x}\bar{y}$.
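The column headings 00, 01, 11, 10 used in Example 15.12 form what is commonly called a reflected Gray code. A minimal Python sketch that generates such headings, and checks the "differs in exactly one place" property including the wrap-around, is given below; the names `gray_code` and `one_place_apart` are assumptions of this sketch.

```python
def gray_code(n):
    """Return the n-bit reflected Gray code as a list of bit strings."""
    return [format(i ^ (i >> 1), f"0{n}b") for i in range(2 ** n)]

def one_place_apart(a, b):
    """True when two bit strings differ in exactly one position."""
    return sum(x != y for x, y in zip(a, b)) == 1

headings = gray_code(2)
print(headings)  # ['00', '01', '11', '10']

# Neighbouring headings, including the last and the first, differ in one place.
print(all(one_place_apart(headings[i], headings[(i + 1) % len(headings)])
          for i in range(len(headings))))  # True
```

The same function with n = 2 supplies both the row and the column headings of a four-variable map.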
EXAMPLE 15.13
From the respective parts of Table 15.11 we have
a) $f(w, x, y) = \sum m(0, 2, 4, 6) = \sum m(0, 4) + \sum m(2, 6) = (\bar{w}\bar{x}\bar{y} + w\bar{x}\bar{y}) +
(\bar{w}x\bar{y} + wx\bar{y}) = (\bar{w} + w)\bar{x}\bar{y} + (\bar{w} + w)x\bar{y} = (1)\bar{x}\bar{y} + (1)x\bar{y} = \bar{x}\bar{y} + x\bar{y} =
(\bar{x} + x)\bar{y} = (1)\bar{y} = \bar{y}$, the only variable whose value does not change when the
four terms designated by the 1's are considered. [The value of y is 0 here, so
$f(w, x, y) = \bar{y}$.]
b) $f(w, x, y) = \sum m(0, 1, 2, 3) = \bar{w}\bar{x}\bar{y} + \bar{w}\bar{x}y + \bar{w}x\bar{y} + \bar{w}xy = \bar{w}(\bar{x}\bar{y} + \bar{x}y +
x\bar{y} + xy) = \bar{w}(\bar{x} + x)(\bar{y} + y) = \bar{w}(1)(1) = \bar{w}$.
c) $f(w, x, y) = \sum m(1, 2, 3, 5, 6, 7) = \sum m(1, 3, 5, 7) + \sum m(2, 3, 6, 7) = y + x$.

Table 15.11
(a)
w \ xy   00   01   11   10
0         1              1
1         1              1
(b)
w \ xy   00   01   11   10
0         1    1    1    1
1
(c)
w \ xy   00   01   11   10
0              1    1    1
1              1    1    1
Advancing to four variables, we consider the following example.
EXAMPLE 15.14
Find a minimal-sum-of-products representation for the function
$$f(w, x, y, z) = \sum m(0, 1, 2, 3, 8, 9, 10).$$
The Karnaugh map for f in Table 15.12 combines the 1's in the four (adjacent) corners to
give the term $\bar{w}\bar{x}\bar{y}\bar{z} + \bar{w}\bar{x}y\bar{z} + w\bar{x}\bar{y}\bar{z} + w\bar{x}y\bar{z} = \bar{x}\bar{z}(\bar{w}\bar{y} + \bar{w}y + w\bar{y} + wy) = \bar{x}\bar{z}$. The
four 1's in the top row combine to give $\bar{w}\bar{x}$. (Using only the middle two 1's, we do not
make use of all the available adjacencies and get the term $\bar{w}\bar{x}z$, which has one more literal
than $\bar{w}\bar{x}$.) Finally, the 1 in the row (w = 1, x = 0) and the column (y = 0, z = 1) can be
combined with the 1 on its left, and these can then be combined with the first two 1's in
the top row to give $\bar{w}\bar{x}\bar{y}\bar{z} + \bar{w}\bar{x}\bar{y}z + w\bar{x}\bar{y}\bar{z} + w\bar{x}\bar{y}z = \bar{x}\bar{y}$. Hence, as a minimal sum of
products, $f(w, x, y, z) = \bar{x}\bar{z} + \bar{w}\bar{x} + \bar{x}\bar{y}$.

Table 15.12
wx \ yz   00   01   11   10
00         1    1    1    1
01
11
10         1    1         1
EXAMPLE 15.15
The map for $f(w, x, y, z) = \sum m(9, 10, 11, 12, 13)$ appears in Table 15.13. The only
1 in the table that has not been combined with another term is adjacent to a 1 on its right (this
combination yields $w\bar{x}z$) and to a 1 above it (this combination yields $w\bar{y}z$). Consequently,
we can represent f as a minimal sum of products in two ways: $wx\bar{y} + w\bar{x}y + w\bar{x}z$ and
$wx\bar{y} + w\bar{x}y + w\bar{y}z$. This type of representation, then, is not unique. However, we should
observe that the same number of product terms and the same total number of literals appear
in each case.

Table 15.13
wx \ yz   00   01   11   10
00
01
11         1    1
10              1    1    1
EXAMPLE 15.16
There is a right way and there is a wrong way to use a Karnaugh map.
Let $f(w, x, y, z) = \sum m(3, 4, 5, 7, 9, 13, 14, 15)$. In Table 15.14(a) we combine a
block of four 1's into the term xz. But when we account for the other four 1's, we do what
is shown in part (b). So the result in part (b) will yield f as a sum of four terms (each with
three literals), whereas the method suggested in part (a) adds the extra (unneeded) term xz.

Table 15.14 (the same 1's appear in parts (a) and (b); only the groupings differ)
wx \ yz   00   01   11   10
00                   1
01         1    1    1
11              1    1    1
10              1
(a) The block of four 1's for the minterms 5, 7, 13, 15 is grouped first.
(b) The 1's are grouped in the four pairs of minterms (3, 7), (4, 5), (9, 13), (14, 15).
The following suggestions on the use of Karnaugh maps are based on what we have done
so far. We state them now so that they may be used for larger maps.
1) Start by combining those terms in the table where there is at most one possibility for
simplification.
2) Check the four corners of a table. They may contain adjacent 1’s even though the 1’s
appear isolated.
3) In all simplifications, try to obtain the largest possible block of adjacent 1’s in order to
get a minimal product term. (Recall that 1’s can be used more than once, if necessary,
because of the idempotent law of +.)
4) If there is a choice in simplifying an entry in the table, try to use adjacent 1’s that
have not been used in any prior simplification.
EXAMPLE 15.17
If $f(v, w, x, y, z) = \sum m(1, 5, 10, 11, 14, 15, 18, 26, 27, 30, 31)$, we construct two
4 x 4 tables, one for v = 0, the other for v = 1. (See Table 15.15.)

Table 15.15
(v = 0)                            (v = 1)
wx \ yz   00   01   11   10        wx \ yz   00   01   11   10
00              1                  00                        1
01              1                  01
11                   1    1        11                   1    1
10                   1    1        10                   1    1

Following the order of the variables, we write, for example, 5 = 00101 in order to indicate
the need for a 1 in the second row and second column of the table for v = 0. The other
five 1's in the table where v = 0 are for the minterms for 1, 10, 11, 14, 15. The minterms for
18, 26, 27, 30, 31 are represented by the five 1's in the table where v = 1. After filling in all
the 1's, we see that the 1 in the first row, fourth column of the table for v = 1 can be combined
with another term in only one way, with $vw\bar{x}y\bar{z}$, yielding the product $v\bar{x}y\bar{z}$. This is also
true for the two 1's in the second column of the (v = 0) table. These give the product $\bar{v}\bar{w}\bar{y}z$.
The block of eight 1's yields wy, and we have $f(v, w, x, y, z) = wy + \bar{v}\bar{w}\bar{y}z + v\bar{x}y\bar{z}$.
A function f of the six variables t, v, w, x, y, and z requires four tables, one for each of
the cases (a) t = 0, v = 0; (b) t = 0, v = 1; (c) t = 1, v = 1; and (d) t = 1, v = 0. Beyond
six variables, this method becomes overly complicated. Another procedure, the Quine-
McCluskey Method, can be used. For a large number of variables the method is tedious to
perform by hand, but it is a systematic procedure suitable for computer implementation,
particularly for computers possessing some type of “binary compare” command. (More
about this technique is given in Chapter 7 of Reference [3].)
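To give a feel for why such a procedure suits a computer, here is a minimal Python sketch of only the first phase commonly associated with the Quine-McCluskey method: repeatedly merging terms that differ in one literal until no more merges are possible. It is a sketch under stated assumptions, not the full method (it does not select a minimal cover from the resulting terms). The string encoding (`1`, `0`, `-` per variable) and the names `combine` and `prime_implicants` are choices made for this illustration.

```python
from itertools import combinations

def combine(a, b):
    """Merge two terms that differ in exactly one specified position."""
    diff = [i for i in range(len(a)) if a[i] != b[i]]
    if len(diff) == 1 and "-" not in (a[diff[0]], b[diff[0]]):
        i = diff[0]
        return a[:i] + "-" + a[i + 1:]
    return None

def prime_implicants(minterms, n):
    """Return the product terms that cannot be merged any further."""
    terms = {format(m, f"0{n}b") for m in minterms}
    primes = set()
    while terms:
        merged, used = set(), set()
        for a, b in combinations(sorted(terms), 2):
            c = combine(a, b)
            if c is not None:
                merged.add(c)
                used.update((a, b))
        primes |= terms - used   # terms that took part in no merge
        terms = merged           # continue with the shorter products
    return primes

# The function of Example 15.14, variables in the order w, x, y, z.
print(sorted(prime_implicants([0, 1, 2, 3, 8, 9, 10], 4)))
# ['-0-0', '-00-', '00--'], i.e. x'z', x'y', w'x'
```

For Example 15.14 the three surviving terms happen to be exactly the summands of the minimal sum of products; in general a further selection step is needed.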
We close this section with an example involving the dual concept— namely, a minimal
product of sums.
EXAMPLE 15.18
For $g(w, x, y, z) = \prod M(1, 5, 7, 9, 10, 13, 14, 15)$, this time we place a 0 in each of the
positions for the binary equivalents of the maxterms listed. This yields the results shown in
Table 15.16 (where the 1's are suppressed).

Table 15.16
wx \ yz   00   01   11   10
00              0
01              0    0
11              0    0    0
10              0         0

The 0 in the lower right-hand corner can only be combined with the 0 above it, and
so we have $(\bar{w} + x + \bar{y} + z)(\bar{w} + \bar{x} + \bar{y} + z) = \bar{w} + \bar{y} + z + x\bar{x} = (\bar{w} + \bar{y} + z) + 0 =
\bar{w} + \bar{y} + z$. The block of four 0's (for the maxterms for 5, 7, 13, 15) simplifies to $\bar{x} + \bar{z}$,
whereas the four 0's (for the maxterms for 1, 5, 9, 13) in the second column yield $y + \bar{z}$. So
$g(w, x, y, z) = (\bar{w} + \bar{y} + z)(\bar{x} + \bar{z})(y + \bar{z})$, a minimal product of sums.
3. Answer Exercise 2, replacing NAND by NOR.
EXERCISES 15.2 .
4. Using inverters, AND gates, and OR gates, construct
gating networks for
1. Using inverters, AND gates, and OR gates, construct the
gates shown in Fig. 15.6. a) f(y, 7) =xz+yzZ+x
2. Using only NAND* gates (see Fig. 15.6), construct the in-
b) gQ,y, 2) = (+2) + 2)%
verter, AND gate, and OR gate. c) h(x, y, z) = (xy @ yz)
*The NAND gate is constructed in a very simple manner from transistors — both in the old-fashioned technol-
ogy of semiconductors as well as in the more recent techniques of silicon chip fabrication, Furthermore, most of
the gating networks that represent what is actually happening inside of today’s computers contain large numbers
of these NAND gates.
x
y om f(x,y) =x@Oy
EXCLUSIVE-OR gate
x —>
pe aten
yY —
y tl
% Y
g(x,
y) = xy
NAND gate
xX —>
Atx, y)
yy —>
A(x, y)=xt+y
NOR gate
(b)
Figure 15.6
Figure 15.8
d) f(w, x, y,z) = >, m6, 6, 8, 11, 12, 13, 14, 15)
5. For the network in Fig. 15.7, express f as a function of
w,Xx, y, Zz. e) f(w, x,y,z) = >. mC, 9, 10, 11, 14, 15)
f) f(v, wx, y, 2) =
6. Implement the half-adder of Fig. 15.3 using only (a) NAND m(1, 2, 3, 4, 10, 17, 18, 19, 22, 23, 27, 28, 30, 31)
gates; (b) NOR gates.
10. Obtain a minimal-product-of-sums representation for
7. For each of the networks in Fig. 15.8 express the output in fiw, x,y, =] ] MO, 1, 2, 4,5, 10, 12, 13, 14).
terms of the Boolean variables x, y or their complements. Then 11. Let f: B" > B be a function of the Boolean variables
use the expression for the output to simplify the given network. X1,.X2,...,X,. Determine nv if the number of 1’s needed to
8. For each of the following Boolean functions f, design a express x, in the Karnaugh map for f is (a) 2; (b) 4; (c) 8;
two-level gating network for f as a minimal sum of products. (d) 2, fork € Zt with 1 <k<n—-1.
a) f: B> — B, where f(x, y, z) = lifand only if exactly 12. If g: B’ > B is a Boolean function of the Boolean vari-
two of the variables have the value 1. ables x1, X2, ..., ¥7, how many 1’s are needed in the Karnaugh
map of g in order to represent the product term (a) x1; (b) x142;
b) f: B4 — B, where f(w, x, y, z) = 1 if and only if an
(C) X1 X23; (Gd) x1X3X5X7?
odd number of variables have the value 1.
13. In each of the following, f: B* > B, where the Boolean
9, Find a minimal-sum-of-products representation for
variables (in order) are w, x, y, and z. Determine | f~'(0)| and
a) f(w,x, y) = >> m(1, 2,5, 6) | f-'(1)| if, as a minimal sum of products, f reduces to
b) fiw, x, y) =|] MO, 1,4, 5) a)x b) wy Cc) wyz
c) f(w, x,y,z) =) m(0, 2, 5,7, 8, 10, 13, 15) dj) x+y e) xy +2 f) xyz+w
w—>
x —>
_
y ——> |
|or
ao,
Z
v
Figure 15.7
15.3 Further Applications: Don't-Care Conditions
Our objective now is to use the ideas we have developed in the first two sections in a variety
of applications.
EXAMPLE 15.19
As head of the church bazaar, Paula has volunteered to leave her automobile dealership
early one evening in order to bake a cake that will be sold at the bazaar. Members of the
bazaar committee volunteer to donate the needed ingredients as shown in Table 15.17.

Table 15.17
           Flour   Milk   Butter   Pecans   Eggs
Sue          x               x
Dorothy                      x        x
Bettie       x       x
Theresa              x                        x
Ruthanne             x       x        x
Paula sends her daughter Amy to pick up the ingredients. Write a Boolean expression to
help Paula determine which (minimal) sets of volunteers she should consider so that Amy
can collect all of the necessary ingredients.
Let s, d, b, t, and r denote five Boolean variables corresponding, respectively, to the
women listed in the first column of the table. To get the flour, Amy must visit Sue or Bettie.
In Boolean terminology, we can say that flour determines the sum s + b. This term will be
part of a product of sums. For the other ingredients, the following sums denote the choices.
milk: b + t + r     butter: s + d + r     pecans: d + r     eggs: t
To answer the question posed here, we seek a minimal sum of products for the function
$f(s, d, b, t, r) = (s + b)(b + t + r)(s + d + r)(d + r)t$. The answer can be obtained by
multiplying everything out and then simplifying the result, or by using a Karnaugh map.
This time we’ll use the map (in Table 15.18).
Table 15.18
(s = 0)                          (s = 1)
db \ tr   00   01   11   10      db \ tr   00   01   11   10
00         0    0    0    0      00         0    0    1    0
01         0    0    1    0      01         0    0    1    0
11         0    0    1    1      11         0    0    1    1
10         0    0    0    0      10         0    0    1    1
We are starting with f as a product (not minimal) of sums. Consequently, we first fill in
the 0’s of the table as follows: Here s + b, for example, is represented by the eight 0’s in
the first and fourth rows of the table for s = 0; these are the eight assignments for s, d, b,
t, r where s + b has the value 0; for t we need the 16 0's in the first two columns of both
tables. After filling in the 0’s for the other three sums in the product, we then place a 1 in
the nine remaining spaces and arrive at the table shown. Now we need a minimal sum of
products for the nine 1’s in the table. We find the result is srt + sdt + brt + dbt. (Verify
this.) Therefore, Amy can be sent to collect the ingredients in one of four ways. She may
call upon Sue, Ruthanne, and Theresa — or, perhaps, Dorothy, Bettie, and Theresa — or she
may follow through with one of her other two options.
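The alternative mentioned in Example 15.19, multiplying everything out and then simplifying, can be sketched in Python as follows. Each product term is held as a set of variables, so the idempotent law is applied automatically, and absorption is applied after each factor is multiplied in. The function name `multiply_out` and this set-based encoding are assumptions of the sketch; it applies as written only to sums of uncomplemented variables, as in this example.

```python
def multiply_out(sums):
    """Expand a product of sums of uncomplemented variables into a sum of
    products, using idempotence (sets) and absorption (drop supersets)."""
    products = {frozenset()}
    for s in sums:
        expanded = {p | {v} for p in products for v in s}
        # Absorption: a term is dropped when it properly contains another term.
        products = {p for p in expanded if not any(q < p for q in expanded)}
    return products

# The sums for flour, milk, butter, pecans, and eggs from Example 15.19.
sums = ["sb", "btr", "sdr", "dr", "t"]
result = multiply_out(sums)
print(sorted("".join(sorted(p)) for p in result))
# ['bdt', 'brt', 'dst', 'rst']
```

Up to the order of the letters within each product, these are the four terms srt, sdt, brt, and dbt found from the Karnaugh map.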
In our next application, we examine a certain property of graphs. This property was
introduced earlier in Supplementary Exercise 10 of Chapter 11. The development here,
however, does not rely on that prior presentation.
Definition 15.4 Let G = (V, E) denote a graph (undirected) with vertex set V and edge set E. A subset D
of V is called a dominating set for G if for every $v \in V$, either $v \in D$ or v is adjacent to a
vertex in D.
For the graph shown in Fig. 15.9, the sets {a, d}, {a, c, e} and {b, d, e, f} are examples
of dominating sets. The set {a, c, e} is a minimal dominating set, for if any of the three
vertices a, c, or e is removed, the remaining two no longer dominate the graph. The set
{a, d} is also minimal, but {b, d, e, f} is not because {b, d, e} already dominates G.

[Figure 15.9: An undirected graph G on the vertices a, b, c, d, e, f]
EXAMPLE 15.20
For the graph shown in Fig. 15.9, let the vertices represent cities and the edges highways.
We wish to build hospitals in some of these cities so that each city either has a hospital or
is adjacent to a city that does. In how many ways can this be accomplished by building a
minimal number of hospitals in each case?
To answer this question, we need the minimal dominating sets for G. Consider vertex
a. To guarantee that a will satisfy our objective, we must build a hospital in a, or b, or d,
or f (since b, d, and f are all adjacent to a). Hence we have the term a + b + d + f. For
b to satisfy our objective, we generate the term a + b + c + d. Continuing with the other
four locations, we find that the answer is then a minimal-sum-of-products representation for
the Boolean function $g(a, b, c, d, e, f) = (a + b + d + f)(a + b + c + d)(b + c + d)
(a + b + c + d + e)(d + e + f)(a + e + f)$. Using the properties of Boolean variables,
we have
$$\begin{aligned}
g &= (a + b + d + f)(b + c + d)(d + e + f)(a + e + f) && \text{Absorption Law} \\
  &= [(a + f)c + (b + d)][da + (e + f)] && \text{Distributive Law of } + \text{ over } \cdot, \text{ and the Commutative Law of } + \\
  &= [ac + fc + b + d][da + e + f] && \text{Distributive Law of } \cdot \text{ over } + \\
  &= acda + ace + acf + fcda + fce + fcf + bda + be + bf + dda + de + df && \text{Distributive Law of } \cdot \text{ over } + \\
  &= ace + (acf + acdf + cef + cf) + (acd + abd + ad) + be + bf + de + df && \text{Commutative and Associative Laws of } + \text{ and } \cdot, \text{ and the Idempotent Law of } \cdot \\
  &= ace + cf + ad + be + bf + de + df && \text{Absorption Law}
\end{aligned}$$
Consequently, in six of the cases the objective can be achieved by building only two
hospitals. If a and c have the largest populations and we want to locate hospitals in each of
these cities, then we would also have to construct a hospital at e.
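The seven minimal dominating sets of Example 15.20 can also be found by exhaustive search, which is a different technique from the algebraic simplification used in the text. In the Python sketch below, the dictionary `NEIGHBORHOODS` is an assumption read off from the six sums in the example (each vertex together with the vertices adjacent to it), since the figure itself is not reproduced here; the function names are likewise chosen for this sketch.

```python
from itertools import combinations

# Each vertex with its adjacent vertices, taken from the sums in Example 15.20,
# e.g. the term a + b + d + f for vertex a.
NEIGHBORHOODS = {
    "a": set("abdf"), "b": set("abcd"), "c": set("bcd"),
    "d": set("abcde"), "e": set("def"), "f": set("aef"),
}

def dominates(subset):
    """True when every vertex is in the subset or adjacent to a member of it."""
    return all(NEIGHBORHOODS[v] & subset for v in NEIGHBORHOODS)

def minimal_dominating_sets():
    """Search subsets by increasing size, skipping supersets of earlier finds."""
    found = []
    vertices = sorted(NEIGHBORHOODS)
    for size in range(1, len(vertices) + 1):
        for combo in combinations(vertices, size):
            s = set(combo)
            if dominates(s) and not any(m < s for m in found):
                found.append(s)
    return found

print(["".join(sorted(s)) for s in minimal_dominating_sets()])
# ['ad', 'be', 'bf', 'cf', 'de', 'df', 'ace']
```

The output lists six sets of size two and the single set {a, c, e} of size three, in agreement with the seven products obtained algebraically.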
The next application we shall examine introduces the notion of “don’t-care” conditions.
The four input lines for the gating network shown in Fig. 15.10 provide the binary equiva-
EXAMPLE 15.21
lents of the digits 0, 1, 2, ..., 9, with each number represented as abce (e is least signifi-
cant). Construct a gating network with two levels of gating such that the output function f
equals | for the input that represents the digits 0, 3, 6, 9 (that is, f detects digits divisible
by 3).
Table 15.19
a|b|{cle|f a|bicle|f
0;0/0/0/ 1 1/olololo
0/0/0/1/] 0 1/o0!ol/]1]41
01;0/;]1j,0)] 0 1;0/;/11]0] x
O;O0;1)]1 4] 1 1/o0/1!11!1*~x
0; 1;0); 0] 0 1;/1/0/0|] x
5. Multiple
<__,| of three L-—» f O;} 1/0} 110 1}/1/0/1]x
e ——»| detector O;1)1/0) 1 1/1/:1/0!1~x
Figure 15.10 O;1)1}1)]0 1} i1}1/1)*x
Before concluding that f = 0 for the other 12 cases, we examine Table 15.19, where an
“” appears for the value of f in the last six cases. These input combinations do not occur
(because of certain external constraints), so we don’t care what the value of f is in these sit-
uations. For such occurrences, the outputs are referred to as unspecified and f is called
incompletely specified. Therefore, we write f = >> m(0, 3, 6, 9) + d(10, 11, 12, 13,
14, 15), where d(10, 11, 12, 13, 14, 15) denotes the six don’t-care conditions for the rows
with the binary labels for 10, 11, 12, 13, 14, 15. When seeking a minimal-sum-of-products
representation for f, we can use any or all of these don’t-care conditions in the simplification
process.
From the Karnaugh map in Table 15.20, we write f as a minimal sum of products,
obtaining
$$f = \bar{a}\,\bar{b}\,\bar{c}\,\bar{e} + \bar{b}ce + bc\bar{e} + ae.$$
The first summand in f is for recognition of 0; $\bar{b}ce$ provides recognition for 3 because
it stands for 0011 ($\bar{a}\,\bar{b}ce$), since 1011 ($a\bar{b}ce$) does not occur. Likewise $bc\bar{e}$ is needed to
recognize 6, whereas $ae$ takes care of 9. Figure 15.11 provides the interior details (minus
the inverters) of Fig. 15.10. (Note that in Table 15.20 there are some don’t-care conditions
that were not used.)
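Table 15.20
ab \ ce | 00 | 01 | 11 | 10
00      | 1  |    | 1  |
01      |    |    |    | 1
11      | ×  | ×  | ×  | ×
10      |    | 1  | ×  | ×
Figure 15.11 [Two-level gating network for f, the interior of Fig. 15.10 with the inverters omitted]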
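The claim in Example 15.21 can be checked mechanically. The following is a minimal sketch, assuming the inputs are encoded as the integers 0 and 1; the function names `f_min` and `check_digits` are illustrative and not part of the text. It compares the minimal sum of products with "divisible by 3" on the ten inputs that can actually occur and ignores the six don't-care rows.

```python
def f_min(a, b, c, e):
    """Minimal sum of products from Example 15.21; each input is 0 or 1."""
    na, nb, nc, ne = 1 - a, 1 - b, 1 - c, 1 - e      # complements of the inputs
    return (na & nb & nc & ne) | (nb & c & e) | (b & c & ne) | (a & e)


def check_digits():
    """Return True if f_min detects multiples of 3 on the digits 0 through 9."""
    for digit in range(10):
        # abce is the 4-bit binary form of the digit, with e least significant
        a, b, c, e = (digit >> 3) & 1, (digit >> 2) & 1, (digit >> 1) & 1, digit & 1
        if f_min(a, b, c, e) != int(digit % 3 == 0):
            return False
    return True


print(check_digits())
```

The rows 10 through 15 are deliberately left out of the comparison, since the value of f there is unspecified.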
We close this section with one more example on how to use don’t-care conditions.
EXAMPLE 15.22 Find a minimal-sum-of-products representation for the incompletely specified
Boolean function
$$f(w, x, y, z) = \sum m(0, 1, 2, 8, 15) + d(9, 11, 12).$$
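Consider the Karnaugh map in Table 15.21. As in the previous examples each minterm
is represented by a 1 in the table; each don't-care condition is designated by an ×. The
1 representing $\bar{w}\,\bar{x}\,y\,\bar{z}$ (at the right end of the first row) can be simplified in only one
way: using the "adjacent" 1 for $\bar{w}\,\bar{x}\,\bar{y}\,\bar{z}$. This gives us
$\bar{w}\,\bar{x}\,y\,\bar{z} + \bar{w}\,\bar{x}\,\bar{y}\,\bar{z} = \bar{w}\,\bar{x}\,\bar{z}(y + \bar{y}) = \bar{w}\,\bar{x}\,\bar{z}$.
Likewise the 1 for the fundamental conjunction $wxyz$ is only adjacent to an ×, the one
for the don't-care condition $w\bar{x}yz$. This adjacency simplifies to $wxyz + w\bar{x}yz = wyz$.
Finally, the remaining 1's for the fundamental conjunctions $\bar{w}\,\bar{x}\,\bar{y}\,z$ and $w\,\bar{x}\,\bar{y}\,\bar{z}$ can be
used with the minterm for 0 (namely, $\bar{w}\,\bar{x}\,\bar{y}\,\bar{z}$) and the don't-care condition $w\,\bar{x}\,\bar{y}\,z$.
This gives us
$\bar{w}\,\bar{x}\,\bar{y}\,\bar{z} + \bar{w}\,\bar{x}\,\bar{y}\,z + w\,\bar{x}\,\bar{y}\,\bar{z} + w\,\bar{x}\,\bar{y}\,z = (\bar{w}\,\bar{z} + \bar{w}z + w\bar{z} + wz)\,\bar{x}\,\bar{y} = (\bar{w} + w)(\bar{z} + z)\,\bar{x}\,\bar{y} = \bar{x}\,\bar{y}$.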
Table 15.21
wx \ yz | 00 | 01 | 11 | 10
00      | 1  | 1  |    | 1
01      |    |    |    |
11      | ×  |    | 1  |
10      | 1  | ×  | ×  |
[Note the following:
1) In the third simplification we used the fundamental conjunction $\bar{w}\,\bar{x}\,\bar{y}\,\bar{z}$ a second
time. It was also used in the first simplification since it is adjacent to the fundamental
conjunction $\bar{w}\,\bar{x}\,y\,\bar{z}$. However, this does not present a problem here because of the
idempotent law of +.
2) The don't-care condition for $wx\bar{y}\,\bar{z}$ was not used.]
Consequently, as a minimal sum of products,
$$f(w, x, y, z) = \sum m(0, 1, 2, 8, 15) + d(9, 11, 12) = \sum m(0, 1, 2, 8, 15) + d(9, 11) = \bar{w}\,\bar{x}\,\bar{z} + wyz + \bar{x}\,\bar{y}.$$
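The result of Example 15.22 can be confirmed with a short check. This is a sketch under the assumption that rows are numbered by the binary label wxyz; the names `f_min`, `agrees_on_care_set`, and `used_dont_cares` are hypothetical helpers introduced only for illustration. It verifies that the minimal sum of products agrees with the specification on every row that is not a don't-care, and reports which don't-care rows the expression happens to cover.

```python
MINTERMS = {0, 1, 2, 8, 15}
DONT_CARES = {9, 11, 12}


def bits(row):
    """Split a row label 0..15 into (w, x, y, z), with z least significant."""
    return (row >> 3) & 1, (row >> 2) & 1, (row >> 1) & 1, row & 1


def f_min(w, x, y, z):
    """The minimal sum of products obtained in Example 15.22."""
    nw, nx, ny, nz = 1 - w, 1 - x, 1 - y, 1 - z
    return (nw & nx & nz) | (w & y & z) | (nx & ny)


def agrees_on_care_set():
    """True if f_min equals the specified value on every row that can occur."""
    return all(
        f_min(*bits(row)) == int(row in MINTERMS)
        for row in range(16)
        if row not in DONT_CARES
    )


def used_dont_cares():
    """Don't-care rows on which f_min evaluates to 1."""
    return sorted(row for row in DONT_CARES if f_min(*bits(row)) == 1)


print(agrees_on_care_set(), used_dont_cares())
```

The second value printed lists 9 and 11 only, which matches the remark that the don't-care condition for row 12 was not used.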
EXERCISES 15.3
1. For his tenth birthday, Mona wants to buy her son Jason some stamps for his collection.
At the hobby shop she finds six different packages (which we shall call u, v, w, x, y, z).
The kinds of stamps in each of these packages are shown in Table 15.22.
Determine all minimal combinations of packages Mona can buy so that Jason will get some
stamps from all four geographical locations.
Table 15.22 [Packages u, v, w, x, y, z (rows) against United States, European, Asian, African stamps (columns); check-mark entries not recoverable]
2. Rework Example 15.20 using a Karnaugh map on six variables.
3. Find a minimal-sum-of-products representation for
a) $f(w, x, y, z) = \sum m(1, 3, 5, 7, 9) + d(10, 11, 12, 13, 14, 15)$
b) $f(w, x, y, z) = \sum m(0, 5, 6, 8, 13, 14) + d(4, 9, 11)$
c) $f(v, w, x, y, z) = \sum m(0, 2, 3, 4, 5, 6, 12, 19, 20, 24, 28) + d(1, 13, 16, 29, 31)$
4. The four input lines for the gating network shown in Fig. 15.12 provide the binary
equivalents of the numbers 0, 1, 2, ..., 15, where each number is represented as abce,
with e the least significant bit.
a) Determine the d.n.f. of f, whose value is 1 for abce prime, and 0 otherwise.
b) Draw the two-level gating network for f as a minimal sum of products.
c) We are informed that the given network is part of a larger network and that, as a result,
the binary equivalents of the numbers 10 through 15 are never provided as input. Design
a two-level gating network for f under these circumstances.
Figure 15.12 [Block diagram: the input lines a, b, c, e enter a "prime number detector" whose output is f]
5. Determine all minimal dominating sets for the graph G shown in Fig. 15.13.
Figure 15.13 [Graph G; vertex and edge data not recoverable]
15.4
The Structure of a Boolean
Algebra (Optional)
In this last section we analyze the structure of a Boolean algebra and determine those
$m \in \mathbf{Z}^+$ for which there is a Boolean algebra of m elements.
Definition 15.5 Let $\mathcal{B}$ be a nonempty set that contains two special elements 0 (the zero element) and 1 (the
unity, or one, element) and on which we define closed binary operations $+$, $\cdot$, and a monary
(or unary) operation $\bar{\ }$. Then $(\mathcal{B}, +, \cdot, \bar{\ }, 0, 1)$ is called a Boolean algebra if the following
conditions are satisfied for all $x, y, z \in \mathcal{B}$.
a) $x + y = y + x$          a)′ $xy = yx$                     Commutative Laws
b) $x(y + z) = xy + xz$     b)′ $x + yz = (x + y)(x + z)$     Distributive Laws
c) $x + 0 = x$              c)′ $x1 = x \cdot 1 = x$          Identity Laws
d) $x + \bar{x} = 1$        d)′ $x\bar{x} = x \cdot \bar{x} = 0$   Inverse Laws
e) $0 \ne 1$
As seen in Definition 15.5, we often write $xy$ for $x \cdot y$. When the operations and identity
elements are known, we write $\mathcal{B}$ instead of $(\mathcal{B}, +, \cdot, \bar{\ }, 0, 1)$.
From our past experience we have the following examples.
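EXAMPLE 15.23 If $\mathcal{U}$ is a (finite) set, then $\mathcal{B} = \mathcal{P}(\mathcal{U})$ is a Boolean algebra where for
$A, B \subseteq \mathcal{U}$, we have $A + B = A \cup B$, $AB = A \cap B$, $\bar{A}$ = the complement of A (in $\mathcal{U}$),
and where $\emptyset$ is the zero element and $\mathcal{U}$ is the unity.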
EXAMPLE 15.24 For $n \in \mathbf{Z}^+$, $F_n = \{f\colon B^n \to B\}$, the set of Boolean functions on n Boolean
variables, is a Boolean algebra where $+$, $\cdot$, and $\bar{\ }$ are as defined in Definition 15.2, and where
the zero element is the constant function 0, while the constant function 1 is the one element.
Let us now examine a new type of Boolean algebra.
EXAMPLE 15.25 Let $\mathcal{B}$ be the set of all positive integer divisors of 30: $\mathcal{B} = \{1, 2, 3, 5, 6, 10, 15, 30\}$.
For all $x, y \in \mathcal{B}$, define $x + y = \operatorname{lcm}(x, y)$; $xy = \gcd(x, y)$; and $\bar{x} = 30/x$. Then with 1 as
the zero element and 30 as the unity element, one can verify that $(\mathcal{B}, +, \cdot, \bar{\ }, 1, 30)$ is a
Boolean algebra. We shall establish one of the distributive laws for this Boolean algebra
and leave the other conditions for the reader to check.
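For the first distributive law we want to show that
$$\gcd(x, \operatorname{lcm}(y, z)) = \operatorname{lcm}(\gcd(x, y), \gcd(x, z)),$$
for all $x, y, z \in \mathcal{B}$. In order to do so we write
$$x = 2^{k_1} 3^{k_2} 5^{k_3}, \quad y = 2^{m_1} 3^{m_2} 5^{m_3}, \quad \text{and} \quad z = 2^{n_1} 3^{n_2} 5^{n_3},$$
where $0 \le k_i, m_i, n_i \le 1$ for all $1 \le i \le 3$.
Then $\operatorname{lcm}(y, z) = 2^{s_1} 3^{s_2} 5^{s_3}$, where $s_i = \max\{m_i, n_i\}$, for all $1 \le i \le 3$, and so
$\gcd(x, \operatorname{lcm}(y, z)) = 2^{t_1} 3^{t_2} 5^{t_3}$, where $t_i = \min\{k_i, \max\{m_i, n_i\}\}$, for all $1 \le i \le 3$. Also,
$\gcd(x, y) = 2^{u_1} 3^{u_2} 5^{u_3}$, where $u_i = \min\{k_i, m_i\}$, when $1 \le i \le 3$, and $\gcd(x, z) = 2^{v_1} 3^{v_2} 5^{v_3}$
with $v_i = \min\{k_i, n_i\}$ for all $1 \le i \le 3$. So $\operatorname{lcm}(\gcd(x, y), \gcd(x, z)) = 2^{w_1} 3^{w_2} 5^{w_3}$, where
$w_i = \max\{u_i, v_i\}$, for all $1 \le i \le 3$.
Therefore, for each $i \in \{1, 2, 3\}$, $w_i = \max\{u_i, v_i\} = \max\{\min\{k_i, m_i\}, \min\{k_i, n_i\}\}$,
and $t_i = \min\{k_i, \max\{m_i, n_i\}\}$. To verify the result, we need to show that $w_i = t_i$ for all
$1 \le i \le 3$. If $k_i = 0$, then $w_i = 0 = t_i$. If $k_i = 1$, then $w_i = \max\{m_i, n_i\} = t_i$. This exhausts
all possibilities, so $w_i = t_i$ for $1 \le i \le 3$ and
$$\gcd(x, \operatorname{lcm}(y, z)) = \operatorname{lcm}(\gcd(x, y), \gcd(x, z)).$$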
If we analyze this result further, we find that 30 can be replaced by any number
$m = p_1 p_2 p_3$, where $p_1, p_2, p_3$ are distinct primes. In fact, the result follows for the set of all
divisors of $p_1 p_2 \cdots p_n$, a product of n distinct primes. (Note that such a product is square-free;
that is, there is no $k \in \mathbf{Z}^+$, $k > 1$, with $k^2$ dividing it.)
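The remaining conditions that Example 15.25 leaves to the reader can be checked by brute force. The sketch below is only an illustration, not a proof; the function names `divisors`, `lcm`, and `is_boolean_algebra` are assumptions introduced here. It tests conditions (a) through (e) of Definition 15.5 for the divisors of a given m, with lcm as +, gcd as the product, m/x as the complement, 1 as the zero element, and m as the one element.

```python
from math import gcd


def divisors(m):
    """All positive integer divisors of m."""
    return [d for d in range(1, m + 1) if m % d == 0]


def lcm(x, y):
    return x * y // gcd(x, y)


def is_boolean_algebra(m):
    """Check Definition 15.5 for the divisors of m under lcm, gcd, and x -> m/x."""
    B = divisors(m)
    for x in B:
        comp = m // x
        if lcm(x, 1) != x or gcd(x, m) != x:              # identity laws
            return False
        if lcm(x, comp) != m or gcd(x, comp) != 1:        # inverse laws
            return False
        for y in B:
            if lcm(x, y) != lcm(y, x) or gcd(x, y) != gcd(y, x):   # commutative laws
                return False
            for z in B:
                if gcd(x, lcm(y, z)) != lcm(gcd(x, y), gcd(x, z)):  # x(y+z) = xy+xz
                    return False
                if lcm(x, gcd(y, z)) != gcd(lcm(x, y), lcm(x, z)):  # x+yz = (x+y)(x+z)
                    return False
    return m != 1                                          # condition (e): 0 != 1


print(is_boolean_algebra(30), is_boolean_algebra(210), is_boolean_algebra(60))
```

For m = 60 the inverse laws are the ones that fail: with x = 2 the proposed complement is 30, and lcm(2, 30) = 30 rather than 60, which illustrates why the square-free hypothesis matters.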
EXAMPLE 15.26 A word about the propositional calculus. If p, q are two primitive propositions, we may
feel that the collection of all propositions obtained from p, q, using $\lor$, $\land$, and $\lnot$, should be
a Boolean algebra. After all, just look at the laws of logic and the way they compare with
the comparable results for set theory and Boolean functions. There is one main difference.
In our study of logic we found, for example, that $p \land q \iff q \land p$, not that $p \land q = q \land p$.
To get around this we define a relation $\mathcal{R}$ on the set S of all propositions so obtained from
p, q, where $s_1 \mathrel{\mathcal{R}} s_2$ if $s_1 \iff s_2$. Then $\mathcal{R}$ is an equivalence relation on S and partitions S, in
this case, into 16 equivalence classes. If we define $+$, $\cdot$, and $\bar{\ }$ on these equivalence classes
by $[s_1] + [s_2] = [s_1 \lor s_2]$, $[s_1][s_2] = [s_1 \land s_2]$, and $\overline{[s_1]} = [\lnot s_1]$, and if we recognize $[T_0]$
as the one element and $[F_0]$ as the zero element, then we get a Boolean algebra.
In the definition of a Boolean algebra, there are nine conditions. Yet in the lists of
properties we examined for set theory, logic, and Boolean functions, we listed 19 properties.
And there were even more! Undoubtedly, there is a way to get the remaining properties,
and others not listed among the 19, from the ones given in the definition.
THEOREM 15.1 The Idempotent Laws. For all $x \in \mathcal{B}$, a Boolean algebra, (i) $x + x = x$; and (ii) $xx = x$.
Proof: (To the right of each equality appearing in this proof, we list the letter of the
condition from Definition 15.5 that justifies it.)
i)  $x = x + 0$                      c)       ii)  $x = x \cdot 1$                  c)′
    $= x + x\bar{x}$                 d)′           $= x(x + \bar{x})$               d)
    $= (x + x)(x + \bar{x})$         b)′           $= xx + x\bar{x}$                b)
    $= (x + x) \cdot 1$              d)            $= xx + 0$                       d)′
    $= x + x$                        c)′           $= xx$                           c)
In proving this theorem we can obtain the proof of part (ii) from that of part (i) by
changing all occurrences of + to -, and vice versa, and all occurrences of 0 to 1, and vice
versa. Also, the justifications for the corresponding steps constitute a pair of conditions in
Definition 15.5. As in the past, these pairs are said to be duals of each other; condition (e)
is called self-dual. This now leads us to the following result.
THEOREM 15.2 The Principle of Duality. If s is a theorem about a Boolean algebra, and s can be proved
from the conditions in Definition 15.5 and properties derived from these same conditions,
then its dual $s^d$ is likewise a theorem.
Proof: Let s be such a theorem. Dualizing all the steps and reasons in the proof of s (as in
the proof of Theorem 15.1), we obtain a proof for $s^d$.
We now list some further properties for a Boolean algebra. We shall prove some of these
properties and leave the remaining proofs for the reader.
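THEOREM 15.3 For every Boolean algebra $\mathcal{B}$, if $x, y, z \in \mathcal{B}$, then
a) $x \cdot 0 = 0$                      a)′ $x + 1 = 1$                          Dominance Laws
b) $x(x + y) = x$                       b)′ $x + xy = x$                         Absorption Laws
c) $[xy = xz \text{ and } \bar{x}y = \bar{x}z] \Rightarrow y = z$                Cancellation Laws
c)′ $[x + y = x + z \text{ and } \bar{x} + y = \bar{x} + z] \Rightarrow y = z$
d) $x(yz) = (xy)z$                      d)′ $x + (y + z) = (x + y) + z$          Associative Laws
e) $[x + y = 1 \text{ and } xy = 0] \Rightarrow y = \bar{x}$                     Uniqueness of Complements (Inverses)
f) $\bar{\bar{x}} = x$                                                           Law of the Double Complement
g) $\overline{xy} = \bar{x} + \bar{y}$  g)′ $\overline{x + y} = \bar{x}\,\bar{y}$   DeMorgan's Laws
h) $\bar{0} = 1$                        h)′ $\bar{1} = 0$
i) $x\bar{y} = 0$ if and only if $xy = x$      i)′ $x + \bar{y} = 1$ if and only if $x + y = x$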
Proof:
a) $x \cdot 0 = 0 + x \cdot 0$, by Definition 15.5(c), (a)
   $= x \cdot \bar{x} + x \cdot 0$, by Definition 15.5(d)′
   $= x \cdot (\bar{x} + 0)$, by Definition 15.5(b)
   $= x \cdot \bar{x}$, by Definition 15.5(c)
   $= 0$, by Definition 15.5(d)′
a)′ This follows from part (a) by the Principle of Duality.
c) Here $y = 1 \cdot y = (x + \bar{x})y = xy + \bar{x}y = xz + \bar{x}z = (x + \bar{x})z = 1 \cdot z = z$. (Verify
all equalities.)
c)′ This is the dual of part (c).
d) To establish this result, we use result (c)′ and arrive at the conclusion by showing
that $x + [x(yz)] = x + [(xy)z]$ and $\bar{x} + [x(yz)] = \bar{x} + [(xy)z]$. Using the absorption
law, we find that $x + [x(yz)] = x$. Likewise $x + [(xy)z] = [x + (xy)](x + z) = x(x + z) = x$.
Then $\bar{x} + [x(yz)] = (\bar{x} + x)(\bar{x} + yz) = 1 \cdot (\bar{x} + yz) = \bar{x} + yz$,
whereas $\bar{x} + [(xy)z] = (\bar{x} + xy)(\bar{x} + z) = [(\bar{x} + x)(\bar{x} + y)](\bar{x} + z) = [1 \cdot (\bar{x} + y)](\bar{x} + z) = (\bar{x} + y)(\bar{x} + z) = \bar{x} + yz$.
(Verify all equalities.)
The result now follows by the cancellation law in part (c)′.
d)′ Fortunately, this is the dual of part (d).
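e) We find here that $\bar{x} = \bar{x} + 0 = \bar{x} + xy = (\bar{x} + x)(\bar{x} + y) = 1 \cdot (\bar{x} + y) = (\bar{x} + y) \cdot 1 = (\bar{x} + y)(x + y) = \bar{x}x + y = 0 + y = y$. (Verify all equalities.)
We note that statement (e) is self-dual. Statement (f) is a corollary of (e) because
$\bar{\bar{x}}$ and $x$ are both complements (inverses) of $\bar{x}$.
g) This result will follow from part (e) if we can show that $\bar{x} + \bar{y}$ is a complement
of $xy$.
$$xy + (\bar{x} + \bar{y}) = (xy + \bar{x}) + \bar{y} = (x + \bar{x})(y + \bar{x}) + \bar{y} = 1 \cdot (y + \bar{x}) + \bar{y} = (y + \bar{y}) + \bar{x} = 1 + \bar{x} = 1.$$
Also, $xy(\bar{x} + \bar{y}) = (xy\bar{x}) + (xy\bar{y}) = ((x\bar{x})y) + (x(y\bar{y})) = 0 \cdot y + x \cdot 0 = 0 + 0 = 0$.
Consequently, $\bar{x} + \bar{y}$ is a complement of $xy$, and by uniqueness of complements,
it follows that $\overline{xy} = \bar{x} + \bar{y}$.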
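Several parts of Theorem 15.3 can be spot-checked in a concrete Boolean algebra. The sketch below does this in the algebra of subsets of {1, 2, 3}, with union as +, intersection as the product, and set complement as the complement. It is a finite sanity check under those assumptions, not a substitute for the proofs, and the names `comp` and `theorem_15_3_holds` are illustrative.

```python
from itertools import combinations

U = frozenset({1, 2, 3})
SUBSETS = [frozenset(c) for r in range(len(U) + 1) for c in combinations(sorted(U), r)]
EMPTY = frozenset()


def comp(a):
    """Complement of a relative to U."""
    return U - a


def theorem_15_3_holds():
    """Check parts (b), (b)', (f), (g), (g)', and (i) for every pair of subsets."""
    for x in SUBSETS:
        if comp(comp(x)) != x:                                   # (f)
            return False
        for y in SUBSETS:
            if (x & (x | y)) != x or (x | (x & y)) != x:         # (b), (b)'
                return False
            if comp(x & y) != (comp(x) | comp(y)):               # (g)
                return False
            if comp(x | y) != (comp(x) & comp(y)):               # (g)'
                return False
            if ((x & comp(y)) == EMPTY) != ((x & y) == x):       # (i)
                return False
    return True


print(theorem_15_3_holds())
```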
Enough proving for a while! Now we are going to investigate how to impose an order on
the elements of a Boolean algebra. In fact, we shall want a partial order, and for this reason
we turn now to the Hasse diagram.
Let us start by considering the Hasse diagrams for the following two Boolean algebras.
a) $(\mathcal{P}(\mathcal{U}), \cup, \cap, \bar{\ }, \emptyset, \mathcal{U})$, where $\mathcal{U} = \{1, 2, 3\}$, and the partial order is induced by the
subset relation.
b) $(\mathcal{F}, +, \cdot, \bar{\ }, 1, 30)$, where $\mathcal{F} = \{1, 2, 3, 5, 6, 10, 15, 30\}$, $x + y = \operatorname{lcm}(x, y)$, $xy = \gcd(x, y)$,
and $\bar{x} = 30/x$. Hence the zero element is the divisor 1 and the one element
is the divisor 30. The relation $\mathcal{R}$ on $\mathcal{F}$, defined by $x \mathrel{\mathcal{R}} y$ if x divides y, makes $\mathcal{F}$ into
a poset.
Figure 15.14 [Hasse diagrams: (a) the subsets of {1, 2, 3} ordered by inclusion, from $\emptyset$ at the bottom to $\mathcal{U}$ = {1, 2, 3} at the top; (b) the divisors of 30 ordered by divisibility, from 1 at the bottom to 30 at the top]
Figure 15.14 shows the Hasse diagrams for these two Boolean algebras. Ignoring the
labels at the vertices in each diagram, we see that the underlying structures are the same.
This suggests how we should define the concept of isomorphism for Boolean algebras.
These examples also suggest two other ideas.
1) Can we partially order any finite Boolean algebra?
2) Looking at Fig. 15.14(a), we see that the nonzero elements just above $\emptyset$ are such
that every element other than $\emptyset$ can be obtained as a Boolean sum of these three. For
example, $\{1, 3\} = \{1\} \cup \{3\}$ and $\{1, 2, 3\} = \{1\} \cup \{2\} \cup \{3\}$. For part (b), the numbers
2, 3, and 5 are such that every divisor other than 1 is realized as the Boolean sum of
these three. For example, $6 = \operatorname{lcm}(2, 3)$ and $30 = \operatorname{lcm}(2, \operatorname{lcm}(3, 5)) = \operatorname{lcm}(2, 3, 5)$.
We now start to deal formally with these suggestions.
When dealing with sets in Chapter 3 we related the operations of $\cup$, $\cap$, and $\bar{\ }$ to the subset
relation by the equivalence of the statements: (a) $A \subseteq B$; (b) $A \cap B = A$; (c) $A \cup B = B$;
and (d) $\bar{B} \subseteq \bar{A}$, where $A, B \subseteq \mathcal{U}$. We now use parts (a) and (b) in an attempt to partially
order any Boolean algebra $\mathcal{B}$.
Definition 15.6 If $x, y \in \mathcal{B}$, define $x \le y$ if $xy = x$.
Hence we define a new concept, namely "$\le$", in terms of notions we have in $\mathcal{B}$,
namely $\cdot$ and the notion of equality. We can make up definitions! But does this one lead us
anywhere?
THEOREM 15.4 The relation "$\le$", just defined, is a partial order.
Proof: Since $xx = x$ for all $x \in \mathcal{B}$, we have $x \le x$ and the relation is reflexive. To establish
antisymmetry, suppose that $x, y \in \mathcal{B}$ with $x \le y$ and $y \le x$. Then $xy = x$ and $yx = y$. By
the commutative property, $xy = yx$, so $x = y$. Finally, if $x \le y$ and $y \le z$, then $xy = x$ and
$yz = y$, so $x = xy = x(yz) = (xy)z = xz$, and with $x = xz$, we have $x \le z$, so the relation
is transitive.
Now we can partially order any Boolean algebra, and we note that for all x in a Boolean
algebra, $0 \le x$ and $x \le 1$. (Why?) Before going on, however, let us consider the Boolean
algebra consisting of the divisors of 30. How do we apply Theorem 15.4 in this example?
Here the partial order is given by $x \le y$ if $xy = x$. Since $xy$ is $\gcd(x, y)$, if $\gcd(x, y) = x$,
then x divides y. But this was precisely the partial order we had on this Boolean algebra
when we started.
Armed with this concept of partial order, we return to the observations we made earlier
about the elements in the Hasse diagrams of Fig. 15.14.
Definition 15.7 Let 0 denote the zero element of a Boolean algebra $\mathcal{B}$. An element $x \in \mathcal{B}$, $x \ne 0$, is called
an atom of $\mathcal{B}$ if for all $y \in \mathcal{B}$, $y \le x \Rightarrow y = 0$ or $y = x$.
EXAMPLE 15.27
a) For the Boolean algebra of all subsets of $\mathcal{U} = \{1, 2, 3\}$, the atoms are {1}, {2}, and {3}.
b) When we are dealing with the positive integer divisors of 30, the atoms of this Boolean
algebra are 2, 3, and 5.
c) The atoms in the Boolean algebra $F_n = \{f\colon B^n \to B\}$ are the minterms (or fundamental
conjunctions).
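Definitions 15.6 and 15.7 translate directly into a small search for atoms. The sketch below works in the divisor algebra of Example 15.27(b), where the product is gcd and the zero element is the divisor 1; the function names `leq` and `atoms` are assumptions made for this illustration.

```python
from math import gcd


def leq(x, y):
    """Definition 15.6 in the divisor algebra: x <= y exactly when xy = gcd(x, y) = x."""
    return gcd(x, y) == x


def atoms(m):
    """Atoms of the Boolean algebra of divisors of a square-free m (Definition 15.7)."""
    B = [d for d in range(1, m + 1) if m % d == 0]
    zero = 1
    found = []
    for x in B:
        if x == zero:
            continue
        below = [y for y in B if leq(y, x)]     # every y with y <= x
        if below == [zero, x]:                  # only the zero element and x itself
            found.append(x)
    return found


print(atoms(30))
```

Here `leq` coincides with divisibility, and the atoms found for a square-free m are its prime divisors.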
The atoms of a finite Boolean algebra satisfy the following properties.
THEOREM 15.5 a) If x is an atom in a Boolean algebra $\mathcal{B}$, then for all $y \in \mathcal{B}$, $xy = 0$ or $xy = x$.
b) If $x_1, x_2$ are atoms of $\mathcal{B}$ and $x_1 \ne x_2$, then $x_1 x_2 = 0$.
Proof:
a) For all $x, y \in \mathcal{B}$, $xy \le x$, because $(xy)x = x(yx) = x(xy) = (xx)y = xy$. For x an
atom, $xy \le x \Rightarrow xy = 0$ or $xy = x$.
b) This follows from part (a). The reader should supply the details.
THEOREM 15.6 If $x_1, x_2, \ldots, x_n$ are all the atoms in a finite Boolean algebra $\mathcal{B}$ and $x \in \mathcal{B}$ with $x x_i = 0$
for all $1 \le i \le n$, then $x = 0$.
Proof: If $x \ne 0$, let $S = \{y \in \mathcal{B} \mid 0 < y \le x\}$. ($0 < y$ denotes $0 \le y$ and $0 \ne y$.) With $x \in S$,
we have $S \ne \emptyset$. Since S is finite, we can find an element z in $\mathcal{B}$ where $0 < z \le x$ and no
element of $\mathcal{B}$ is between 0 and z. Then z is an atom and $0 = xz = z > 0$. This possibility
has led us to a contradiction, so we cannot have $x \ne 0$; that is, $x = 0$.
This leads us to the following result on representation.
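THEOREM 15.7 Given a finite Boolean algebra $\mathcal{B}$ with atoms $x_1, x_2, \ldots, x_n$, each $x \in \mathcal{B}$, $x \ne 0$, can be
written as a sum of atoms uniquely, up to order.
Proof: Since $x \ne 0$, by Theorem 15.6, $S = \{x_i \mid x x_i \ne 0\} \ne \emptyset$. Let $S = \{x_{i_1}, x_{i_2}, \ldots, x_{i_k}\}$,
and $y = x_{i_1} + x_{i_2} + \cdots + x_{i_k}$. Then $xy = x(x_{i_1} + x_{i_2} + \cdots + x_{i_k}) = x x_{i_1} + x x_{i_2} + \cdots + x x_{i_k} = x_{i_1} + x_{i_2} + \cdots + x_{i_k}$,
by Theorem 15.5(a). So $xy = y$.
Now consider $(x\bar{y})x_i$ for each $1 \le i \le n$. If $x_i \notin S$, then $x x_i = 0$, and $(x\bar{y})x_i = 0$. For
$x_i \in S$, we have $(x\bar{y})x_i = x x_i \overline{(x_{i_1} + x_{i_2} + \cdots + x_{i_k})} = x x_i (\bar{x}_{i_1} \bar{x}_{i_2} \cdots \bar{x}_{i_k}) = x(x_i \bar{x}_i)z$,
where z is the product of the complements of all elements in $S - \{x_i\}$. As $x_i \bar{x}_i = 0$, it
follows that $(x\bar{y})x_i = 0$. So $(x\bar{y})x_i = 0$ for all $x_i$, where $1 \le i \le n$. By Theorem 15.6,
we have $x\bar{y} = 0$.
With $xy = y$ and $x\bar{y} = 0$, it follows that $x = x \cdot 1 = x(y + \bar{y}) = xy + x\bar{y} = xy + 0 = y = x_{i_1} + x_{i_2} + \cdots + x_{i_k}$,
a sum of atoms.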
To show that this representation for x is unique, up to order, suppose $x = x_{j_1} + x_{j_2} + \cdots + x_{j_\ell}$.
If $x_{j_1}$ does not appear as a summand in $x_{i_1} + x_{i_2} + \cdots + x_{i_k}$, then
$x_{j_1} = x_{j_1} x_{j_1} = x_{j_1}(x_{j_1} + x_{j_2} + \cdots + x_{j_\ell})$ [by Theorem 15.5(b)] $= x_{j_1} x = x_{j_1}(x_{i_1} + x_{i_2} + \cdots + x_{i_k}) = 0$
[again, by Theorem 15.5(b)]. Hence $x_{j_1}$ must appear as a summand in $x_{i_1} + x_{i_2} + \cdots + x_{i_k}$,
as must $x_{j_2}, \ldots, x_{j_\ell}$. So $\ell \le k$. By the same reasoning, we get $k \le \ell$ and find the
representations identical, except for order.
From this result we see that if $\mathcal{B}$ is a finite Boolean algebra with atoms $x_1, x_2, \ldots, x_n$,
then each $x \in \mathcal{B}$ can be uniquely written as $\sum_{i=1}^{n} c_i x_i$, with each $c_i \in \{0, 1\}$ (and because $\mathcal{B}$
is closed under +, each such linear combination of atoms is in $\mathcal{B}$). If $c_i = 0$, this indicates
that $x_i$ is not in the representation of x; $c_i = 1$ indicates that it is. Consequently, each $x \in \mathcal{B}$
is associated with a unique n-tuple $(c_1, c_2, \ldots, c_n)$, and there are $2^n$ such n-tuples. Therefore
we have proved the following result.
THEOREM 15.8 If $\mathcal{B}$ is a finite Boolean algebra with n atoms, then $|\mathcal{B}| = 2^n$.
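The correspondence between elements and n-tuples of coefficients can be made concrete in the divisor algebra. The sketch below is an illustration only: it assumes m is square-free and that its atoms (its prime divisors) are supplied by the caller, and the names `atom_coefficients`, `from_coefficients`, and `check_representation` are hypothetical.

```python
from functools import reduce
from math import gcd


def lcm(x, y):
    return x * y // gcd(x, y)


def atom_coefficients(x, atoms):
    """The tuple (c_1, ..., c_n): c_i = 1 when x * x_i = gcd(x, x_i) is not the zero element 1."""
    return tuple(1 if gcd(x, a) != 1 else 0 for a in atoms)


def from_coefficients(coeffs, atoms):
    """Rebuild the element as the Boolean sum (lcm) of the atoms with c_i = 1."""
    return reduce(lcm, (a for c, a in zip(coeffs, atoms) if c), 1)


def check_representation(m, atoms):
    """True if every divisor round-trips and the tuples are distinct, 2**n in all."""
    B = [d for d in range(1, m + 1) if m % d == 0]
    tuples = {atom_coefficients(x, atoms) for x in B}
    round_trip = all(from_coefficients(atom_coefficients(x, atoms), atoms) == x for x in B)
    return round_trip and len(tuples) == len(B) == 2 ** len(atoms)


print(atom_coefficients(6, [2, 3, 5, 7]), check_representation(210, [2, 3, 5, 7]))
```

The all-zero tuple corresponds to the zero element (the divisor 1), which is the one element not covered by Theorem 15.7 itself.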
There is one final question to resolve. If $n \in \mathbf{Z}^+$, how many different Boolean algebras
of size $2^n$ are there? Looking at the Hasse diagrams in Fig. 15.14, we see two different
pictures. But if we ignore the labels on the vertices, the underlying structures emerge as
exactly the same. Hence these two Boolean algebras are said to be abstractly identical or
isomorphic.
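Definition 15.8 Suppose $(\mathcal{B}_1, +, \cdot, \bar{\ }, 0, 1)$ and $(\mathcal{B}_2, +, \cdot, \bar{\ }, 0, 1)$ are Boolean algebras. Then $\mathcal{B}_1$, $\mathcal{B}_2$ are
called isomorphic if there is a function $f\colon \mathcal{B}_1 \to \mathcal{B}_2$ such that f is one-to-one and onto,
and for all $x_1, y_1 \in \mathcal{B}_1$,
a) $f(x_1 + y_1) = f(x_1) + f(y_1)$ [the sum on the left is taken in $\mathcal{B}_1$, the sum on the right in $\mathcal{B}_2$]
b) $f(x_1 \cdot y_1) = f(x_1) \cdot f(y_1)$ [the product on the left is taken in $\mathcal{B}_1$, the product on the right in $\mathcal{B}_2$]
c) $f(\bar{x}_1) = \overline{f(x_1)}$ [In $f(\bar{x}_1)$ we take the complement in $\mathcal{B}_1$, while for $\overline{f(x_1)}$ the complement
is taken in $\mathcal{B}_2$.]
Such a function f preserves the operations of the algebraic structures.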
EXAMPLE 15.28 For the two Boolean algebras in Fig. 15.14, define f by
$f\colon \emptyset \to 1$     $f\colon \{2\} \to 3$     $f\colon \{1, 2\} \to 6$      $f\colon \{2, 3\} \to 15$
$f\colon \{1\} \to 2$         $f\colon \{3\} \to 5$     $f\colon \{1, 3\} \to 10$     $f\colon \{1, 2, 3\} \to 30$
Note the following:
a) The zero elements correspond under f, as do the unity elements.
b) $f(\{1\} \cup \{2\}) = f(\{1, 2\}) = 6 = \operatorname{lcm}(2, 3) = \operatorname{lcm}(f(\{1\}), f(\{2\}))$
c) $f(\{1, 2\} \cap \{2, 3\}) = f(\{2\}) = 3 = \gcd(6, 15) = \gcd(f(\{1, 2\}), f(\{2, 3\}))$
d) $f(\overline{\{2\}}) = f(\{1, 3\}) = 10 = 30/3 = \bar{3} = \overline{f(\{2\})}$
e) The image of each atom ({1}, {2}, {3}) is an atom (2, 3, 5, respectively).
This function is an isomorphism. Once we establish a correspondence between the
respective zero elements and between the respective atoms, the remaining correspondences
are determined from these by Theorem 15.7 and the preservation of the operations under f.
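The conditions of Definition 15.8 can be tested exhaustively for the function in Example 15.28. In the sketch below the function is stored as a dictionary from subsets of {1, 2, 3} to divisors of 30; the name `is_isomorphism` and the dictionary `f` are illustrative choices, not notation from the text.

```python
from math import gcd

U = frozenset({1, 2, 3})
DIVISORS_30 = {1, 2, 3, 5, 6, 10, 15, 30}

# The correspondence listed in Example 15.28.
f = {
    frozenset(): 1, frozenset({1}): 2, frozenset({2}): 3, frozenset({3}): 5,
    frozenset({1, 2}): 6, frozenset({1, 3}): 10, frozenset({2, 3}): 15, U: 30,
}


def lcm(x, y):
    return x * y // gcd(x, y)


def is_isomorphism(g):
    """Check that g is one-to-one, onto, and preserves +, product, and complement."""
    if len(set(g.values())) != len(g) or set(g.values()) != DIVISORS_30:
        return False
    for a in g:
        if g[U - a] != 30 // g[a]:                 # complement: part (c)
            return False
        for b in g:
            if g[a | b] != lcm(g[a], g[b]):        # sum: part (a)
                return False
            if g[a & b] != gcd(g[a], g[b]):        # product: part (b)
                return False
    return True


print(is_isomorphism(f))
```

Swapping the images of an atom and a non-atom, for instance sending {1} to 6 and {1, 2} to 2, still gives a one-to-one and onto function but makes the check fail, in line with remark (e) that atoms must go to atoms.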
From this example we have our final result.
THEOREM 15.9 Every finite Boolean algebra $\mathcal{B}$ is isomorphic to a Boolean algebra of sets.
Proof: Since $\mathcal{B}$ is finite, $\mathcal{B}$ has n atoms $x_i$, $1 \le i \le n$, and $|\mathcal{B}| = 2^n$. Let $\mathcal{U} = \{1, 2, \ldots, n\}$
and $\mathcal{P}(\mathcal{U})$ be the Boolean algebra of subsets of $\mathcal{U}$.
We define $f\colon \mathcal{B} \to \mathcal{P}(\mathcal{U})$ as follows. For each $x \in \mathcal{B}$, it follows from Theorem 15.7 that
we can write $x = \sum_{i=1}^{n} c_i x_i$, where each $c_i$ is 0 or 1. [Here $c_i \in \{0, 1\}$ (= B) and for each
atom a in $\mathcal{B}$, $c_i a = 0$ (the zero element in $\mathcal{B}$) if $c_i = 0$, while $c_i a = a$ when $c_i = 1$.] Then
we define
$$f(x) = \{i \mid 1 \le i \le n \text{ and } c_i = 1\}.$$
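[For example, (1) $f(0) = \emptyset$; (2) $f(x_i) = \{i\}$ for each atom $x_i$, where $1 \le i \le n$;
(3) $f(x_1 + x_2) = \{1, 2\}$; and (4) $f(x_2 + x_4 + x_7) = \{2, 4, 7\}$.] Now consider $x, y \in \mathcal{B}$,
with $x = \sum_{i=1}^{n} c_i x_i$ and $y = \sum_{i=1}^{n} d_i x_i$, where $c_i, d_i \in \{0, 1\}$ for all $1 \le i \le n$.
First we find that $x + y = \sum_{i=1}^{n} s_i x_i$, where $s_i = c_i + d_i$ for each $1 \le i \le n$. (Remember
that here 1 + 1 = 1.) Consequently,
$$f(x + y) = \{i \mid 1 \le i \le n \text{ and } s_i = 1\}$$
$$= \{i \mid 1 \le i \le n \text{ and } (c_i = 1 \text{ or } d_i = 1)\}$$
$$= \{i \mid 1 \le i \le n \text{ and } c_i = 1\} \cup \{i \mid 1 \le i \le n \text{ and } d_i = 1\}$$
$$= f(x) \cup f(y).$$
Theorem 15.5(b) tells us that
$$x \cdot y = \sum_{i=1}^{n} t_i x_i,$$
where $t_i = c_i d_i$ for all $1 \le i \le n$, and so, in a similar way, we get
$$f(x \cdot y) = \{i \mid 1 \le i \le n \text{ and } t_i = 1\}$$
$$= \{i \mid 1 \le i \le n \text{ and } (c_i = 1 \text{ and } d_i = 1)\}$$
$$= \{i \mid 1 \le i \le n \text{ and } c_i = 1\} \cap \{i \mid 1 \le i \le n \text{ and } d_i = 1\}$$
$$= f(x) \cap f(y).$$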
To complete the proof that f is an isomorphism, we should first observe that if
$x = \sum_{i=1}^{n} c_i x_i$, then $\bar{x} = \sum_{i=1}^{n} \bar{c}_i x_i$. This follows from Theorems 15.3(e) and 15.5(b) because
$$\sum_{i=1}^{n} c_i x_i + \sum_{i=1}^{n} \bar{c}_i x_i = \sum_{i=1}^{n} (c_i + \bar{c}_i) x_i = \sum_{i=1}^{n} x_i = 1$$
(Why is this true? See Exercise 15 for this section.) and
$$\left(\sum_{i=1}^{n} c_i x_i\right)\left(\sum_{i=1}^{n} \bar{c}_i x_i\right) = \sum_{i=1}^{n} c_i \bar{c}_i x_i = \sum_{i=1}^{n} 0\,x_i = 0.$$
Now we find that
$$f(\bar{x}) = \{i \mid 1 \le i \le n \text{ and } \bar{c}_i = 1\}$$
$$= \{i \mid 1 \le i \le n \text{ and } c_i = 0\}$$
$$= \overline{\{i \mid 1 \le i \le n \text{ and } c_i = 1\}}$$
$$= \overline{f(x)},$$
so the function f preserves the operations in the Boolean algebras $\mathcal{B}$ and $\mathcal{P}(\mathcal{U})$.
We leave to the reader the details showing that f is one-to-one and onto.
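The function constructed in this proof can be built explicitly for a concrete finite Boolean algebra. The sketch below does so for the divisors of a square-free m, where the i-th coefficient of x is 1 exactly when the i-th atom divides x. It assumes the atoms are passed in as a list of the prime divisors of m, and the names `theorem_15_9_map` and `preserves_operations` are introduced only for this illustration.

```python
from math import gcd


def theorem_15_9_map(m, atoms):
    """f(x) = {i : c_i = 1}, where c_i = 1 when the i-th atom divides x."""
    B = [d for d in range(1, m + 1) if m % d == 0]
    return {
        x: frozenset(i for i, a in enumerate(atoms, start=1) if x % a == 0)
        for x in B
    }


def preserves_operations(m, atoms):
    """Check f(x+y) = f(x) U f(y), f(xy) = f(x) n f(y), f(x-bar) = complement of f(x)."""
    f = theorem_15_9_map(m, atoms)
    universe = frozenset(range(1, len(atoms) + 1))
    for x in f:
        if f[m // x] != universe - f[x]:                 # complement
            return False
        for y in f:
            if f[x * y // gcd(x, y)] != (f[x] | f[y]):   # sum is lcm
                return False
            if f[gcd(x, y)] != (f[x] & f[y]):            # product is gcd
                return False
    # one-to-one and onto: 2**n distinct images among the subsets of the universe
    return len(set(f.values())) == 2 ** len(atoms)


print(sorted(theorem_15_9_map(210, [2, 3, 5, 7])[42]), preserves_operations(210, [2, 3, 5, 7]))
```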
EXERCISES 15.4
1. Verify the second distributive law and the identity and inverse laws for Example 15.25.
2. Complete the proof of Theorem 15.3.
3. Let $\mathcal{B}$ be the set of positive integer divisors of 210, and define $+$, $\cdot$, and $\bar{\ }$ for $\mathcal{B}$ by
$x + y = \operatorname{lcm}(x, y)$, $x \cdot y = xy = \gcd(x, y)$, and $\bar{x} = 210/x$. Determine each of the following:
a) $30 + 5 \cdot 7$     b) $(30 + 5) \cdot (30 + 7)$
c) $(14 + 15)$          d) $21(2 + 10)$
e) $(2 + 3) + 5$        f) $(6 + 35)(7 + 10)$
4. For a Boolean algebra $\mathcal{B}$ the relation "$\le$" on $\mathcal{B}$, defined by $x \le y$ if $xy = x$, was shown to
be a partial order. Prove that: (a) if $x \le y$ then $x + y = y$; and (b) if $x \le y$ then $\bar{y} \le \bar{x}$.
5. Let $(\mathcal{B}, +, \cdot, \bar{\ }, 0, 1)$ be a Boolean algebra that is partially ordered by $\le$.
a) If $w \in \mathcal{B}$ and $w \le 0$, prove that $w = 0$.
b) If $x \in \mathcal{B}$ and $1 \le x$, prove that $x = 1$.
c) If $y, z \in \mathcal{B}$ with $y \le z$ and $y \le \bar{z}$, prove that $y = 0$.
6. Let $(\mathcal{B}, +, \cdot, \bar{\ }, 0, 1)$ be a Boolean algebra that is partially ordered by $\le$. If $w, x, y, z \in \mathcal{B}$
with $w \le x$ and $y \le z$, prove that (a) $wy \le xy$; and (b) $w + y \le x + z$.
7. If $\mathcal{B}$ is a Boolean algebra, partially ordered by $\le$, and $x, y \in \mathcal{B}$, what is the dual of the
statement "$x \le y$"?
8. Let $F_n = \{f\colon B^n \to B\}$ be the Boolean algebra of all Boolean functions on n Boolean variables.
How many atoms does $F_n$ have?
9. Verify Theorem 15.5(b).
10. If $\mathcal{B}$ is a Boolean algebra, prove that the zero element and the one element of $\mathcal{B}$ are unique.
11. Let $f\colon \mathcal{B}_1 \to \mathcal{B}_2$ be an isomorphism of Boolean algebras. Prove each of the following:
a) $f(0) = 0$.     b) $f(1) = 1$.
c) If $x, y \in \mathcal{B}_1$ with $x \le y$, then in $\mathcal{B}_2$, $f(x) \le f(y)$.
d) If x is an atom of $\mathcal{B}_1$, then $f(x)$ is an atom in $\mathcal{B}_2$.
12. Let $\mathcal{B}_1$ be the Boolean algebra of all positive integer divisors of 2310, with $\mathcal{B}_2$ the Boolean
algebra of all subsets of {a, b, c, d, e}.
a) Define $f\colon \mathcal{B}_1 \to \mathcal{B}_2$ so that $f(2) = \{a\}$, $f(3) = \{b\}$, $f(5) = \{c\}$, $f(7) = \{d\}$, $f(11) = \{e\}$.
For f to be an isomorphism, what must the images of 35, 110, 210, and 330 be?
b) How many different isomorphisms can one define between $\mathcal{B}_1$ and $\mathcal{B}_2$?
13. a) If $\mathcal{B}_1$, $\mathcal{B}_2$ are Boolean algebras and $f\colon \mathcal{B}_1 \to \mathcal{B}_2$ is one-to-one, onto, and such that
$f(x + y) = f(x) + f(y)$ and $f(\bar{x}) = \overline{f(x)}$, for all $x, y \in \mathcal{B}_1$, prove that f is an isomorphism.
b) State and prove another result comparable to that in part (a). (What principle is used here?)
14. Prove that the function f in Theorem 15.9 is one-to-one and onto.
15. Let $\mathcal{B}$ be a finite Boolean algebra with the n atoms $x_1, x_2, \ldots, x_n$. (So $|\mathcal{B}| = 2^n$.) Prove that
$1 = x_1 + x_2 + \cdots + x_n$.
15.5
Summary and Historical Review
The modern concept of abstract algebra was developed by George Boole in his study
of general abstract systems, as opposed to particular examples of such systems. In his
1854 publication An Investigation of the Laws of Thought, he formulated the mathematical
structure now called a Boolean algebra. Although the subject remained abstract in nature
during the nineteenth century, Boolean algebra was investigated in the twentieth century
for its applicative value.
Starting in 1938, Claude Elwood Shannon (1916-2001) made the first major contribution
in applied Boolean algebra in [8]. He devised the algebra of switching circuits and showed
its relation to the algebra of logic. Additional developments that were made in this area
during the 1940s and 1950s are noted in the paper by C. E. Shannon [9] and in the report of
the Harvard University Computation Laboratory [10]. (The computer term bit was coined
by Claude E. Shannon, who was also one of the first to represent information in terms of
bits.)
Claude Elwood Shannon (1916-2001)
We found that switching functions can be represented by their disjunctive and conjunctive
normal forms. These forms allowed us to write such functions in a compact way using binary
labels. The minimization process showed us how to represent a given Boolean function as
a minimal sum of products or as a minimal product of sums. Based on the map method by
E. W. Veitch [11], Maurice Karnaugh’s modification [4] was developed here as a pictorial
method for the simplification of Boolean functions. Another technique that we mentioned
in the text is the tabulation algorithm known as the Quine-McCluskey method. Originally
developed by Willard Van Orman Quine (1908-2000) [6, 7], this technique was modified
by Edward J. McCluskey, Jr. (1929- ) [5]. It is very useful for functions with more than
six variables and lends itself to computer implementation. The interested reader can find
more about Karnaugh maps in Chapter 6 of F. J. Hill and G. R. Peterson [3]. Chapter 7
of [3] provides an excellent coverage of the Quine-McCluskey method. J. F. Wakerly [12]
examines digital circuits in the light of contemporary technology, whereas T. L. Booth [1]
investigates some specific applications of logic design in the study of computers. A more
advanced coverage of the applications given in this chapter (along with many other related
concepts) is given in the text by K. G. Gopolan [2].
Although the major part of this chapter was applied in nature, Section 15.4 found us
investigating the structure of a Boolean algebra. Unlike commutative rings with unity,
which come in all possible sizes, we found that a Boolean algebra can contain only $2^n$
elements, where $n \in \mathbf{Z}^+$. Uniqueness of representation appeared as we found the atoms
of a Boolean algebra used to build the rest of the algebra (except for the zero element).
The Boolean algebra of sets that we studied in Chapter 3 was found to represent all finite
Boolean algebras in the sense that a finite Boolean algebra with n atoms is isomorphic to
the Boolean algebra of all subsets of {1, 2, 3, ..., n}.
REFERENCES
1. Booth, Taylor L. Digital Networks and Computer Systems, 2nd ed. New York: Wiley, 1978.
2. Gopolan, K. Gopal. Introduction to Digital Microelectronic Circuits. Chicago: Irwin, 1996.
3. Hill, Frederick J., and Peterson, Gerald R. Introduction to Switching Theory and Logical
Design, 3rd ed. New York: Wiley, 1981.
4. Karnaugh, Maurice. "The Map Method for Synthesis of Combinational Logic Circuits."
Transactions of the AIEE, part I, vol. 72, no. 9 (1953): pp. 593-599.
5. McCluskey, Edward J., Jr. "Minimization of Boolean Functions." Bell System Technical
Journal 35, no. 6 (November 1956): pp. 1417-1444.
6. Quine, Willard V. "The Problem of Simplifying Truth Functions." American Mathematical
Monthly 59, no. 8 (October 1952): pp. 521-531.
7. Quine, Willard V. "A Way to Simplify Truth Functions." American Mathematical Monthly 62,
no. 9 (November 1955): pp. 627-631.
8. Shannon, Claude E. "A Symbolic Analysis of Relay and Switching Circuits." Transactions of
the AIEE, vol. 57 (1938): pp. 713-723.
9. Shannon, Claude E. "The Synthesis of Two-terminal Switching Circuits." Bell System
Technical Journal, vol. 28 (1949): pp. 59-98.
10. Staff of the Computation Laboratory. Synthesis of Electronic Computing and Control Circuits,
Annals 27. Cambridge, Mass.: Harvard University Press, 1951.
11. Veitch, E. W. "A Chart Method for Simplifying Truth Functions." Proceedings of the ACM.
Pittsburgh, Penn. (May 1952): pp. 127-133.
12. Wakerly, John F. Digital Design: Principles and Practices, 2nd ed. Englewood Cliffs, N.J.:
Prentice-Hall, 1994.
SUPPLEMENTARY EXERCISES
1. Let $n \ge 2$. If $x_i$ is a Boolean variable for all $1 \le i \le n$, prove that
a) $\overline{(x_1 + x_2 + \cdots + x_n)} = \bar{x}_1 \bar{x}_2 \cdots \bar{x}_n$
b) $\overline{(x_1 x_2 \cdots x_n)} = \bar{x}_1 + \bar{x}_2 + \cdots + \bar{x}_n$
2. Let $f, g\colon B^5 \to B$ be Boolean functions, where $f = \sum m(1, 2, 4, 7, x)$ and
$g = \sum m(0, 1, 2, 3, y, z, 16, 25)$. If $f \le g$, what are x, y, z?
3. Eileen is having a party and finds herself confronted with decisions about inviting five
of her friends.
a) If she invites Margaret, she must also invite Joan.
b) If Kathleen is invited, Nettie and Margaret must also be invited.
c) She can invite Cathy or Joan, but definitely not both of them.
d) Neither Cathy nor Nettie will show up if the other is not invited.
e) Either Kathleen or Nettie or both must be invited.
Determine which subsets of these five friends Eileen can invite to her party and still
satisfy conditions (a) through (e).
4. Let $f, g\colon B^4 \to B$, where $f = \sum m(2, 4, 6, 8)$, and
$g = \sum m(1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 13, 15)$. Find a function $h\colon B^4 \to B$ such that $f = gh$.
5. Let $\mathcal{B}$ be a Boolean algebra that is partially ordered by $\le$. If $x, y, z \in \mathcal{B}$, prove that
$x + y \le z$ if and only if $x \le z$ and $y \le z$.
6. State and prove the dual of the result in Exercise 5.
7. Let $\mathcal{B}$ be a Boolean algebra that is partially ordered by $\le$. For all $x, y \in \mathcal{B}$ prove that
a) $x \le y$ if and only if $\bar{x} + y = 1$; and
b) $x \le y$ if and only if $x\bar{y} = 0$.
8. Let x, y be elements in the Boolean algebra $\mathcal{B}$. Prove that $x = y$ if and only if
$x\bar{y} + \bar{x}y = 0$.
9. Use a Karnaugh map to find a minimal-sum-of-products representation for
a) $f(w, x, y, z) = \sum m(0, 1, 2, 3, 6, 7, 14, 15)$
b) $g(v, w, x, y, z) = \prod M(1, 2, 4, 6, 9, 10, 11, 14, 17, 18, 19, 20, 22, 25, 26, 27, 30)$
10. The four input lines for the network in Fig. 15.15 provide the binary equivalents of the
numbers 0, 1, 2, ..., 15, where each number is represented as abce, with e the least
significant bit.
a) Find the d.n.f. of g, whose value is 1 exactly when abce is the binary equivalent of
1, 2, 4, or 8.
b) Draw the two-level gating network for g as a minimal sum of products.
c) If this network is part of a larger network and, consequently, the binary equivalents of
the numbers 10 through 15 never occur as inputs, design a two-level gating network
for g in this case.
Figure 15.15 [Block diagram: the input lines a, b, c, e enter a "power of two detector" whose output is g]
11. For n Boolean variables there are $2^{2^n}$ Boolean functions, each of which can be
represented by a function table.
a) A Boolean function f on the n variables $x_1, x_2, \ldots, x_n$ is called self-dual if
$$f(x_1, x_2, \ldots, x_n) = \overline{f(\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n)}.$$
How many Boolean functions on n variables are self-dual?
b) Let $f\colon B^3 \to B$. Then f is called symmetric if
$$f(x, y, z) = f(x, z, y) = f(y, x, z) = f(y, z, x) = f(z, x, y) = f(z, y, x).$$
So the value of f is unchanged when we rearrange the three columns of values listed
under x, y, and z in the table for f. How many such functions are there on three Boolean
variables? How many are there on n Boolean variables?
12. Let $\mathcal{B}_1$ be the Boolean algebra of all positive integer divisors of 30030, and let $\mathcal{B}_2$ be the
Boolean algebra of all subsets of {u, v, w, x, y, z}. How many isomorphisms $f\colon \mathcal{B}_1 \to \mathcal{B}_2$
satisfy $f(2) = \{u\}$ and $f(6) = \{u, v\}$?
13. For (a) n = 60, and (b) n = 120, explain why the positive integer divisors of n do not
yield a Boolean algebra. (Here $x + y = \operatorname{lcm}(x, y)$, $xy = \gcd(x, y)$, $\bar{x} = n/x$, 1 is the zero
element, and n is the one element.)
14. Let $a, b, c \in \mathcal{B}$, a Boolean algebra. Prove that $ab + c = a(b + c)$ if and only if $c \le a$.
16
Groups, Coding Theory, and Polya's Method of Enumeration
In our study of algebraic structures we examine properties shared by particular mathematical systems. Then we generalize our findings in order to study the underlying structure common to these particular examples.
In Chapter 14 we did this with the ring structure, which depended on two closed binary
operations. Now we turn to a structure involving one closed binary operation. This structure
is called a group.
Our study of groups will examine many ideas comparable to those for rings. However,
here we shall dwell primarily on those aspects of the structure that are needed for applications
in cryptology, coding theory, and a counting method developed by George Polya.
16.1
Definition, Examples, and Elementary Properties
Definition 16.1 If $G$ is a nonempty set and $\circ$ is a binary operation on $G$, then $(G, \circ)$ is called a group if the following conditions are satisfied.
1) For all $a, b \in G$, $a \circ b \in G$. (Closure of $G$ under $\circ$)
2) For all $a, b, c \in G$, $a \circ (b \circ c) = (a \circ b) \circ c$. (The Associative Property)
3) There exists $e \in G$ with $a \circ e = e \circ a = a$, for all $a \in G$. (The Existence of an Identity)
4) For each $a \in G$ there is an element $b \in G$ such that $a \circ b = b \circ a = e$. (Existence of Inverses)
Furthermore, if $a \circ b = b \circ a$ for all $a, b \in G$, then $G$ is called a commutative, or abelian, group. The adjective abelian honors the Norwegian mathematician Niels Henrik Abel (1802-1829).
We realize that the first condition in Definition 16.1 could have been omitted if we simply
required the binary operation for G to be a closed binary operation.
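For a finite set, the four conditions of Definition 16.1 can be checked by brute force. The following is a minimal sketch, not part of the text: the function name `is_group` and the way the operation is passed in as a Python callable `op` are assumptions made for illustration.

```python
from itertools import product

def is_group(elements, op):
    """Check the four conditions of Definition 16.1 on a finite set (illustrative helper)."""
    elements = list(elements)
    # 1) Closure: a o b must stay inside the set.
    if any(op(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False
    # 2) Associativity: a o (b o c) == (a o b) o c for every triple.
    if any(op(a, op(b, c)) != op(op(a, b), c)
           for a, b, c in product(elements, repeat=3)):
        return False
    # 3) Identity: some e with a o e == e o a == a for all a.
    identities = [e for e in elements
                  if all(op(a, e) == a and op(e, a) == a for a in elements)]
    if not identities:
        return False
    e = identities[0]
    # 4) Inverses: each a has some b with a o b == b o a == e.
    return all(any(op(a, b) == e and op(b, a) == e for b in elements)
               for a in elements)

z6_add = is_group(range(6), lambda a, b: (a + b) % 6)      # integers mod 6 under addition
z6_mul = is_group(range(6), lambda a, b: (a * b) % 6)      # same set under multiplication
z7_star = is_group(range(1, 7), lambda a, b: (a * b) % 7)  # nonzero residues mod 7
```

Here `z6_mul` fails at the inverse condition (0 has no multiplicative inverse), while the other two structures pass all four checks.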
Following Definition 14.1 (for a ring) we mentioned how the associative laws for the closed binary operations of $+$ (ring addition) and $\cdot$ (ring multiplication) could be extended by mathematical induction. The same type of situation arises for groups. If $(G, \circ)$ is any group, and $r, n \in \mathbf{Z}^+$ with $n \ge 3$ and $1 \le r < n$, then
$$(a_1 \circ a_2 \circ \cdots \circ a_r) \circ (a_{r+1} \circ \cdots \circ a_n) = a_1 \circ a_2 \circ \cdots \circ a_r \circ a_{r+1} \circ \cdots \circ a_n,$$
where $a_1, a_2, \ldots, a_r, a_{r+1}, \ldots, a_n$ are all elements from $G$.
EXAMPLE 16.1 Under ordinary addition, each of $\mathbf{Z}$, $\mathbf{Q}$, $\mathbf{R}$, $\mathbf{C}$ is an abelian group. None of these is a group under multiplication because 0 has no multiplicative inverse. However, $\mathbf{Q}^*$, $\mathbf{R}^*$, and $\mathbf{C}^*$ (the nonzero elements of $\mathbf{Q}$, $\mathbf{R}$, and $\mathbf{C}$, respectively) are abelian groups under ordinary multiplication.
If $(R, +, \cdot)$ is a ring, then $(R, +)$ is an abelian group; the nonzero elements of a field $(F, +, \cdot)$ form the abelian group $(F^*, \cdot)$.
EXAMPLE 16.2 For $n \in \mathbf{Z}^+$, $n > 1$, we find that $(\mathbf{Z}_n, +)$ is an abelian group. When $p$ is a prime, $(\mathbf{Z}_p^*, \cdot)$ is an abelian group. Tables 16.1 and 16.2 demonstrate this for $n = 6$ and $p = 7$, respectively. (Recall that in $\mathbf{Z}_n$ we often write $a$ for $[a] = \{a + kn \mid k \in \mathbf{Z}\}$. The same notation is used in $\mathbf{Z}_p^*$.)
Table 16.1
 + | 0 1 2 3 4 5
---+------------
 0 | 0 1 2 3 4 5
 1 | 1 2 3 4 5 0
 2 | 2 3 4 5 0 1
 3 | 3 4 5 0 1 2
 4 | 4 5 0 1 2 3
 5 | 5 0 1 2 3 4
Table 16.2
 · | 1 2 3 4 5 6
---+------------
 1 | 1 2 3 4 5 6
 2 | 2 4 6 1 3 5
 3 | 3 6 2 5 1 4
 4 | 4 1 5 2 6 3
 5 | 5 3 1 6 4 2
 6 | 6 5 4 3 2 1
Definition 16.2 For every group G the number of elements in G is called the order of G and this is denoted
by |G|. When the number of elements in a group is not finite we say that G has infinite
order.
EXAMPLE 16.3 For all $n \in \mathbf{Z}^+$, $|(\mathbf{Z}_n, +)| = n$, while $|(\mathbf{Z}_p^*, \cdot)| = p - 1$ for each prime $p$.
EXAMPLE 16.4 Let us start with the ring $(\mathbf{Z}_9, +, \cdot)$ and consider the subset
$U_9 = \{a \in \mathbf{Z}_9 \mid a \text{ is a unit in } \mathbf{Z}_9\} = \{a \in \mathbf{Z}_9 \mid a^{-1} \text{ exists}\} = \{1, 2, 4, 5, 7, 8\} = \{a \in \mathbf{Z}^+ \mid 1 \le a \le 8 \text{ and } \gcd(a, 9) = 1\}$.
The results in Table 16.3 show us that $U_9$ is closed under the multiplication for the ring $(\mathbf{Z}_9, +, \cdot)$, namely, multiplication modulo 9. Furthermore, we also see that 1 is the identity element and that each element has an inverse (in $U_9$). For instance, 5 is the inverse for 2, and 7 is the inverse for 4. Finally, since every ring is associative under the operation of (ring) multiplication, it follows that $a \cdot (b \cdot c) = (a \cdot b) \cdot c$ for all $a, b, c \in U_9$. Consequently, $(U_9, \cdot)$ is a group of order 6; in fact, it is an abelian group of order 6.
Table 16.3
 · | 1 2 4 5 7 8
---+------------
 1 | 1 2 4 5 7 8
 2 | 2 4 8 1 5 7
 4 | 4 8 7 2 1 5
 5 | 5 1 2 7 8 4
 7 | 7 5 1 8 4 2
 8 | 8 7 5 4 2 1
In general, for each $n \in \mathbf{Z}^+$, where $n > 1$, if $U_n = \{a \in (\mathbf{Z}_n, +, \cdot) \mid a \text{ is a unit}\} = \{a \in \mathbf{Z}^+ \mid 1 \le a \le n - 1 \text{ and } \gcd(a, n) = 1\}$, then $(U_n, \cdot)$ is an abelian group under the (closed) binary operation of multiplication modulo $n$. The group $(U_n, \cdot)$ is called the group of units for the ring $(\mathbf{Z}_n, +, \cdot)$ and it has order $\phi(n)$, where $\phi$ denotes the Euler phi function of Section 8.1.
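The description of $U_n$ as the residues coprime to $n$ translates directly into a short computation. The sketch below is illustrative only; the helper names `units` and `inverse_in_units` are assumptions, and the inverse is found by exhaustive search rather than by any method from the text.

```python
from math import gcd

def units(n):
    """Elements a with 1 <= a <= n-1 and gcd(a, n) = 1 (the set U_n)."""
    return [a for a in range(1, n) if gcd(a, n) == 1]

def inverse_in_units(a, n):
    """Find b in U_n with a*b = 1 (mod n) by exhaustive search."""
    return next(b for b in units(n) if (a * b) % n == 1)

u9 = units(9)
inverses = {a: inverse_in_units(a, 9) for a in u9}
order_u9 = len(u9)  # equals phi(9)
```

For $n = 9$ this reproduces the set and the inverse pairs read off from Table 16.3 (5 is the inverse of 2, and 7 is the inverse of 4).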
From here on the group operation will be written multiplicatively, unless it is given
otherwise. So a o b now becomes ab.
The following theorem provides several properties shared by all groups.
THEOREM 16.1 For every group $G$,
a) the identity of $G$ is unique.
b) the inverse of each element of $G$ is unique.
c) if $a, b, c \in G$ and $ab = ac$, then $b = c$. (Left-cancellation property)
d) if $a, b, c \in G$ and $ba = ca$, then $b = c$. (Right-cancellation property)
Proof:
a) If $e_1, e_2$ are both identities in $G$, then $e_1 = e_1 e_2 = e_2$. (Justify each equality.)
b) Let $a \in G$ and suppose that $b, c$ are both inverses of $a$. Then $b = be = b(ac) = (ba)c = ec = c$. (Justify each equality.)
The proofs of properties (c) and (d) are left for the reader. (It is because of these properties that we find each group element appearing exactly once in each row and each column of the table for a finite group.)
On the basis of the result in Theorem 16.1(b) the unique inverse of $a$ will be designated by $a^{-1}$. When the group is written additively, $-a$ is used to denote the (additive) inverse of $a$.
As in the case of multiplication in a ring, we have powers of elements in a group. We define $a^0 = e$, $a^1 = a$, $a^2 = a \cdot a$, and in general $a^{n+1} = a^n \cdot a$, for all $n \in \mathbf{N}$. Since each group element has an inverse, for $n \in \mathbf{Z}^+$, we define $a^{-n} = (a^{-1})^n$. Then $a^n$ is defined for all $n \in \mathbf{Z}$, and it can be shown that for all $m, n \in \mathbf{Z}$, $a^m \cdot a^n = a^{m+n}$ and $(a^m)^n = a^{mn}$.
If the group operation is addition, then multiples replace powers and for all $m, n \in \mathbf{Z}$, and all $a \in G$, we find that
$ma + na = (m + n)a$ and $m(na) = (mn)a$.
In this case the identity is written as 0, rather than $e$. And here, for all $a \in G$, we have $0a = 0$, where the "0" in front of $a$ is the integer 0 (in $\mathbf{Z}$) while the "0" on the right side of the equation is the identity 0 (in $G$). [So these two "0"s are different.]
For an abelian group $G$ we also find that for all $n \in \mathbf{Z}$ and all $a, b \in G$, (1) $(ab)^n = a^n b^n$, when $G$ is written multiplicatively; and (2) $n(a + b) = na + nb$, when the additive notation is used for $G$.
We now take a look at a special subset of a group.
EXAMPLE 16.5 Let $G = (\mathbf{Z}_6, +)$. If $H = \{0, 2, 4\}$, then $H$ is a nonempty subset of $G$. Table 16.4 shows that $(H, +)$ is also a group under the binary operation of $G$.
Table 16.4
+ 0 2 4
0 0 2 4
2 2 4 0
4 4 0 2
This situation motivates the following definition.
Definition 16.3 Let $G$ be a group and $\emptyset \ne H \subseteq G$. If $H$ is a group under the binary operation of $G$, then we call $H$ a subgroup of $G$.
EXAMPLE 16.6
a) Every group $G$ has $\{e\}$ and $G$ as subgroups. These are the trivial subgroups of $G$. All others are termed nontrivial, or proper.
b) In addition to $H = \{0, 2, 4\}$, the subset $K = \{0, 3\}$ is also a (proper) subgroup of $G = (\mathbf{Z}_6, +)$.
c) Each of the nonempty subsets $\{1, 8\}$ and $\{1, 4, 7\}$ is a subgroup of $(U_9, \cdot)$.
d) The group $(\mathbf{Z}, +)$ is a subgroup of $(\mathbf{Q}, +)$, which is a subgroup of $(\mathbf{R}, +)$. Yet $\mathbf{Z}^*$ under multiplication is not a subgroup of $(\mathbf{Q}^*, \cdot)$. (Why not?)
For a group $G$ and $\emptyset \ne H \subseteq G$, the following tells us when $H$ is a subgroup of $G$.
THEOREM 16.2 If $H$ is a nonempty subset of a group $G$, then $H$ is a subgroup of $G$ if and only if (a) for all $a, b \in H$, $ab \in H$, and (b) for all $a \in H$, $a^{-1} \in H$.
Proof: If $H$ is a subgroup of $G$, then by Definition 16.3 $H$ is a group under the same binary operation. Hence it satisfies all the group conditions, including the two mentioned here. Conversely, let $\emptyset \ne H \subseteq G$ with $H$ satisfying conditions (a) and (b). For all $a, b, c \in H$, $(ab)c = a(bc)$ in $G$, so $(ab)c = a(bc)$ in $H$. (We say that $H$ "inherits" the associative property from $G$.) Finally, as $H \ne \emptyset$, let $a \in H$. By condition (b), $a^{-1} \in H$ and by condition (a), $aa^{-1} = e \in H$, so $H$ contains the identity element and is a group.
A finiteness condition modifies the situation.
THEOREM 16.3 If $G$ is a group and $\emptyset \ne H \subseteq G$, with $H$ finite, then $H$ is a subgroup of $G$ if and only if $H$ is closed under the binary operation of $G$.
Proof: As in the proof of Theorem 16.2, if $H$ is a subgroup of $G$, then $H$ is closed under the binary operation of $G$. Conversely, let $H$ be a finite nonempty subset of $G$ that is so closed. If $a \in H$, then $aH = \{ah \mid h \in H\} \subseteq H$ because of the closure condition. By left-cancellation in $G$, $ah_1 = ah_2 \Rightarrow h_1 = h_2$, so $|aH| = |H|$. With $aH \subseteq H$ and $|aH| = |H|$, it follows from $H$ being finite that $aH = H$. As $a \in H$, there exists $b \in H$ with $ab = a$. But (in $G$) $ab = a = ae$, so $b = e$ and $H$ contains the identity. Since $e \in H = aH$, there is an element $c \in H$ such that $ac = e$. Then $(ca)^2 = (ca)(ca) = (c(ac))a = (ce)a = ca = (ca)e$, so $ca = e$, and $c = a^{-1} \in H$. Consequently, by Theorem 16.2, $H$ is a subgroup of $G$.
The finiteness condition in Theorem 16.3 is crucial. Both $\mathbf{Z}^+$ and $\mathbf{N}$ are nonempty closed subsets of the group $(\mathbf{Z}, +)$, yet neither has the additive inverses needed for the group structure.
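Theorem 16.3 reduces the subgroup test for a finite nonempty subset to a single closure check. A minimal sketch of that test follows; the name `is_finite_subgroup` is hypothetical, and the function assumes (without verifying it) that the subset really lies inside a group whose operation is `op`.

```python
def is_finite_subgroup(subset, op):
    """Closure test of Theorem 16.3 for a finite subset of a group.

    Assumption: `subset` is contained in a group with operation `op`;
    only nonemptiness and closure are checked here.
    """
    s = set(subset)
    return bool(s) and all(op(a, b) in s for a in s for b in s)

add6 = lambda a, b: (a + b) % 6   # operation of (Z_6, +)
mul9 = lambda a, b: (a * b) % 9   # operation of (U_9, ·)

h_ok = is_finite_subgroup({0, 2, 4}, add6)   # H from Example 16.5
k_ok = is_finite_subgroup({0, 3}, add6)      # K from Example 16.6(b)
u_ok = is_finite_subgroup({1, 4, 7}, mul9)   # subgroup of U_9
not_closed = is_finite_subgroup({0, 1}, add6)  # 1 + 1 = 2 is missing
```

The test is only valid for finite subsets; as the remark on $\mathbf{Z}^+$ and $\mathbf{N}$ shows, closure alone does not suffice in the infinite case.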
The next example provides a nonabelian group.
EXAMPLE 16.7 Consider the first equilateral triangle shown in Fig. 16.1(a). When we rotate this triangle counterclockwise (within its plane) through 120° about an axis perpendicular to its plane and passing through its center $C$, we obtain the second triangle shown in Fig. 16.1(a). As a result, the vertex originally labeled 1 in Fig. 16.1(a) is now in the position that was originally labeled 3. Likewise, 2 is now in the position originally occupied by 1, and 3 has moved to where 2 was. This can be described by the function $\pi_1: \{1, 2, 3\} \to \{1, 2, 3\}$, where $\pi_1(1) = 3$, $\pi_1(2) = 1$, $\pi_1(3) = 2$. A more compact notation, $\begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}$, where we write $\pi_1(i)$ below $i$ for each $1 \le i \le 3$, emphasizes that $\pi_1$ is a permutation of $\{1, 2, 3\}$. If $\pi_2$ denotes the counterclockwise rotation through 240°, then $\pi_2 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$. For the identity $\pi_0$, that is, the rotation through $n(360°)$ for $n \in \mathbf{Z}$, we write $\pi_0 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}$. These rotations are called rigid motions of the triangle. They are two-dimensional motions that keep the center $C$ fixed and preserve the shape of the triangle. Hence the triangle looks the same as when we started, except for a possible rearrangement of the labels on some of its vertices.
Figure 16.1 (a) The labeled triangle before and after the 120° rotation; (b) the labeled triangle before and after a reflection.
In addition to these rotations, the triangle can be reflected along an axis passing through a vertex and the midpoint of the opposite side. For the diagonal axis that bisects the base angle on the right, the reflection gives the result in Fig. 16.1(b). This we represent by $r_1 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}$. A similar reflection about the axis bisecting the left base angle yields the permutation $r_2 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}$. When the triangle is reflected about its vertical axis, we have $r_3 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}$. Each $r_i$, for $1 \le i \le 3$, is a three-dimensional rigid motion.
Let $G = \{\pi_0, \pi_1, \pi_2, r_1, r_2, r_3\}$, the set of rigid motions (in space) of the equilateral triangle. We make $G$ into a group by defining the rigid motion $\alpha\beta$, for $\alpha, \beta \in G$, as that motion obtained by applying first $\alpha$ and then following up with $\beta$. Hence, for example, $\pi_1 r_1 = r_3$. We can see this geometrically, but it will be handy to consider the permutations as follows: $\pi_1 r_1 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}\begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}$, where, for example, $\pi_1(1) = 3$ and $r_1(3) = 3$ and we write $1 \xrightarrow{\pi_1} 3 \xrightarrow{r_1} 3$. So $1 \xrightarrow{\pi_1 r_1} 3$ in the product $\pi_1 r_1$. (Note that the order in which we write the product $\pi_1 r_1$ here is the opposite of the order for their composite function as defined in Section 5.6. The notation of Section 5.6 occurs in analysis, whereas in algebra there is a tendency to employ this opposite order.) Also, since $2 \xrightarrow{\pi_1} 1 \xrightarrow{r_1} 2$ and $3 \xrightarrow{\pi_1} 2 \xrightarrow{r_1} 1$, it follows that $\pi_1 r_1 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix} = r_3$.
Table 16.5 verifies that under this binary operation $G$ is closed, with identity $\pi_0$. Also $\pi_1^{-1} = \pi_2$, $\pi_2^{-1} = \pi_1$, and every other element is its own inverse. Since the elements of $G$ are actually functions, the associative property follows from Theorem 5.6 (although in reverse order).
Table 16.5
 ·   | π0  π1  π2  r1  r2  r3
-----+-----------------------
 π0  | π0  π1  π2  r1  r2  r3
 π1  | π1  π2  π0  r3  r1  r2
 π2  | π2  π0  π1  r2  r3  r1
 r1  | r1  r2  r3  π0  π1  π2
 r2  | r2  r3  r1  π2  π0  π1
 r3  | r3  r1  r2  π1  π2  π0
We computed $\pi_1 r_1$ as $r_3$, but from Table 16.5 we see that $r_1 \pi_1 = r_2$. With $\pi_1 r_1 = r_3 \ne r_2 = r_1 \pi_1$, it follows that $G$ is nonabelian.
This group can also be obtained as the group of all permutations of the set $\{1, 2, 3\}$ under the binary operation of function composition. It is denoted by $S_3$ (the symmetric group on three symbols).
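The products in this example can be recomputed mechanically. The sketch below is an illustration under stated assumptions: each motion is stored as a tuple listing the images of 1, 2, 3 (so $\pi_1$, with $\pi_1(1) = 3$, $\pi_1(2) = 1$, $\pi_1(3) = 2$, becomes `(3, 1, 2)`), the names `PERMS`, `NAMES`, and `product` are hypothetical, and the product follows the convention used here of applying the left factor first.

```python
# Each permutation is a tuple p with p[i-1] = image of i (illustrative encoding).
PERMS = {
    "pi0": (1, 2, 3), "pi1": (3, 1, 2), "pi2": (2, 3, 1),
    "r1": (2, 1, 3), "r2": (1, 3, 2), "r3": (3, 2, 1),
}
NAMES = {p: name for name, p in PERMS.items()}

def product(a, b):
    """Return the name of the product ab: apply a first, then b."""
    pa, pb = PERMS[a], PERMS[b]
    return NAMES[tuple(pb[pa[i] - 1] for i in range(3))]

# Rows of the group table, with columns in the order pi0, pi1, pi2, r1, r2, r3.
table = {a: [product(a, b) for b in PERMS] for a in PERMS}

nonabelian = product("pi1", "r1") != product("r1", "pi1")
```

Evaluating `product("pi1", "r1")` and `product("r1", "pi1")` gives two different motions, which is the computation behind the claim that the group is nonabelian.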
EXAMPLE 16.8 The symmetric group $S_4$ consists of the 24 permutations of $\{1, 2, 3, 4\}$. Here $\pi_0$ =
(i 5 3 4) is the identity. If a=() 7 3 3) B= 7 3 4), then ap =
(i 3 G3) but Ba = (4 5? 4).8o Sq is nonabelian. Also, 67! = Gad ‘) and
a’ = x9 = f°. Within S, there is a subgroup of order 8 that represents the group of rigid
motions for a square.
We turn now to a construction for making larger groups out of smaller ones.
THEOREM 16.4 Let $(G, \circ)$ and $(H, *)$ be groups. Define the binary operation $\cdot$ on $G \times H$ by $(g_1, h_1) \cdot (g_2, h_2) = (g_1 \circ g_2, h_1 * h_2)$. Then $(G \times H, \cdot)$ is a group and is called the direct product of $G$ and $H$.
Proof: The verification of the group properties for $(G \times H, \cdot)$ is left to the reader.
EXAMPLE 16.9 Consider the groups $(\mathbf{Z}_2, +)$, $(\mathbf{Z}_3, +)$. On $G = \mathbf{Z}_2 \times \mathbf{Z}_3$, define $(a_1, b_1) \cdot (a_2, b_2) = (a_1 + a_2, b_1 + b_2)$. Then $G$ is a group of order 6 where the identity is $(0, 0)$, and the inverse, for example, of the element $(1, 2)$ is $(1, 1)$.
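The componentwise operation of Theorem 16.4 is easy to build from the two factor operations. The following sketch assumes each group is given as a finite list of elements plus a Python callable; `direct_product` is a hypothetical helper name.

```python
from itertools import product

def direct_product(g_elems, g_op, h_elems, h_op):
    """Return the elements and the componentwise operation of G x H."""
    elems = list(product(g_elems, h_elems))

    def op(x, y):
        # (g1, h1) . (g2, h2) = (g1 o g2, h1 * h2)
        return (g_op(x[0], y[0]), h_op(x[1], y[1]))

    return elems, op

# Z_2 x Z_3 from Example 16.9
elems, op = direct_product(range(2), lambda a, b: (a + b) % 2,
                           range(3), lambda a, b: (a + b) % 3)
identity = (0, 0)
inverse_of_1_2 = next(y for y in elems if op((1, 2), y) == identity)
```

The search for the inverse of $(1, 2)$ simply tries every element of the product, which is adequate for a group of order 6.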
EXERCISES 16.1
1. For each of the following sets, determine whether or not the set is a group under the stated binary operation. If so, determine its identity and the inverse of each of its elements. If it is not a group, state the condition(s) of the definition that it violates.
a) $\{-1, 1\}$ under multiplication
b) $\{-1, 1\}$ under addition
c) $\{-1, 0, 1\}$ under addition
d) $\{10n \mid n \in \mathbf{Z}\}$ under addition
e) The set of all one-to-one functions $g: A \to A$, where $A = \{1, 2, 3, 4\}$, under function composition
f) $\{a/2^n \mid a, n \in \mathbf{Z}, n \ge 0\}$ under addition
2. Prove parts (c) and (d) of Theorem 16.1.
3. Why is the set $\mathbf{Z}$ not a group under subtraction?
4. Let $G = \{q \in \mathbf{Q} \mid q \ne -1\}$. Define the binary operation $\circ$ on $G$ by $x \circ y = x + y + xy$. Prove that $(G, \circ)$ is an abelian group.
5. Define the binary operation $\circ$ on $\mathbf{Z}$ by $x \circ y = x + y + 1$. Verify that $(\mathbf{Z}, \circ)$ is an abelian group.
6. Let $S = \mathbf{R}^* \times \mathbf{R}$. Define the binary operation $\circ$ on $S$ by $(u, v) \circ (x, y) = (ux, vx + y)$. Prove that $(S, \circ)$ is a nonabelian group.
7. Find the elements in the groups $U_{20}$ and $U_{24}$, the groups of units for the rings $(\mathbf{Z}_{20}, +, \cdot)$ and $(\mathbf{Z}_{24}, +, \cdot)$, respectively.
8. For any group $G$ prove that $G$ is abelian if and only if $(ab)^2 = a^2 b^2$ for all $a, b \in G$.
9. If $G$ is a group, prove that for all $a, b \in G$,
a) $(a^{-1})^{-1} = a$
b) $(ab)^{-1} = b^{-1}a^{-1}$
10. Prove that a group $G$ is abelian if and only if for all $a, b \in G$, $(ab)^{-1} = a^{-1}b^{-1}$.
11. Find all subgroups in each of the following groups.
a) $(\mathbf{Z}_{12}, +)$
b) $(\mathbf{Z}_{11}^*, \cdot)$
c) $S_3$
12. a) How many rigid motions (in two or three dimensions) are there for a square?
b) Make a group table for these rigid motions like the one in Table 16.5 for the equilateral triangle. What is the identity for this group? Describe the inverse of each element geometrically.
13. a) How many rigid motions (in two or three dimensions) are there for a regular pentagon? Describe them geometrically.
b) Answer part (a) for a regular $n$-gon, $n \ge 3$.
14. In the group $S_5$, let
$\alpha = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 3 & 1 & 4 & 5 \end{pmatrix}$ and $\beta = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 1 & 5 & 3 & 4 \end{pmatrix}$.
Determine $\alpha\beta$, $\beta\alpha$, $\alpha^3$, $\beta^4$, $\alpha^{-1}$, $\beta^{-1}$, $(\alpha\beta)^{-1}$, $(\beta\alpha)^{-1}$, and $\beta^{-1}\alpha^{-1}$.
15. If $G$ is a group, let $H = \{a \in G \mid ag = ga \text{ for all } g \in G\}$. Prove that $H$ is a subgroup of $G$. (The subgroup $H$ is called the center of $G$.)
16. Let $\omega$ be the complex number $(1/\sqrt{2})(1 + i)$.
a) Show that $\omega^8 = 1$ but $\omega^n \ne 1$ for $n \in \mathbf{Z}^+$, $1 \le n \le 7$.
b) Verify that $\{\omega^n \mid n \in \mathbf{Z}^+, 1 \le n \le 8\}$ is an abelian group under multiplication.
17. a) Prove Theorem 16.4.
b) Extending the idea developed in Theorem 16.4 and Example 16.9 to the group $\mathbf{Z}_6 \times \mathbf{Z}_6 \times \mathbf{Z}_6 = \mathbf{Z}_6^3$, answer the following.
i) What is the order of this group?
ii) Find a subgroup of $\mathbf{Z}_6^3$ of order 6, one of order 12, and one of order 36.
iii) Determine the inverse of each of the elements $(2, 3, 4)$, $(4, 0, 2)$, $(5, 1, 2)$.
18. a) If $H$, $K$ are subgroups of a group $G$, prove that $H \cap K$ is also a subgroup of $G$.
b) Give an example of a group $G$ with subgroups $H$, $K$ such that $H \cup K$ is not a subgroup of $G$.
19. a) Find all $x$ in $(\mathbf{Z}_5^*, \cdot)$ such that $x = x^{-1}$.
b) Find all $x$ in $(\mathbf{Z}_{11}^*, \cdot)$ such that $x = x^{-1}$.
c) Let $p$ be a prime. Find all $x$ in $(\mathbf{Z}_p^*, \cdot)$ such that $x = x^{-1}$.
d) Prove that $(p - 1)! \equiv -1 \pmod{p}$, for $p$ a prime. [This result is known as Wilson's Theorem, although it was only conjectured by John Wilson (1741-1793). The first proof was given in 1770 by Joseph Louis Lagrange (1736-1813).]
20. a) Find $x$ in $(U_8, \cdot)$ where $x \ne 1$, $x \ne 7$ but $x = x^{-1}$.
b) Find $x$ in $(U_{16}, \cdot)$ where $x \ne 1$, $x \ne 15$ but $x = x^{-1}$.
c) Let $k \in \mathbf{Z}^+$, $k \ge 3$. Find $x$ in $(U_{2^k}, \cdot)$ where $x \ne 1$, $x \ne 2^k - 1$ but $x = x^{-1}$.
16.2
Homomorphisms, Isomorphisms, and Cyclic Groups
We turn our attention once again to functions that preserve structure.
EXAMPLE 16.10 Let $G = (\mathbf{Z}, +)$ and $H = (\mathbf{Z}_4, +)$. Define $f: G \to H$ by
$f(x) = [x] = \{x + 4k \mid k \in \mathbf{Z}\}$.
For all $x, y \in G$,
$f(x + y) = [x + y] = [x] + [y] = f(x) + f(y)$,
where the $+$ in $f(x + y)$ is the operation in $G$, the $+$ in $f(x) + f(y)$ is the operation in $H$, and the second equality follows from the way the addition of equivalence classes was developed in Section 14.3. Consequently, here $f$ preserves the group operations and is an example of a special type of function that we shall now define.
Definition 16.4 If $(G, \circ)$ and $(H, *)$ are groups and $f: G \to H$, then $f$ is called a group homomorphism if for all $a, b \in G$, $f(a \circ b) = f(a) * f(b)$.
When we know that the given structures are groups, the function f is simply called a
homomorphism.
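The defining condition of a homomorphism can be spot-checked numerically. The sketch below is only an illustration: `is_homomorphism` is a hypothetical helper, and because $\mathbf{Z}$ is infinite the check for the map $x \mapsto [x]$ from $(\mathbf{Z}, +)$ to $(\mathbf{Z}_4, +)$ runs over a finite sample of integers, so it can refute the property but not prove it.

```python
def is_homomorphism(f, domain, g_op, h_op):
    """Check f(a o b) == f(a) * f(b) for all pairs drawn from `domain`."""
    return all(f(g_op(a, b)) == h_op(f(a), f(b))
               for a in domain for b in domain)

sample = range(-20, 21)              # finite sample of Z (an assumption of this sketch)
add_z = lambda a, b: a + b           # operation in G = (Z, +)
add_z4 = lambda a, b: (a + b) % 4    # operation in H = (Z_4, +)

reduce_mod_4 = lambda x: x % 4       # x -> [x], classes represented by 0, 1, 2, 3
square_mod_4 = lambda x: (x * x) % 4 # a map that does not preserve the operation

ok = is_homomorphism(reduce_mod_4, sample, add_z, add_z4)
bad = is_homomorphism(square_mod_4, sample, add_z, add_z4)
```

The second map fails already at $a = b = 1$, since $(1 + 1)^2 \equiv 0$ while $1^2 + 1^2 \equiv 2 \pmod 4$.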
Some properties of homomorphisms are given in the following theorem.
THEOREM 16.5 Let $(G, \circ)$, $(H, *)$ be groups with respective identities $e_G$, $e_H$. If $f: G \to H$ is a homomorphism, then
a) $f(e_G) = e_H$.
b) $f(a^{-1}) = [f(a)]^{-1}$ for all $a \in G$.
c) $f(a^n) = [f(a)]^n$ for all $a \in G$ and all $n \in \mathbf{Z}$.
d) $f(S)$ is a subgroup of $H$ for each subgroup $S$ of $G$.
Proof:
a) $e_H * f(e_G) = f(e_G) = f(e_G \circ e_G) = f(e_G) * f(e_G)$, so by right-cancellation [Theorem 16.1(d)], it follows that $f(e_G) = e_H$.
b) & c) The proofs of these parts are left for the reader.
d) If $S$ is a subgroup of $G$, then $S \ne \emptyset$, so $f(S) \ne \emptyset$. Let $x, y \in f(S)$. Then $x = f(a)$, $y = f(b)$, for some $a, b \in S$. Since $S$ is a subgroup of $G$, it follows that $a \circ b \in S$, so $x * y = f(a) * f(b) = f(a \circ b) \in f(S)$. Finally, $x^{-1} = [f(a)]^{-1} = f(a^{-1}) \in f(S)$ because $a^{-1} \in S$ when $a \in S$. Consequently, by Theorem 16.2, $f(S)$ is a subgroup of $H$.
Definition 16.5 If $f: (G, \circ) \to (H, *)$ is a homomorphism, we call $f$ an isomorphism if it is one-to-one and onto. In this case $G$, $H$ are said to be isomorphic groups.
EXAMPLE 16.11 Let $f: (\mathbf{R}^+, \cdot) \to (\mathbf{R}, +)$ where $f(x) = \log_{10}(x)$. This function is both one-to-one and onto. (Verify these properties.) For all $a, b \in \mathbf{R}^+$, $f(ab) = \log_{10}(ab) = \log_{10} a + \log_{10} b = f(a) + f(b)$. Therefore, $f$ is an isomorphism and the group of positive real numbers under multiplication is abstractly the same as the group of all real numbers under addition. Here the function $f$ translates a problem in the multiplication of real numbers (a somewhat difficult problem without a calculator) into a problem dealing with the addition of real numbers (an easier arithmetic consideration). This was a major reason behind the use of logarithms before the advent of calculators.
EXAMPLE 16.12 Let $G$ be the group of complex numbers $\{1, -1, i, -i\}$ under multiplication. Table 16.6 shows the multiplication table for this group. With $H = (\mathbf{Z}_4, +)$, consider $f: G \to H$ defined by
$f(1) = [0]$, $f(-1) = [2]$, $f(i) = [1]$, $f(-i) = [3]$.
Then $f((i)(-i)) = f(1) = [0] = [1] + [3] = f(i) + f(-i)$, and $f((-1)(-i)) = f(i) = [1] = [2] + [3] = f(-1) + f(-i)$.
Although we have not checked all possible cases, the function is an isomorphism. Note that the image under $f$ of the subgroup $\{1, -1\}$ of $G$ is $\{[0], [2]\}$, a subgroup of $H$.
Table 16.6
 ·  |  1  -1   i  -i
----+---------------
 1  |  1  -1   i  -i
 -1 | -1   1  -i   i
 i  |  i  -i  -1   1
 -i | -i   i   1  -1
Let us take a closer look at this group $G$. Here $i^1 = i$, $i^2 = -1$, $i^3 = -i$, and $i^4 = 1$, so every element of $G$ is a power of $i$, and we say that $i$ generates $G$. This is denoted by $G = \langle i \rangle$. (It is also true that $G = \langle -i \rangle$. Verify this.)
The last part of the preceding example leads us to the following definition.
Definition 16.6 A group $G$ is called cyclic if there is an element $x \in G$ such that for each $a \in G$, $a = x^n$ for some $n \in \mathbf{Z}$.
EXAMPLE 16.13
a) The group $H = (\mathbf{Z}_4, +)$ is cyclic. Here the operation is addition, so we have multiples instead of powers. We find that both $[1]$ and $[3]$ generate $H$. For the case of $[3]$, we have $1 \cdot [3] = [3]$, $2 \cdot [3]$ $(= [3] + [3]) = [2]$, $3 \cdot [3] = [1]$, and $4 \cdot [3] = [0]$. Hence $H = \langle [3] \rangle = \langle [1] \rangle$.
b) Consider the multiplicative group $U_9 = \{1, 2, 4, 5, 7, 8\}$ that we examined in Example 16.4. Here we find that $2^1 = 2$, $2^2 = 4$, $2^3 = 8$, $2^4 = 7$, $2^5 = 5$, $2^6 = 1$, so $U_9$ is a cyclic group of order 6 and $U_9 = \langle 2 \rangle$. It is also true that $U_9 = \langle 5 \rangle$ because $5^1 = 5$, $5^2 = 7$, $5^3 = 8$, $5^4 = 4$, $5^5 = 2$, $5^6 = 1$.
The concept of a cyclic group leads to a related idea. Given a group $G$, if $a \in G$ consider the set $S = \{a^k \mid k \in \mathbf{Z}\}$. From Theorem 16.2 it follows that $S$ is a subgroup of $G$. This subgroup is called the subgroup generated by $a$ and is designated by $\langle a \rangle$. In Example 16.12 $\langle i \rangle = \langle -i \rangle = G$; also, $\langle -1 \rangle = \{-1, 1\}$ and $\langle 1 \rangle = \{1\}$. For part (a) of Example 16.13 we consider multiples instead of powers and find that $H = \langle [1] \rangle = \langle [3] \rangle$, $\langle [2] \rangle = \{[0], [2]\}$, and $\langle [0] \rangle = \{[0]\}$. When we examine the group $U_9$ in part (b) of that example we see that $U_9 = \langle 2 \rangle$ (or $\langle [2] \rangle$) $= \langle 5 \rangle$, $\langle 4 \rangle = \{1, 4, 7\} = \langle 7 \rangle$, $\langle 8 \rangle = \{1, 8\}$, and $\langle 1 \rangle = \{1\}$.
Definition 16.7 If $G$ is a group and $a \in G$, the order of $a$, denoted $o(a)$, is $|\langle a \rangle|$. (If $|\langle a \rangle|$ is infinite, we say that $a$ has infinite order.)
In Example 16.12, $o(1) = 1$, $o(-1) = 2$, whereas both $i$ and $-i$ have order 4.
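For an element of a finite group, the generated subgroup and the order can be found by listing successive powers until the identity appears. A minimal sketch follows, applied to the group of units modulo 9; the helper name `generated_subgroup` is an assumption, and the loop terminates only because the group is finite.

```python
def generated_subgroup(a, op, identity):
    """List a, a^2, a^3, ... up to and including the first power equal to the identity."""
    powers, x = [], a
    while True:
        powers.append(x)
        if x == identity:
            return powers
        x = op(x, a)

mul9 = lambda a, b: (a * b) % 9
u9 = [1, 2, 4, 5, 7, 8]

subgroups = {a: generated_subgroup(a, mul9, 1) for a in u9}
orders = {a: len(subgroups[a]) for a in u9}          # order of a = size of <a>
generators = [a for a in u9 if orders[a] == len(u9)]
```

The elements whose order equals the size of the whole group are exactly its generators.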
Let us take a second look at the idea of order for the case where $|\langle a \rangle|$ is finite. When $|\langle a \rangle| = 1$ then $a = e$ because $a = a^1 \in \langle a \rangle$ and $e = a^0 \in \langle a \rangle$. If $|\langle a \rangle|$ is finite but $a \ne e$, then $\langle a \rangle = \{a^m \mid m \in \mathbf{Z}\}$ is finite, so $\{a, a^2, a^3, \ldots\} = \{a^m \mid m \in \mathbf{Z}^+\}$ is also finite. Consequently, there exist $s, t \in \mathbf{Z}^+$, where $1 \le s < t$ and $a^s = a^t$, from which it follows that $a^{t-s} = e$, with $t - s \in \mathbf{Z}^+$. Since $e \in \{a^m \mid m \in \mathbf{Z}^+\}$, let $n$ be the smallest positive integer such that $a^n = e$. We claim that $\langle a \rangle = \{a, a^2, a^3, \ldots, a^{n-1}, a^n (= e)\}$.
First we observe that $|\{a, a^2, a^3, \ldots, a^{n-1}, a^n (= e)\}| = n$. Otherwise, we have $a^u = a^v$ for positive integers $u, v$ where $1 \le u < v \le n$, and then $a^{v-u} = e$ with $0 < v - u < n$. This, however, contradicts the minimality of $n$. So now we know that $|\langle a \rangle| \ge n$. But for each $k \in \mathbf{Z}$, it follows from the division algorithm that $k = qn + r$, where $0 \le r < n$, and so $a^k = a^{qn+r} = (a^n)^q (a^r) = (e^q)(a^r) = a^r \in \{a, a^2, a^3, \ldots, a^{n-1}, a^n (= e = a^0)\}$. Therefore, $\langle a \rangle = \{a, a^2, a^3, \ldots, a^{n-1}, a^n (= e)\}$ and we can also define $o(a)$ as the smallest positive integer $n$ for which $a^n = e$. This alternative definition for the order of a group element (of finite order) proves to be of value in the following theorem.
THEOREM 16.6 Let $a \in G$ with $o(a) = n$. If $k \in \mathbf{Z}$ and $a^k = e$, then $n \mid k$.
Proof: By the division algorithm (again), we have $k = qn + r$, for $0 \le r < n$, and so it follows that $e = a^k = a^{qn+r} = (a^n)^q (a^r) = (e^q)(a^r) = a^r$. If $0 < r < n$, we contradict the definition of $n$ as $o(a)$. Hence $r = 0$ and $k = qn$.
We now examine some further results on cyclic groups. The next example helps us to
motivate part (b) of Theorem 16.7.
EXAMPLE 16.14 It is known from part (b) of Example 16.13 that $U_9 = \{1, 2, 4, 5, 7, 8\} = \langle 2 \rangle$. We use this fact to define the function $f: U_9 \to (\mathbf{Z}_6, +)$ as follows:
$f(1) = [0]$, $f(2) = [1]$, $f(4) = [2]$,
$f(8) = f(2^3) = [3]$, $f(7) = f(2^4) = [4]$, $f(5) = f(2^5) = [5]$.
So, in general, for each $a \in U_9$ we write $a = 2^k$, for some $0 \le k \le 5$, and have $f(a) = f(2^k) = [k]$. This function $f$ is one-to-one and onto and we find, for example, that $f(2 \cdot 5) = f(1) = [0] = [1] + [5] = f(2) + f(5)$, and $f(7 \cdot 8) = f(2) = [1] = [4] + [3] = f(7) + f(8)$.
In general, for $a, b$ in $U_9$ we may write $a = 2^m$ and $b = 2^n$, where $0 \le m \le 5$ and $0 \le n \le 5$. It then follows that
$f(a \cdot b) = f(2^m \cdot 2^n) = f(2^{m+n}) = [m + n] = [m] + [n] = f(a) + f(b)$.
Consequently, the function $f$ is an isomorphism and the groups $U_9$ and $(\mathbf{Z}_6, +)$ are isomorphic.
[Note how the function $f$ links the generators of the two cyclic groups. Also note that the function $g: U_9 \to (\mathbf{Z}_6, +)$ where
$g(1) = [0]$, $g(5) = [1]$, $g(7) = g(5^2) = [2]$,
$g(8) = g(5^3) = [3]$, $g(4) = g(5^4) = [4]$, $g(2) = g(5^5) = [5]$
is another isomorphism between these two cyclic groups.]
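Both maps in this example send a power of a generator to its exponent, so they can be tabulated and tested exhaustively. The sketch below is illustrative; the dictionaries `f` and `g` and the helper `is_isomorphism` are hypothetical names, and classes in the additive group modulo 6 are represented by the integers 0 through 5.

```python
U9 = [1, 2, 4, 5, 7, 8]

# Send the k-th power of a generator to the class [k] (represented by k).
f = {pow(2, k, 9): k for k in range(6)}   # built from the generator 2
g = {pow(5, k, 9): k for k in range(6)}   # built from the generator 5

def is_isomorphism(phi):
    """Check that phi is a bijection from U_9 onto Z_6 preserving the operations."""
    bijective = set(phi) == set(U9) and sorted(phi.values()) == list(range(6))
    preserves = all(phi[(a * b) % 9] == (phi[a] + phi[b]) % 6
                    for a in U9 for b in U9)
    return bijective and preserves

f_is_iso = is_isomorphism(f)
g_is_iso = is_isomorphism(g)
```

Because the group has only six elements, checking all 36 pairs covers every case rather than the two sample products worked out by hand.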
THEOREM 16.7 Let $G$ be a cyclic group.
a) If $|G|$ is infinite, then $G$ is isomorphic to $(\mathbf{Z}, +)$.
b) If $|G| = n$, where $n > 1$, then $G$ is isomorphic to $(\mathbf{Z}_n, +)$.
Proof:
a) For $G = \langle a \rangle = \{a^k \mid k \in \mathbf{Z}\}$, let $f: G \to \mathbf{Z}$ be defined by $f(a^k) = k$. (Could we have $a^k = a^t$ with $k \ne t$? If so, $f$ would not be a function.) For $a^m, a^n \in G$, $f(a^m \cdot a^n) = f(a^{m+n}) = m + n = f(a^m) + f(a^n)$, so $f$ is a homomorphism. We leave to the reader the verification that $f$ is one-to-one and onto.
b) If $G = \langle a \rangle = \{a, a^2, \ldots, a^{n-1}, a^n = e\}$, then the function $f: G \to \mathbf{Z}_n$ defined by $f(a^k) = [k]$ is an isomorphism. (Verify this.)
EXAMPLE 16.15 If $G = \langle g \rangle$, $G$ is abelian because $g^m \cdot g^n = g^{m+n} = g^{n+m} = g^n \cdot g^m$ for all $m, n \in \mathbf{Z}$. The converse, however, is false. The group $H$ of Table 16.7 is abelian, and $o(e) = 1$, $o(a) = o(b) = o(c) = 2$. Since no element of $H$ has order 4, $H$ cannot be cyclic. (The group $H$ is the smallest noncyclic group and is known as the Klein Four group.)
Table 16.7
 · | e a b c
---+--------
 e | e a b c
 a | a e c b
 b | b c e a
 c | c b a e
Our last result concerns the structure of subgroups in a cyclic group.
THEOREM 16.8 Every subgroup of a cyclic group is cyclic.
Proof: Let $G = \langle a \rangle$. If $H$ is a subgroup of $G$, each element of $H$ has the form $a^k$, for some $k \in \mathbf{Z}$. For $H \ne \{e\}$, let $t$ be the smallest positive integer such that $a^t \in H$. (How do we know such an integer $t$ exists?) We claim that $H = \langle a^t \rangle$. Since $a^t \in H$, by the closure property for the subgroup $H$, $\langle a^t \rangle \subseteq H$. For the opposite inclusion, let $b \in H$, with $b = a^s$, for some $s \in \mathbf{Z}$. By the division algorithm, $s = qt + r$, where $q, r \in \mathbf{Z}$ and $0 \le r < t$. Consequently, $a^s = a^{qt+r}$ and so $a^r = a^{-qt} a^s = (a^t)^{-q} b$. $H$ is a subgroup of $G$, so $a^t \in H \Rightarrow (a^t)^{-q} \in H$. Then with $(a^t)^{-q}, b \in H$, it follows that $a^r = (a^t)^{-q} b \in H$. But if $a^r \in H$ with $r > 0$, then we contradict the minimality of $t$. Hence $r = 0$ and $b = a^{qt} = (a^t)^q \in \langle a^t \rangle$, so $H = \langle a^t \rangle$, a cyclic group.
8. In S; find an element of order n, for all 2 <n <5, Also de-
EXERCISES 16.2 termine the (cyclic) subgroup of S; that each of these elements
generates.
1. Prove parts (b) and (c) of Theorem 16.5.
2. Laa=|_f 01 a
9, a) Find all the elements of order 10 in (Z4y, +).
b) Let G = (a) be acyclic group of order 40. Which ele-
ments of G have order 10?
a) Determine A”, A’, and A‘.
b) Verify that {A, A*, A?, A*} is an abelian group under 10. a) Determine U4, the group of units for the ring
ordinary matrix multiplication. (Zy4, +, +).
c) Prove that the group in part (b) is isomorphic to the b) Show that U4 is cyclic and find all of its generators.
group shown in Table 16.6.
11. Verify that (Z*, -) is cyclic for the primes 5, 7, and 11.
3. If G = (Ze, +), H = (Z3, +), and K = (Z», +), find an
isomorphism for the groups H X K andG. 12, For a group G, prove that the function f: G + G defined
by f(a) = a is an isomorphism if and only if G is abelian.
4. Let f: G - H bea group homomorphism onto H. If G is
abelian, prove that H is abelian. 13. If f: G + H, g: H — K are homomorphisms, prove that
5. Let (ZX Z, @) be the abelian group where (a, b)@ the composite function go f:G— K, where (go f)(x) =
(c,d) = (a+c,b+d)—herea+c and b+d are computed g(f (x)), is a homomorphism.
using ordinary addition in Z—and let (G, +) be an addi- 14, For w = (1//2)(1 +i), let G be the multiplicative group
tive group. If f: Z x Z— G isa group homomorphism where {w"\Inée Zt, 1 <n <8}.
fd, 3) = g; and f(3, 7) = go, express f (4, 6) in terms of g;
a) Show that G is cyclic and find each elementx € G such
and 22-
that (x) = G.
6. Let f: (ZX Z, 6) — (Z, +) be the function defined by
b) Prove that G is isomorphic to the group (Zs, +).
f(x, y) =x — y. [Here (Z X Z, @) is the same group as in
Exercise 5, and (Z, +) is the group of integers under ordinary 15, a) Find all generators of the cyclic groups (Z12, +),
addition.] (Zio, +), and (Z4, +).
a) Prove that f is a homomorphism onto Z. b) Let G = (a) with c(a) =n, Prove that a*, k € Z*, gen-
b) Determine all (a, b) € Z X Z with f(a, b) = 0. erates G if and only if k and n are relatively prime.
c) Find f~'(7). c) If G is a cyclic group of order n, how many distinct
generators does it have?
d) If E = {2n|n € Z}, whatis f-'(E)?
7, Find the order of each element in the group of rigid motions 16. Let f: G — H be a group homomorphism. If a € G with
of (a) the equilateral triangle; and (b) the square. e(a) =n, and c(f(a)) = k (in A), prove that k|n.
16.3
Cosets and Lagrange’s Theorem
In the last two sections, for all finite groups $G$ and subgroups $H$ of $G$, we had $|H|$ dividing $|G|$. In this section we'll see that this was not mere chance but is true in general. To prove this we need one new idea.
Definition 16.8 If $H$ is a subgroup of $G$, then for each $a \in G$, the set $aH = \{ah \mid h \in H\}$ is called a left coset of $H$ in $G$. The set $Ha = \{ha \mid h \in H\}$ is a right coset of $H$ in $G$.
If the operation in $G$ is addition, we write $a + H$ in place of $aH$, where $a + H = \{a + h \mid h \in H\}$.
When the term coset is used in this chapter, it will refer to a left coset. For abelian groups
there is no need to distinguish between left and right cosets. However, at the end of the next
example we’ll see that this is not the case for nonabelian groups.
EXAMPLE 16.16 If $G$ is the group of Example 16.7 and $H = \{\pi_0, \pi_1, \pi_2\}$, the coset $r_1 H = \{r_1\pi_0, r_1\pi_1, r_1\pi_2\} = \{r_1, r_2, r_3\}$. Likewise we have $r_2 H = r_3 H = \{r_1, r_2, r_3\}$, whereas $\pi_0 H = \pi_1 H = \pi_2 H = H$.
We see that $|aH| = |H|$ for each $a \in G$ and that $G = H \cup r_1 H$ is a partition of $G$.
For the subgroup $K = \{\pi_0, r_1\}$, we find $r_2 K = \{r_2, \pi_2\}$ and $r_3 K = \{r_3, \pi_1\}$. Again a partition of $G$ arises: $G = K \cup r_2 K \cup r_3 K$. (Note: $K r_2 = \{\pi_0 r_2, r_1 r_2\} = \{r_2, \pi_1\} \ne r_2 K$.)
EXAMPLE 16.17 For $G = (\mathbf{Z}_{12}, +)$ and $H = \{[0], [4], [8]\}$, we find that
$[0] + H = \{[0], [4], [8]\} = [4] + H = [8] + H = H$
$[1] + H = \{[1], [5], [9]\} = [5] + H = [9] + H$
$[2] + H = \{[2], [6], [10]\} = [6] + H = [10] + H$
$[3] + H = \{[3], [7], [11]\} = [7] + H = [11] + H$,
and $H \cup ([1] + H) \cup ([2] + H) \cup ([3] + H)$ is a partition of $G$.
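The coset computation in Example 16.17 can be reproduced by forming $a + H$ for every $a$ and discarding repeats. This is a minimal sketch; the helper name `left_cosets` is hypothetical, and classes modulo 12 are represented by the integers 0 through 11.

```python
def left_cosets(group, subgroup, op):
    """Return the distinct left cosets aH, in order of first appearance."""
    cosets = []
    for a in group:
        coset = frozenset(op(a, h) for h in subgroup)
        if coset not in cosets:
            cosets.append(coset)
    return cosets

add12 = lambda a, b: (a + b) % 12
G = range(12)
H = [0, 4, 8]

cosets = left_cosets(G, H, add12)
covers_group = set().union(*cosets) == set(G)
pairwise_disjoint = sum(len(c) for c in cosets) == len(G)
```

The two final checks confirm that the four cosets found cover the group and do not overlap, that is, they form a partition.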
These examples now prepare us for the following results.
LEMMA 16.1 If $H$ is a subgroup of the finite group $G$, then for all $a, b \in G$, (a) $|aH| = |H|$; and (b) either $aH = bH$ or $aH \cap bH = \emptyset$.
Proof:
a) Since $aH = \{ah \mid h \in H\}$, it follows that $|aH| \le |H|$. If $|aH| < |H|$, we have $ah_i = ah_j$ with $h_i, h_j$ distinct elements of $H$. By left-cancellation in $G$ we then get the contradiction $h_i = h_j$, so $|aH| = |H|$.
b) If $aH \cap bH \ne \emptyset$, let $c = ah_1 = bh_2$, for some $h_1, h_2 \in H$. If $x \in aH$, then $x = ah$ for some $h \in H$, and so $x = (bh_2 h_1^{-1})h = b(h_2 h_1^{-1} h) \in bH$, and $aH \subseteq bH$. Similarly, $y \in bH \Rightarrow y = bh_3$, for some $h_3 \in H \Rightarrow y = (ah_1 h_2^{-1})h_3 = a(h_1 h_2^{-1} h_3) \in aH$, so $bH \subseteq aH$. Therefore $aH$ and $bH$ are either disjoint or identical.
We observe that if $g \in G$, then $g \in gH$ because $e \in H$. Also, by part (b) of Lemma 16.1, $G$ can be partitioned into mutually disjoint cosets.
At this point we are ready to prove the main result of this section.
THEOREM 16.9 Lagrange’s Theorem. If G is a finite group of order n with H a subgroup of order m, then
m divides n.
Proof: If H = G the result follows. Otherwise m < n and there exists an element a ∈ G − H.
Since a ∉ H, it follows that aH ≠ H, so aH ∩ H = ∅. If G = aH ∪ H, then |G| = |aH| +
|H| = 2|H| and the theorem follows. If not, there is an element b ∈ G − (H ∪ aH), with
bH ∩ H = ∅ = bH ∩ aH and |bH| = |H|. If G = bH ∪ aH ∪ H, we have |G| = 3|H|.
Otherwise we're back to an element c ∈ G with c ∉ bH ∪ aH ∪ H. The group G is finite,
so this process terminates and we find that G = a_1H ∪ a_2H ∪ ··· ∪ a_kH. Therefore, |G| =
k|H| and m divides n.
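The process used in this proof (pick an element not yet covered, adjoin its coset, repeat) can be traced on a small nonabelian group. The sketch below is an illustration only: it represents the elements of S_3 as tuples of images of 0, 1, 2, and the names compose and coset_decomposition are hypothetical helpers, not notation from the text.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation that applies q first and then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def coset_decomposition(G, H):
    """Repeatedly pick an element not yet covered and adjoin its left coset."""
    covered, cosets = set(), []
    for a in G:
        if a not in covered:
            aH = {compose(a, h) for h in H}
            cosets.append(aH)
            covered |= aH
    return cosets

G = list(permutations(range(3)))     # the six elements of S_3
H = [(0, 1, 2), (1, 0, 2)]           # the identity and one transposition
cosets = coset_decomposition(G, H)
print(len(G), "=", len(cosets), "*", len(H))
```

The process stops after k = 3 cosets, each of size |H| = 2, in agreement with |G| = k|H|.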
An alternative method for proving this theorem is given in Exercise 12 for this section.
We close with the statements of two corollaries. Their proofs are requested in the Section
Exercises.
COROLLARY 16.1 If G is a finite group and a ∈ G, then o(a) divides |G|.
COROLLARY 16.2 Every group of prime order is cyclic.
EXERCISES 16.3
1. Let G = S_4. (a) For α = [permutation illegible in source], find the subgroup H = ⟨α⟩.
(b) Determine the left cosets of H in G.
2. Answer Exercise 1 for the case where α is replaced by β = [permutation illegible in source].
3. If γ = [permutation illegible in source] ∈ S_4, how many cosets does ⟨γ⟩ determine?
4. For G = (Z_24, +), find the cosets determined by the subgroup H = ⟨[3]⟩. Do likewise for the
subgroup K = ⟨[4]⟩.
5. Let G be a group with subgroups H and K. If |G| = 660, |K| = 66, and K ⊂ H ⊂ G, what are
the possible values for |H|?
6. Let R be a ring with unity u. Prove that the units of R form a group under the
multiplication of the ring.
7. Let G = S_4, the symmetric group on four symbols, and let H be the subset of G where
H = { (1 2 3 4)  (1 2 3 4)  (1 2 3 4)  (1 2 3 4) }
    { (1 2 3 4), (2 1 4 3), (3 4 1 2), (4 3 2 1) }.
a) Construct a table to show that H is an abelian subgroup of G.
b) How many left cosets of H are there in G?
c) Consider the group (Z_2 × Z_2, ⊕) where (a, b) ⊕ (c, d) = (a + c, b + d), and the sums
a + c, b + d are computed using addition modulo 2. Prove that H is isomorphic to this group.
8. If G is a group of order n and a ∈ G, prove that a^n = e.
9. Let p be a prime. (a) If G has order 2p, prove that every proper subgroup of G is cyclic.
(b) If G has order p^2, prove that G has a subgroup of order p.
10. Prove Corollaries 16.1 and 16.2.
11. Let H and K be subgroups of a group G, where e is the identity of G.
a) Prove that if |H| = 10 and |K| = 21, then H ∩ K = {e}.
b) If |H| = m and |K| = n, with gcd(m, n) = 1, prove that H ∩ K = {e}.
12. The following provides an alternative way to establish Lagrange's Theorem. Let G be a
group of order n, and let H be a subgroup of G of order m.
a) Define the relation R on G as follows: If a, b ∈ G, then a R b if a^(-1)b ∈ H. Prove that
R is an equivalence relation on G.
b) For a, b ∈ G, prove that a R b if and only if aH = bH.
c) If a ∈ G, prove that [a], the equivalence class of a under R, satisfies [a] = aH.
d) For each a ∈ G, prove that |aH| = |H|.
e) Now establish the conclusion of Lagrange's Theorem, namely that |H| divides |G|.
13. a) Fermat's Theorem. If p is a prime, prove that a^p ≡ a (mod p) for each a ∈ Z. [How is
this related to Exercise 22(a) of Section 14.3?]
b) Euler's Theorem. For each n ∈ Z⁺, n > 1, and each a ∈ Z, prove that if gcd(a, n) = 1,
then a^φ(n) ≡ 1 (mod n).
c) How are the theorems in parts (a) and (b) related?
d) Is there any connection between these two theorems and the results in Exercises 6 and 8?
16.4 The RSA Cryptosystem (Optional)
This section provides us with an opportunity to use some of the theoretical ideas we
encountered in Sections 14.3 and 16.3 in a more contemporary application.
In Example 14.15 of Section 14.3 we introduced two private-key cryptosystems: the
cipher shift and the affine cipher. For an alphabet of m characters, the encryption function
E: Z_m → Z_m, for the cipher-shift cryptosystem, is given by E(θ) = (θ + κ) mod m, where
θ, κ ∈ Z_m, for κ (≠ 0) fixed. (Using κ = 0 would not alter any of the characters in a
message.) Consequently, there are m − 1 possibilities to examine in an attempt to discover the
value of the key κ. Further, once we know the value of κ, we also know the decryption
function D: Z_m → Z_m, for D(θ) = (θ − κ) mod m. In the case of the affine-cipher cryptosystem
(also with an alphabet of m characters) the encryption function E: Z_m → Z_m is now given
by E(θ) = (αθ + κ) mod m, where θ, α, κ ∈ Z_m, for fixed α, κ, with α invertible in Z_m [or,
equivalently, with gcd(α, m) = 1]. Here the decryption function D: Z_m → Z_m is given by
D(θ) = [α^(-1)(θ − κ)] mod m. Without prior knowledge of the key (α, κ), now one would
have to check mφ(m) possibilities to discover the appropriate values of α and κ for this
private-key cryptosystem.
The security of either of the above cryptosystems depends on having the key [be it κ or
(α, κ)] known only to the sender and the recipient of the messages.
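The two private-key ciphers recalled above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not code from the text: the alphabet size m = 26, the function names, and the helper totient are all choices made here, and the modular inverse uses the three-argument pow available in Python 3.8 and later.

```python
from math import gcd

M = 26  # assumed alphabet size m

def shift_encrypt(theta, kappa, m=M):
    return (theta + kappa) % m

def shift_decrypt(theta, kappa, m=M):
    return (theta - kappa) % m

def affine_encrypt(theta, alpha, kappa, m=M):
    if gcd(alpha, m) != 1:
        raise ValueError("alpha must be invertible in Z_m")
    return (alpha * theta + kappa) % m

def affine_decrypt(theta, alpha, kappa, m=M):
    alpha_inv = pow(alpha, -1, m)          # inverse of alpha in Z_m
    return (alpha_inv * (theta - kappa)) % m

def totient(m):
    """Euler's phi function, by direct count (fine for small m)."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)

print("shift keys to try: ", M - 1)
print("affine keys to try:", M * totient(M))
```

For m = 26 the key spaces are tiny, which is why exhaustive search defeats both ciphers once the method is known.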
The RSA cryptosystem is an example of a public-key cryptosystem. This cryptosystem
was developed in the 1970s (and patented in 1983) by Ronald Rivest (1947- ), Adi Shamir
(1952- ), and Leonard Adleman (1945- ). (Taking the first letter from the surname of each
of these three men provides the adjective RSA.)
We shall describe how this cryptosystem works and provide an example for encryption
and decryption. In so doing, we shall find ourselves using some of the results from Sections
14.3 and 16.3.
EXAMPLE 16.18
As with the two private-key cryptosystems, once again we have an alphabet of m characters.
We start with two distinct primes p, q. In practice, these should be large primes, each
with 100 or more digits. (However, for our example we shall use much smaller primes.)
After selecting the primes p, q, we then consider the integers n = pq and r = (p − 1)(q − 1)
= φ(p)φ(q) = φ(pq) = φ(n), and, at this point, we choose an invertible element
e in Z_r (= Z_φ(n)).
[Here, if the element e is chosen at random, then the only time we fail to obtain an
invertible element is when the element chosen is a multiple of p (there are q possibilities) or
a multiple of q (there are p possibilities). In this count of p + q elements we have accounted
for pq twice, so there are only p + q − 1 possibilities for failure. Hence, the probability for
failure is (p + q − 1)/(pq) = (1/q) + (1/p) − (1/(pq)), a very small number if p and q
each have 100 or more digits.]
For instance, consider p = 61, q = 127, with n = (61)(127) = 7747 and r = φ(61) ·
φ(127) = (60)(126) = 7560. Now suppose we select e as 17.
Consider the following message that we wish to encrypt.
INVEST IN BONDS
Using the same plaintext assignments as in part (b) of Example 14.15, here we would replace
the letter "I" by 08 (not merely 8). Then we replace "N" by 13. This provides us with the
first block of four digits, namely 0813, for the first two letters "IN". The assignment
for the complete message is as follows [where we have appended the letter "X" to the right
end, in order for the final block to have two letters (or, four digits)]:
I  N  V  E  S  T  I  N  B  O  N  D  S  X
08 13 21 04 18 19 08 13 01 14 13 03 18 23
We now encrypt each block B of four digits by the encryption function E, where E(B) =
B^e mod n. (This modular exponentiation can be carried out efficiently by using the
procedure in Example 14.16.) So here the domain of E is the concatenation of Z_26 with itself,
and we find that
0813^17 mod 7747 = 2169    2104^17 mod 7747 = 0628    1819^17 mod 7747 = 5540
0813^17 mod 7747 = 2169    0114^17 mod 7747 = 6560    1303^17 mod 7747 = 6401
1823^17 mod 7747 = 4829.
Consequently, the recipient of the encrypted assignment (for the given plaintext message)
receives the ciphertext
2169 0628 5540 2169 6560 6401 4829.
Now the question is: “How does the recipient decrypt the ciphertext received?”
Since e is a unit in Z_r (= Z_φ(n)), we can use the Euclidean algorithm (as in Example
14.13) to compute e^(-1) = d. Then we define the decryption function D, where D(C) =
C^d mod n, for a block C of four digits. Since e^(-1) = d, it follows that ed ≡ 1 (mod φ(n)),
that is, ed mod φ(n) = 1. Therefore, ed = kφ(n) + 1, for some k ∈ Z. Now recall the
argument given earlier for the probability that a randomly selected element e from Z_n is
invertible (or a unit in Z_n). For any block B of four digits, we consider B as an element of
Z_n; in fact, we consider B as a unit in Z_n. Since the units in the ring (Z_n, +, ·) form a
group of order φ(n) under multiplication, it follows from the result in Exercise 8 of Section
16.3 that B^(ed) = B^(kφ(n)+1) = (B^φ(n))^k B^1 ≡ B (mod n), or B^(ed) mod n = B. [This is also a
consequence of Euler's Theorem, as stated in part (b) of Exercise 13 in Section 16.3.]
Applying the result from the previous paragraph in our example we have p = 61, q =
127, n = pq = 7747, r = φ(n) = (p − 1)(q − 1) = (60)(126) = 7560, and e = 17. From
the Euclidean algorithm we calculate d = e^(-1) = 3113. Now we find, for instance, that
2169^3113 mod 7747 = 0813 and that 0628^3113 mod 7747 = 2104. Continuing, the recipient
determines the numeric assignment for the original plaintext and then the plaintext.
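The encryption and decryption steps of this example can be replayed with a short Python sketch. It is an illustration only, with toy primes that offer no security; the helper names to_blocks and from_blocks are hypothetical, the letter assignment A = 00, ..., Z = 25 follows the blocks shown in the example, and pow(e, -1, r) (Python 3.8+) stands in for the Euclidean-algorithm computation of d.

```python
p, q = 61, 127
n = p * q                  # public modulus, 7747
r = (p - 1) * (q - 1)      # phi(n) = 7560
e = 17                     # public exponent, a unit in Z_r
d = pow(e, -1, r)          # private exponent, the inverse of e in Z_r

def to_blocks(message):
    """Map A..Z to 00..25 and pack two letters into each four-digit block."""
    letters = [ch for ch in message.upper() if "A" <= ch <= "Z"]
    if len(letters) % 2:
        letters.append("X")            # pad the final block, as in the example
    codes = [ord(ch) - ord("A") for ch in letters]
    return [100 * codes[i] + codes[i + 1] for i in range(0, len(codes), 2)]

def from_blocks(blocks):
    """Unpack four-digit blocks back into letters."""
    return "".join(chr(b // 100 + ord("A")) + chr(b % 100 + ord("A")) for b in blocks)

plain_blocks = to_blocks("INVEST IN BONDS")
cipher_blocks = [pow(B, e, n) for B in plain_blocks]   # E(B) = B^e mod n
decrypted = [pow(C, d, n) for C in cipher_blocks]      # D(C) = C^d mod n

print(" ".join(f"{C:04d}" for C in cipher_blocks))
print(from_blocks(decrypted))
```

The three-argument pow performs modular exponentiation by repeated squaring, so the large exponent d causes no difficulty.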
Now what makes the RSA cryptosystem more secure than the private-key cryptosystems
we studied? First, we should relate that the RSA cryptosystem is not a private-key cryp-
tosystem. This system is an example of a public-key cryptosystem, where the key (n, e) is
made public. So it seems that all one needs to do to decrypt the encrypted assignment is
to determine d = e^(-1) in Z_r (= Z_φ(n)). Now it is time to realize that by knowing n we do
not immediately know r. For to be able to determine r = (p − 1)(q − 1), we need to know
p, q, the prime factors of n. And this is what makes this system so much more secure than
the other cryptosystems we mentioned. Determining the primes p, g, when they are 100
or more digits long, is not a feasible problem. However, as computer power continues to
improve, to keep the RSA cryptosystem secure, one may need to redefine the key using
primes with more and more digits.
In closing, we show how the problem of factoring the modulus n as pq is related to the
problem of determining r = (p − 1)(q − 1). We start by observing that
p + q = pq − (p − 1)(q − 1) + 1 = n − φ(n) + 1 = n − r + 1,
while (labeling the primes so that p > q)
p − q = √((p − q)^2) = √((p − q)^2 + 4pq − 4pq) = √((p + q)^2 − 4pq)
= √((p + q)^2 − 4n) = √((n − r + 1)^2 − 4n).
Then, from these two equations, we learn that
p = (1/2)[(p + q) + (p − q)] = (1/2)[(n − r + 1) + √((n − r + 1)^2 − 4n)]
and
q = (1/2)[(p + q) − (p − q)] = (1/2)[(n − r + 1) − √((n − r + 1)^2 − 4n)].
Consequently, when we know n and r, then we can readily determine the primes p, q such
that n = pq.
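The two formulas for p and q translate directly into integer arithmetic. The following sketch is an illustration, not part of the text; the function name primes_from_n_and_r is hypothetical, and math.isqrt (Python 3.8+) is used so that no floating-point rounding enters.

```python
from math import isqrt

def primes_from_n_and_r(n, r):
    """Recover (p, q) with p >= q from n = pq and r = (p - 1)(q - 1)."""
    s = n - r + 1                  # p + q
    disc = s * s - 4 * n           # (p - q)^2
    root = isqrt(disc)             # p - q, if disc is a perfect square
    if root * root != disc:
        raise ValueError("n and r do not come from a product of two primes")
    return (s + root) // 2, (s - root) // 2

print(primes_from_n_and_r(7747, 7560))   # the values from Example 16.18
```

For n = 7747 and r = 7560 this gives p + q = 188 and p − q = 66, hence the pair (127, 61).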
EXERCISES 16.4
The use of a computer algebra system is strongly recommended for the first four exercises.
1. Determine the ciphertext for the plaintext INVEST IN STOCKS, when using RSA encryption
with e = 7 and n = 2573.
2. Determine the ciphertext for the plaintext ORDER A PIZZA, when using RSA encryption
with e = 5 and n = 1459.
3. Determine the plaintext for the RSA ciphertext 1418 1436 2370 1102 1805 0250, if e = 11
and n = 2501.
4. Determine the plaintext for the RSA ciphertext 0986 3029 1134 1105 1232 2281 2967 0272
1818 2398 1153, if e = 17 and n = 3053.
5. Find the primes p, q if n = pq = 121,361 and φ(n) = 120,432.
6. Find the primes p, q if n = pq = 5,446,367 and φ(n) = 5,441,640.
16.5 Elements of Coding Theory
In this and the next four sections we introduce an area of applied mathematics called
algebraic coding theory. This theory was inspired by the fundamental paper of Claude
Shannon (1948) along with results by Marcel Golay (1949) and Richard Hamming (1950).
Since that time it has become an area of great interest where algebraic structures, probability,
and combinatorics all play a role.
Our coverage will be held to an introductory level as we seek to model the transmission
of information represented by strings of the signals 0 and 1.
In digital communications, when information is transmitted in the form of strings of 0’s
and 1’s, certain problems arise. As a result of “noise” in the channel, when a certain signal
is transmitted a different signal may be received, thus causing the receiver to make a wrong
decision. Hence we want to develop techniques to help us detect, and perhaps even correct,
transmission errors. However, we can only improve the chances of correct transmission;
there are no guarantees.
Our model uses a binary symmetric channel, as shown in Fig. 16.2. The adjective binary
appears because an individual signal is represented by one of the bits 0 or 1. When a
transmitter sends the signal 0 or 1 in such a channel, associated with either signal is a
(constant) probability p for incorrect transmission. When that probability p is the same for
both signals, the channel is called symmetric. Here, for example, we have probability p of
sending 0 and having 1 received. The probability of sending signal 0 and having it received
correctly is then 1 — p. All possibilities are illustrated in Fig. 16.2.
[Figure 16.2: The Binary Symmetric Channel. A transmitted signal 0 or 1 is received correctly
with probability 1 − p and incorrectly with probability p.]
EXAMPLE 16.19
Consider the string c = 10110. We regard c as an element of the group Z_2^5, formed from
the direct product of five copies of (Z_2, +). To shorten notation we write 10110 instead of
(1, 0, 1, 1, 0). When sending each bit (individual signal) of c through the binary symmetric
channel, we assume that the probability of incorrect transmission is p = 0.05, so that the
probability of transmitting c with no errors is (0.95)^5 ≈ 0.77.
Here, and throughout our discussion of coding theory, we assume that the transmission of
each signal does not depend in any way on the transmissions of prior signals. Consequently,
the probability of the occurrence of all of these independent events (in their prescribed
order) is given by the product of their individual probabilities.
What is the probability that the party receiving the five-bit message receives the string
r = 00110— that is, the original message with an error in the first position? The probability
of incorrect transmission for the first bit is 0.05, so with the assumption of independent
events, (0.05)(0.95)^4 ≈ 0.041 is the probability of sending c = 10110 and receiving r =
00110. With e = 10000, we can write c + e = r and interpret r as the result of the sum of
the original message c and the particular error pattern e = 10000. Since c, r, e ∈ Z_2^5 and
−1 = 1 in Z_2, we also have c + r = e and r + e = c.
In transmitting c = 10110, the probability of receiving r = 00100 is
(0.05)(0.95)^2(0.05)(0.95) ≈ 0.002,
so this multiple error is not very likely to occur.
Finally if we transmit c = 10110, what is the probability that r differs from c in exactly
two places? To answer this we sum the probabilities for each error pattern consisting of two
1's and three 0's. Each such pattern has probability 0.002. There are C(5, 2) such patterns, so
the probability of two errors in transmission is given by
C(5, 2)(0.05)^2(0.95)^3 ≈ 0.021.
These results lead us to the following theorem.
THEOREM 16.10 Let c ∈ Z_2^n. For the transmission of c through a binary symmetric channel with probability
p of incorrect transmission,
a) the probability of receiving r = c + e, where e is a particular error pattern consisting
of k 1's and (n − k) 0's, is p^k(1 − p)^(n−k).
b) the probability that (exactly) k errors are made in the transmission is
C(n, k) p^k(1 − p)^(n−k).†
In Example 16.19, the probability of making at most one error in the transmission of
c = 10110 is (0.95)^5 + C(5, 1)(0.05)(0.95)^4 ≈ 0.977. Thus the chance for multiple errors in
transmission will be considered negligible throughout the discussion in this chapter. Such
an assumption is valid when p is small. In actuality, a binary symmetric channel is considered
"good" when p < 10^(-5). However, no matter what else we stipulate, we always want p <
1/2.
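The probabilities computed for the five-bit word of Example 16.19 follow the binomial pattern of Theorem 16.10 and can be reproduced numerically. The sketch below is illustrative; the function names are hypothetical and math.comb requires Python 3.8 or later.

```python
from math import comb

def pattern_probability(n, k, p):
    """Probability of one particular error pattern with k 1's among n bits."""
    return p**k * (1 - p)**(n - k)

def exactly_k_errors(n, k, p):
    """Probability that exactly k of the n bits are received incorrectly."""
    return comb(n, k) * pattern_probability(n, k, p)

p = 0.05
no_errors = pattern_probability(5, 0, p)
two_errors = exactly_k_errors(5, 2, p)
at_most_one = exactly_k_errors(5, 0, p) + exactly_k_errors(5, 1, p)

print(round(no_errors, 2), round(two_errors, 3), round(at_most_one, 3))
```

Summing exactly_k_errors over k = 0, ..., n gives 1, a quick consistency check on the formula in part (b).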
To improve the accuracy of transmission in a binary symmetric channel, certain types
of coding schemes can be used where extra bits are provided.
For m, n ∈ Z⁺, let n > m. Consider ∅ ≠ W ⊆ Z_2^m. The set W consists of the messages to
be transmitted. To each w ∈ W are appended n − m extra bits to form the code word c, where
c ∈ Z_2^n. This process is called encoding and is represented by the function E: W → Z_2^n.
Then E(w) = c and E(W) = C ⊆ Z_2^n. Since the function E simply appends extra bits
to the (distinct) messages, the encoding process is one-to-one. Upon transmission, c is
received as T(c), where T(c) ∈ Z_2^n. Unfortunately, T is not a function because T(c) may be
different at different transmission times (for the noise in the channel changes with time). (See
Fig. 16.3.)
[Figure 16.3: Message w (an element of Z_2^m) → E → associated code word c = E(w) (an element
of Z_2^n) → T (binary symmetric channel) → the received word T(c) (an element of Z_2^n) → D →
the decoded result (an element of Z_2^m).]
Upon receiving T(c), we want to apply a decoding function D: Z_2^n → Z_2^m to remove the
extra bits and, we hope, obtain the original message w. Ideally D ∘ T ∘ E should be the
identity function on W, with D: C → W. Since this cannot be expected, we seek functions
E and D such that there is a high probability of correctly decoding the received word T(c)
and recapturing the original message w. In addition, we want the ratio m/n to be as large
as possible so that an excessive number of bits are not appended to w in getting the code
word c = E(w). This ratio m/n measures the efficiency of our scheme and is called the rate
of the code. Finally, the functions E and D should be more than theoretical results; they
must be practical in the sense that they can be implemented electronically.
In such a scheme, the functions E and D are called the encoding and decoding functions,
respectively, of an (n, m) block code.
† (Footnote to Theorem 16.10.) This is the binomial probability distribution that was developed
in (optional) Sections 3.5 and 3.7.
We illustrate these ideas in the following two examples.
EXAMPLE 16.20
Consider the (m + 1, m) block code for m = 8. Let W = Z_2^8. For each w = w_1w_2···w_8 ∈
W, define E: Z_2^8 → Z_2^9 by E(w) = w_1w_2···w_8w_9, where w_9 = w_1 + w_2 + ··· + w_8, with the
addition performed modulo 2. For example, E(11001101) = 110011011, and E(00110011) =
001100110.
For all w ∈ Z_2^8, E(w) contains an even number of 1's. So for w = 11010110 and E(w) =
110101101, if we receive T(c) = T(E(w)) as 100101101, from the odd number of 1's in
T(c) we know that a mistake has occurred in transmission. Hence we are able to detect
single errors in transmission. But we seem to have no way to correct such errors.
The probability of sending the code word 110101101 and making at most one error in
transmission is
(1 − p)^9 + C(9, 1) p(1 − p)^8,
where the first term accounts for all nine bits being correctly transmitted and the second for
one bit being changed in transmission, with an error detected.
For p = 0.001 this gives (0.999)^9 + C(9, 1)(0.001)(0.999)^8 ≈ 0.99996417.
If we detect an error and we are able to relay a signal back to the transmitter to repeat
the transmission of the code word, and continue this process until the received word has an
even number of 1's, then the probability of sending the code word 110101101 and receiving
the correct transmission is approximately 0.99996393.*
Should an even positive number of errors occur in transmission, T(c) is unfortunately
accepted as the correct code word and we interpret its first eight components as the original
message. This scheme is called the (m + 1, m) parity-check code and is appropriate only
when multiple errors are not likely to occur.
If we send the message 11010110 through the channel, we have probability (0.999)^8 ≈
0.99202794 of correct transmission. By using this parity-check code, we increase our
chances of getting the correct message to (approximately) 0.99996393. However, an extra
signal is sent (and perhaps additional transmissions are needed) and the rate of the code has
decreased from 1 to 8/9.
But suppose that instead of sending eight bits we sent 160 bits, in successive strings of
length 8. The chances of receiving the correct message without any coding scheme would be
(0.999)^160 ≈ 0.85207557. With the parity-check method we send 180 bits, but the chances
for correct transmission now increase to (0.999964)^20 ≈ 0.99928025.
* For p = 0.001 the probability that an odd number of errors occurs in the transmission of the
code word 110101101 is
p_odd = C(9, 1)(0.999)^8(0.001) + C(9, 3)(0.999)^6(0.001)^3 + C(9, 5)(0.999)^4(0.001)^5
        + C(9, 7)(0.999)^2(0.001)^7 + C(9, 9)(0.001)^9
      = 0.008928251 + 0.000000083 + 0.000000000 + 0.000000000 + 0.000000000 = 0.008928334.
With q = the probability of the correct transmission of 110101101 = (0.999)^9, the probability
that this code word is transmitted and correctly received under these conditions (of
retransmission) is then given by
q + p_odd q + (p_odd)^2 q + (p_odd)^3 q + ··· = q/(1 − p_odd) ≈ 0.99996393 (to eight decimal places).
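The parity-check encoding and the retransmission probability from the footnote can be sketched as follows. This is an illustration under stated assumptions: words are represented as Python strings of '0' and '1', and the names parity_encode and parity_ok are hypothetical.

```python
from math import comb

def parity_encode(w):
    """Append one bit so that the code word has an even number of 1's."""
    return w + str(w.count("1") % 2)

def parity_ok(received):
    """A received word passes the check when its number of 1's is even."""
    return received.count("1") % 2 == 0

p = 0.001
q = (1 - p) ** 9                       # all nine bits arrive correctly
# An odd number of errors is detected and triggers a retransmission.
p_odd = sum(comb(9, k) * p**k * (1 - p)**(9 - k) for k in range(1, 10, 2))
with_retransmission = q / (1 - p_odd)  # sum of the geometric series q * p_odd^j

print(parity_encode("11001101"), parity_ok("100101101"))
print(round(with_retransmission, 8))
```

An even number of errors leaves the parity unchanged, so parity_ok cannot flag it; this is the limitation noted in the example.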
EXAMPLE 16.21
The (3m, m) triple repetition code is one where we can both detect and correct single errors
in transmission. With m = 8 and W = Z_2^8, we define E: Z_2^8 → Z_2^24 by E(w_1w_2···w_7w_8) =
w_1w_2···w_8w_1w_2···w_8w_1w_2···w_8.
Hence if w = 10110111, then c = E(w) = 101101111011011110110111.
The decoding function D: Z_2^24 → Z_2^8 is carried out by the majority rule. For example, if
T(c) = 101001110011011110110110, then we have three errors occurring in positions 4,
9, and 24. We decode T(c) by examining the first, ninth, and seventeenth positions to see
which signal appears more times. Here it is 1 (which occurs twice), so we decode the first
entry in the decoded message as 1. Continuing with the entries in the second, tenth, and
eighteenth positions, the result for the second entry of the decoded message is 0 (which
occurs all three times). As we proceed, we recapture the correct message, 10110111.
Although we have more than one transmission error here, all is well unless two (or more)
errors occur with the second error eight or sixteen spaces after the first, that is, if two (or
more) incorrect transmissions occur for the same bit of the original message.
Now how does this scheme compare with the other methods we have? With p =
0.001, the probability of correctly decoding a single bit is (0.999)^3 + C(3, 1)(0.001)(0.999)^2 ≈
0.99999700. So the probability of receiving and correctly decoding the eight-bit message
is (0.99999700)^8 ≈ 0.99997600, just slightly better than the result from the parity-check
method (where we may have to retransmit, thus increasing the overall transmission time).
Here we transmit 24 signals for this message, so our rate is now 1/3. For this increased
accuracy and the ability to detect and now correct single errors (which we could not do in
any previous schemes), we may pay with an increase in transmission time. But we do not
waste time with retransmissions.
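Majority-rule decoding for a repetition code is short enough to sketch directly. The code below is an illustration, not part of the text; words are strings of '0' and '1', and the names encode and majority_decode are hypothetical.

```python
def encode(w, copies=3):
    """Repeat the whole message the given number of times."""
    return w * copies

def majority_decode(received, m, copies=3):
    """Decode each of the m message bits by a majority vote over its copies."""
    decoded = []
    for i in range(m):
        votes = [received[i + j * m] for j in range(copies)]
        decoded.append("1" if votes.count("1") > copies // 2 else "0")
    return "".join(decoded)

c = encode("10110111")
received = "101001110011011110110110"    # errors in positions 4, 9, and 24
print(majority_decode(received, 8))

p = 0.001
bit_ok = (1 - p) ** 3 + 3 * p * (1 - p) ** 2   # at most one bad copy of a bit
print(round(bit_ok, 8), round(bit_ok ** 8, 8))
```

Setting copies=5 gives the five-times repetition code, where each vote tolerates two bad copies.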
EXERCISES 16.5
1. Let C be a set of code words, where C ⊆ Z_2^7. In each of the following, two of e (error
pattern), r (received word) and c (code word) are given, with r = c + e. Determine the third term.
a) c = 1010110, r = 1011111
b) c = 1010110, e = 0101101
c) e = 0101111, r = 0000111
2. A binary symmetric channel has probability p = 0.05 of incorrect transmission. If the code
word c = 011011101 is transmitted, what is the probability that (a) we receive r = 011111101?
(b) we receive r = 111011100? (c) a single error occurs? (d) a double error occurs? (e) a triple
error occurs? (f) three errors occur, no two of them consecutive?
3. Let E: Z_2^3 → Z_2^9 be the encoding function for the (9, 3) triple repetition code.
a) If D: Z_2^9 → Z_2^3 is the corresponding decoding function, apply D to decode the received
words (i) 111101100; (ii) 000100011; (iii) 010011111.
b) Find three different received words r for which D(r) = 000.
c) For each w ∈ Z_2^3, what is |D^(-1)(w)|?
4. The (5m, m) five-times repetition code has encoding function E: Z_2^m → Z_2^(5m), where
E(w) = wwwww. Decoding with D: Z_2^(5m) → Z_2^m is accomplished by the majority rule. (Here we
are able to correct single and double errors made in transmission.)
a) With p = 0.05, what is the probability for the transmission and correct decoding of the
signal 0?
b) Answer part (a) for the message 110 in place of the signal 0.
c) For m = 2, decode the received word r = 0111001001.
d) If m = 2, find three received words r where D(r) = 00.
e) For m = 2 and D: Z_2^10 → Z_2^2, what is |D^(-1)(w)| for each w ∈ Z_2^2?
16.6 The Hamming Metric
In this section we develop the general principles for discussing the error-detecting and
error-correcting capabilities of a coding scheme. These ideas were developed by Richard
Wesley Hamming (1915-1998).
We start by considering a code C ⊆ Z_2^4, where c_1 = 0111, c_2 = 1111 ∈ C. Now both
the transmitter and the receiver know the elements of C. So if the transmitter sends c_1
but the person receiving the code word receives T(c_1) as 1111, then he or she feels that
c_2 was transmitted and makes whatever decision (a wrong one) c_2 implies. Consequently,
although only one transmission error was made, the results could be unpleasant. Why is
this? Unfortunately we have two code words that are almost the same. They are rather close
to each other, for they differ in only one component.
We describe this notion of closeness more precisely as follows.
Definition 16.9 For each element x = x_1x_2···x_n ∈ Z_2^n, where n ∈ Z⁺, the weight of x, denoted wt(x), is the
number of components x_i of x, for 1 ≤ i ≤ n, where x_i = 1. If y ∈ Z_2^n, the distance between
x and y, denoted d(x, y), is the number of components where x_i ≠ y_i, for 1 ≤ i ≤ n.
EXAMPLE 16.22
For n = 5, let x = 01001 and y = 11101. Then wt(x) = 2, wt(y) = 4, and d(x, y) = 2. In
addition, x + y = 10100, so wt(x + y) = 2. Is it just by chance that d(x, y) = wt(x + y)?
For each 1 ≤ i ≤ 5, x_i + y_i contributes a count of 1 to wt(x + y) ⇔ x_i ≠ y_i ⇔ x_i, y_i
contribute a count of 1 to d(x, y). [This is actually true for all n ∈ Z⁺, so wt(x + y) =
d(x, y) for all x, y ∈ Z_2^n.]
When x, y ∈ Z_2^n, we write d(x, y) = d(x_1, y_1) + d(x_2, y_2) + ··· + d(x_n, y_n) where,
for each 1 ≤ i ≤ n, d(x_i, y_i) = 0 if x_i = y_i, and d(x_i, y_i) = 1 if x_i ≠ y_i.
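Weight, distance, and the identity d(x, y) = wt(x + y) can be checked by brute force for small n. The sketch below is an illustration; it represents elements of Z_2^n as strings of '0' and '1', and the names wt, add, and d mirror the notation of the text but are otherwise assumptions.

```python
from itertools import product

def wt(x):
    """Weight: the number of components equal to 1."""
    return x.count("1")

def add(x, y):
    """Componentwise sum in Z_2^n."""
    return "".join(str((int(a) + int(b)) % 2) for a, b in zip(x, y))

def d(x, y):
    """Hamming distance: the number of components where x and y differ."""
    return sum(a != b for a, b in zip(x, y))

x, y = "01001", "11101"
print(wt(x), wt(y), d(x, y), add(x, y))

# Exhaustive check of d(x, y) = wt(x + y) over all of Z_2^5.
words = ["".join(bits) for bits in product("01", repeat=5)]
identity_holds = all(d(u, v) == wt(add(u, v)) for u in words for v in words)
print(identity_holds)
```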
LEMMA 16.2 For all x, y ∈ Z_2^n, wt(x + y) ≤ wt(x) + wt(y).
Proof: We prove this lemma by examining, for each 1 ≤ i ≤ n, the components x_i, y_i, x_i + y_i,
of x, y, x + y, respectively. Only one situation would cause this inequality to be false: if
x_i + y_i = 1 while x_i = 0 and y_i = 0, for some 1 ≤ i ≤ n. But this never occurs because
x_i + y_i = 1 implies that exactly one of x_i and y_i is 1.
In Example 16.22 we found that
wt(x + y) = wt(10100) = 2 ≤ 2 + 4 = wt(01001) + wt(11101) = wt(x) + wt(y).
THEOREM 16.11 The distance function d defined on Z_2^n × Z_2^n satisfies the following for all x, y, z ∈ Z_2^n.
a) d(x, y) ≥ 0
b) d(x, y) = 0 ⇔ x = y
c) d(x, y) = d(y, x)
d) d(x, z) ≤ d(x, y) + d(y, z)
Proof: We leave the first three parts for the reader and prove part (d).
In Z_2^n, y + y = 0, so d(x, z) = wt(x + z) = wt(x + (y + y) + z) = wt((x + y) +
(y + z)) ≤ wt(x + y) + wt(y + z), by Lemma 16.2. With wt(x + y) = d(x, y) and
wt(y + z) = d(y, z), the result follows. (This property is generally called the Triangle
Inequality.)
When a function satisfies the four properties listed in Theorem 16.11, it is called a
distance function or metric, and we call (Z_2^n, d) a metric space. Hence d (as given above)
is often referred to as the Hamming metric. This metric is used in the following.
Definition 16.10 For n, k ∈ Z⁺ and x ∈ Z_2^n, the sphere of radius k centered at x is defined as S(x, k) =
{y ∈ Z_2^n | d(x, y) ≤ k}.
EXAMPLE 16.23
For n = 3 and x = 110 ∈ Z_2^3, S(x, 1) = {110, 010, 100, 111} and S(x, 2) = {110, 010,
100, 111, 000, 101, 011}.
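The sphere S(x, k) can be enumerated by filtering all of Z_2^n on distance. The sketch below is illustrative and only practical for small n; the names distance and sphere are hypothetical, and words are strings of '0' and '1'.

```python
from itertools import product

def distance(x, y):
    """Hamming distance between two words of equal length."""
    return sum(a != b for a, b in zip(x, y))

def sphere(x, k):
    """All words of the same length as x lying within distance k of x."""
    words = ("".join(bits) for bits in product("01", repeat=len(x)))
    return {y for y in words if distance(x, y) <= k}

print(sorted(sphere("110", 1)))
print(sorted(sphere("110", 2)))
```

Only 001, at distance 3 from 110, is missing from the sphere of radius 2.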
With these preliminaries in hand we turn now to the two major results of this section.
THEOREM 16.12 Let E: W → C be an encoding function with the set of messages W ⊆ Z_2^m and the set of
code words E(W) = C ⊆ Z_2^n, where m < n. If our objective is error detection, then for
k ∈ Z⁺, we can detect all transmission errors of weight ≤ k if and only if the minimum
distance between code words is at least k + 1.
Proof: The set C is known to both the transmitter and the receiver, so if w ∈ W is the
message and c = E(w) is transmitted, let c ≠ T(c) = r. If the minimum distance between
code words is at least k + 1, then the transmission of c can result in as many as k errors
and r will not be listed in C. Hence we can detect all errors e where wt(e) ≤ k. Conversely,
let c_1, c_2 be code words with d(c_1, c_2) < k + 1. Then c_2 = c_1 + e where wt(e) ≤ k. If we
send c_1 and T(c_1) = c_2, then we would feel that c_2 had been sent, thus failing to detect an
error of weight ≤ k.
What can we say about error-correcting capability?
THEOREM 16.13 Let E, W, and C be as in Theorem 16.12. If our objective is error correction, then for
k ∈ Z⁺, we can construct a decoding function D: Z_2^n → W that corrects all transmission
errors of weight ≤ k if and only if the minimum distance between code words is at least
2k + 1.
Proof: For c ∈ C, consider S(c, k) = {x ∈ Z_2^n | d(c, x) ≤ k}. Define D: Z_2^n → W as follows.
If r ∈ Z_2^n and r ∈ S(c, k) for some code word c, then D(r) = w where E(w) = c. [Here
c is the (unique) code word nearest to r.] If r ∉ S(c, k) for any c ∈ C, then we define
D(r) = w_0, where w_0 is some arbitrary message that remains fixed once it is chosen.
The only problem we could face here is that D might not be a function. This will
happen if there is an element r in Z_2^n with r in both S(c_1, k) and S(c_2, k) for distinct
code words c_1, c_2. But r ∈ S(c_1, k) ⇒ d(c_1, r) ≤ k, and r ∈ S(c_2, k) ⇒ d(c_2, r) ≤ k, so
d(c_1, c_2) ≤ d(c_1, r) + d(r, c_2) ≤ k + k < 2k + 1. Consequently, if the minimum distance
between code words is at least 2k + 1, then D is a function, and it will decode all possible
received words, correcting any transmission error of weight ≤ k. Conversely, if c_1, c_2 ∈ C
and d(c_1, c_2) ≤ 2k, then c_2 can be obtained from c_1 by making at most 2k changes. Starting
at code word c_1 we make approximately half (exactly, ⌊d(c_1, c_2)/2⌋) of these changes.
This brings us to r = c_1 + e_1 with wt(e_1) ≤ k. Continuing from r, we make the remaining
changes to get to c_2 and find r + e_2 = c_2 with wt(e_2) ≤ k. But then r = c_2 + e_2. Now with
c_1 + e_1 = r = c_2 + e_2 and wt(e_1), wt(e_2) ≤ k, how can one decide on the code word from
which r arises? This ambiguity results in a possible error of weight ≤ k that cannot be
corrected.
EXAMPLE 16.24
With W = Z_2^2 let E: W → Z_2^6 be given by
E(00) = 000000    E(10) = 101010    E(01) = 010101    E(11) = 111111.
Then the minimum distance between code words is 3, so we can correct all single errors.
With
S(000000, 1) = {x ∈ Z_2^6 | d(000000, x) ≤ 1}
= {000000, 100000, 010000, 001000, 000100, 000010, 000001},
the decoding function D: Z_2^6 → W gives D(x) = 00 for all x ∈ S(000000, 1).
Similarly,
S(010101, 1) = {x ∈ Z_2^6 | d(010101, x) ≤ 1}
= {010101, 110101, 000101, 011101, 010001, 010111, 010100},
and here D(x) = 01 for each x ∈ S(010101, 1). At this point our definition of D accounts
for 14 of the elements in Z_2^6. Continuing to define D for the 14 elements in S(101010, 1) and
S(111111, 1) there remain 36 other elements to account for. We define D(x) = 00 (or any
other message) for these 36 other elements and have a decoding function that will correct
single errors.
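The decoding function of this example, built from spheres of radius 1 as in the proof of Theorem 16.13, can be sketched as follows. The code is illustrative; the dictionary ENCODE, the default message "00" for words outside every sphere, and the function names are assumptions consistent with the example but not taken verbatim from it.

```python
from itertools import combinations, product

ENCODE = {"00": "000000", "10": "101010", "01": "010101", "11": "111111"}

def distance(x, y):
    """Hamming distance between two words of equal length."""
    return sum(a != b for a, b in zip(x, y))

def minimum_distance(code):
    """Smallest distance between two distinct code words."""
    return min(distance(a, b) for a, b in combinations(code, 2))

def decode(r, k=1, default="00"):
    """Return w if r lies in S(E(w), k); otherwise return the fixed default."""
    for w, c in ENCODE.items():
        if distance(c, r) <= k:
            return w
    return default

words = ["".join(bits) for bits in product("01", repeat=6)]
outside = [r for r in words if all(distance(c, r) > 1 for c in ENCODE.values())]

print(minimum_distance(ENCODE.values()), len(outside))
print(decode("100000"), decode("010111"))
```

Because the minimum distance is 3 = 2(1) + 1, the four spheres of radius 1 are disjoint, so the loop in decode can match at most one code word.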
Beware! There is a subtle point that needs to be made about Theorems 16.12 and 16.13.
For example, if the minimum distance between code words is 2k + 1 one may feel that
we can detect all errors of weight ≤ 2k and correct all errors of weight ≤ k. This is not
necessarily true. That is, error detection and error correction need not take place at the same
time and at the maximum levels. To see this, reconsider the (6, 2)-triple repetition code of
Example 16.24. Here the encoding function E: W (= Z_2^2) → Z_2^6 is given by E(w_1w_2) =
w_1w_2w_1w_2w_1w_2 and the code comprises the four elements of Z_2^6 in the range of E. Since
the minimum distance between any two elements of Z_2^2 is 1, it follows that the minimum
distance between code words is 3 (as observed earlier in Example 16.24).
Now suppose that our major objective is error correction and that r = 100000 [∉ E(W)]
is received. We see that d(000000, r) = 1, d(101010, r) = 2, d(010101, r) = 4, and
d(111111, r) = 5. Consequently, we should choose to decode r as 000000, the unique
code word nearest to r. Unfortunately, suppose that the actual message were 10 (with
corresponding code word 101010), but we received r = 100000. Upon correcting r as 000000,
we should then decode 000000 to get the incorrect message 00. And, in so doing, we have
failed to detect an error of weight 2.
In this type of situation one can develop a scheme where a mixed strategy is used. Here
both error correction and error detection may be carried out at some levels.
For t ∈ N, if the received word is r and there is a unique code word c_1 such that
d(c_1, r) ≤ t, then we decode r as c_1. (Note: The case where r = c_1 is covered when t = 0.)
If there exists a second code word c_2 such that d(c_2, r) = d(c_1, r), or if d(c, r) > t for all
code words c, then an error is declared (and retransmission is generally requested). Using
this scheme, if the minimum distance between code words is at least 2t + s + 1, for s ∈ N,
then we can correct all errors of weight ≤ t and detect all errors with weights between
t + 1 and t + s, inclusive.
When using this scheme for the (6, 2)-triple repetition code, our options include:
1) t = 0; s = 2: Here we can detect all errors of weight ≤ 2 but we have no error-
correction capability.
2) t = 1; s = 0: Single errors are corrected here but there is no error-detecting capability.
If we use the (10, 2)-five-times repetition code, then the minimum distance is 5. Applying
the above scheme in this case, our options now include:
1) t = 0; s = 4: Here we can detect all errors of weight ≤ 4 but we have no error-
correction capability.
2) t = 1; s = 2: Now single errors are corrected and we can also detect all errors e,
where 2 ≤ wt(e) ≤ 3.
3) t = 2; s = 0: All errors of weight ≤ 2 are corrected but there is no error-detecting
capability.
[For more on this, the interested reader should examine Chapter 4 of the text by S. Roman
[24].]
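As an illustration of the mixed strategy, here is a small Python sketch for the (10, 2)-five-times repetition code. It is only a sketch under stated assumptions: the function name mixed_decode and the use of None to signal "error declared" are choices made here, not notation from the text.

```python
# Code words of the (10, 2)-five-times repetition code (minimum distance 5).
FIVE_TIMES = ["0000000000", "0000011111", "1111100000", "1111111111"]

def distance(x, y):
    """Hamming distance between two bit strings of equal length."""
    return sum(a != b for a, b in zip(x, y))

def mixed_decode(r, codewords, t):
    """Return the unique code word within distance t of r.

    Returns None (an error is declared) when no code word, or more than one,
    lies within distance t of the received word r.
    """
    near = [c for c in codewords if distance(c, r) <= t]
    return near[0] if len(near) == 1 else None

sent = "0000011111"
print(mixed_decode("1000011111", FIVE_TIMES, t=1))  # one error: corrected
print(mixed_decode("1100011111", FIVE_TIMES, t=1))  # two errors: None (detected)
print(mixed_decode("1110011111", FIVE_TIMES, t=1))  # three errors: None (detected)
print(mixed_decode("1100011111", FIVE_TIMES, t=2))  # with t = 2 the same word is corrected
```

With t = 1 an error of weight 4 can still slip through: the received word 0000010000 lies at distance 1 from 0000000000, so it would be decoded incorrectly, in keeping with the bound t + s = 3.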
16.7
The Parity-Check and Generator Matrices
In this section we introduce an example where the encoding and decoding functions are
given by matrices over Z2. One of these matrices will help us to locate the nearest code
word for a given received word. This will be especially helpful as the set C of code words
grows larger.
EXAMPLE 16.25
Let

    [ 1 0 0 1 1 0 ]
G = [ 0 1 0 0 1 1 ]
    [ 0 0 1 1 0 1 ]

be a 3 × 6 matrix over Z_2. The first three columns of G form the 3 × 3 identity matrix I_3.
Letting A denote the matrix formed from the last three columns of G, we write G = [I_3 | A]
to denote its structure. The (partitioned) matrix G is called a generator matrix.
We use G to define an encoding function E: Z_2^3 → Z_2^6 as follows. For w ∈ Z_2^3, E(w) =
wG is the element in Z_2^6 obtained by multiplying w, considered as a three-dimensional row
vector, by the matrix G on its right. Unlike the results on matrix multiplication in Chapter 7,
in the calculations here we have 1 + 1 = 0, not 1 + 1 = 1.
(Even if the set W of messages is not all of Z_2^3, we'll assume that all of Z_2^3 is encoded
and that the transmitter and receiver will both know the real messages of importance and
their corresponding code words.)
770 Chapter 16 Groups, Coding Theory, and Polya’s Method of Enumeration
We find here, for example, that

                         [ 1 0 0 1 1 0 ]
E(110) = (110)G = [110]  [ 0 1 0 0 1 1 ]  = [110101],
                         [ 0 0 1 1 0 1 ]

and

                         [ 1 0 0 1 1 0 ]
E(010) = (010)G = [010]  [ 0 1 0 0 1 1 ]  = [010011].
                         [ 0 0 1 1 0 1 ]

Note that E(110) can be obtained by adding the first two rows of G, whereas E(010) is
simply the second row of G.
The set of code words obtained by this method is
C = {000000, 100110, 010011, 001101, 110101, 101011, 011110, 111000} ⊆ Z_2^6,
and one can recapture the corresponding message by simply dropping the last three com-
ponents of the code word. In addition, the minimum distance between code words is 3, so
we can detect errors of weight ≤ 2 or correct single errors. (We shall assume that multiple
errors are rare and concentrate on error correction.)
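The encoding E(w) = wG is easy to reproduce by machine. The sketch below (Python; the helper names encode and as_string are assumptions made for this illustration) multiplies each message by the generator matrix of Example 16.25 with arithmetic modulo 2 and then computes the minimum distance by comparing all pairs of code words.

```python
from itertools import product

# Generator matrix G = [I_3 | A] of Example 16.25.
G = [[1, 0, 0, 1, 1, 0],
     [0, 1, 0, 0, 1, 1],
     [0, 0, 1, 1, 0, 1]]

def encode(w, G):
    """E(w) = wG over Z_2, where w is a tuple of bits (so 1 + 1 = 0)."""
    return tuple(sum(w[i] * G[i][j] for i in range(len(G))) % 2
                 for j in range(len(G[0])))

def as_string(bits):
    """Write a tuple of bits as a string such as '110101'."""
    return "".join(map(str, bits))

# Encode all eight messages of Z_2^3.
code = {as_string(w): as_string(encode(w, G)) for w in product((0, 1), repeat=3)}

min_distance = min(sum(a != b for a, b in zip(x, y))
                   for x in code.values() for y in code.values() if x != y)

print(code["110"], code["010"], min_distance)  # 110101 010011 3
```

Because G begins with the identity matrix, each code word starts with its own message, which is why dropping the last three components recovers w.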
For all w = w_1 w_2 w_3 ∈ Z_2^3, E(w) = w_1 w_2 w_3 w_4 w_5 w_6 ∈ Z_2^6. Since

                      [ 1 0 0 1 1 0 ]
E(w) = [w_1 w_2 w_3]  [ 0 1 0 0 1 1 ]
                      [ 0 0 1 1 0 1 ]

     = [w_1 w_2 w_3 (w_1 + w_3) (w_1 + w_2) (w_2 + w_3)],

we have w_4 = w_1 + w_3, w_5 = w_1 + w_2, w_6 = w_2 + w_3, and these equations are called the
parity-check equations. Since w_i ∈ Z_2 for each 1 ≤ i ≤ 6, it follows that w_i = −w_i and so
the equations can be rewritten as
w_1 + w_3 + w_4 = 0
w_1 + w_2 + w_5 = 0
w_2 + w_3 + w_6 = 0.
Thus we find that, with

    [ 1 0 1 1 0 0 ]
H = [ 1 1 0 0 1 0 ],
    [ 0 1 1 0 0 1 ]

H · [w_1 w_2 w_3 w_4 w_5 w_6]^tr = H · (E(w))^tr = [0 0 0]^tr,

where (E(w))^tr denotes the transpose of E(w). Consequently, if r = r_1 r_2 ··· r_6 ∈ Z_2^6, we
can identify r as a code word if and only if

H · r^tr = [0 0 0]^tr.

Writing H = [B | I_3], we notice that if the rows and columns of B are interchanged, then
we get A. Hence B = A^tr.
From the theory developed earlier on error correction, because the minimum distance
between the code words of this example is 3, we should be able to develop a decoding
function that corrects single errors.
Suppose we receive r = 110110. We want to find the code word c that is the nearest
neighbor of r. If there is a long list of code words against which to check r, we would be
better off to first examine H · r^tr, which is called the syndrome of r. Here

           [ 1 0 1 1 0 0 ]
H · r^tr = [ 1 1 0 0 1 0 ]  [1 1 0 1 1 0]^tr = [0 1 1]^tr,
           [ 0 1 1 0 0 1 ]

so r is not a code word. Hence we at least detect an error. Looking back at the list of
code words, we see that d(100110, r) = 1. For all other c ∈ C, d(r, c) ≥ 2. Writing r =
c + e = 100110 + 010000, we find that the transmission error (of weight 1) occurs in the
second component of r. Is it just a coincidence that the syndrome H · r^tr produced the
second column of H? If not, then we can use this result in order to realize that if a single
transmission error occurred, it took place at the second component. Changing the second
component of r, we get c; the message w comprises the first three components of c.
Let r = c + e, where c is a code word and e is an error pattern of weight 1. Suppose that
1 is in the ith component of e, where 1 ≤ i ≤ 6. Then
H · r^tr = H · (c + e)^tr = H · (c^tr + e^tr) = H · c^tr + H · e^tr.
With c a code word, it follows that H · c^tr = 0, so H · r^tr = H · e^tr = ith column of matrix
H. Thus c and r differ only in the ith component, and we can determine c by simply
changing the ith component of r.
Since we are primarily concerned with transmissions where multiple errors are rare, this
technique is of definite value. If we ask for more, however, we find ourselves expecting too
much.
Suppose that we receive r = 000111. Computing the syndrome

           [ 1 0 1 1 0 0 ]
H · r^tr = [ 1 1 0 0 1 0 ]  [0 0 0 1 1 1]^tr = [1 1 1]^tr,
           [ 0 1 1 0 0 1 ]

we obtain a result that is not one of the columns of H. Yet H · r^tr can be obtained as
the sum of two columns from H. If H · r^tr came from the first and sixth columns of H,
correcting these components in r results in the code word 100110. If we sum the third and
fifth columns of H to get this syndrome, upon changing the third and fifth components of
r we get a second code word, 001101. So we cannot expect H to correct multiple errors.
This is no surprise since the minimum distance between code words is 3.
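Both syndrome computations above can be replayed with a few lines of Python. This is a sketch for illustration only; the helpers syndrome and column are names introduced here, and columns are numbered from 1 as in the text.

```python
from itertools import combinations

# Parity-check matrix H = [A^tr | I_3] of Example 16.25.
H = [[1, 0, 1, 1, 0, 0],
     [1, 1, 0, 0, 1, 0],
     [0, 1, 1, 0, 0, 1]]

def syndrome(r, H):
    """H . r^tr over Z_2, for a received word r given as a bit string."""
    return tuple(sum(h * int(b) for h, b in zip(row, r)) % 2 for row in H)

def column(H, i):
    """The ith column of H, numbered from 1."""
    return tuple(row[i - 1] for row in H)

# r = 110110: the syndrome matches exactly one column, locating a single error.
single = [i for i in range(1, 7) if column(H, i) == syndrome("110110", H)]

# r = 000111: the syndrome matches no column, but several pairs of columns sum to it.
target = syndrome("000111", H)
pairs = [(i, j) for i, j in combinations(range(1, 7), 2)
         if tuple((a + b) % 2 for a, b in zip(column(H, i), column(H, j))) == target]

print(single)  # [2]
print(pairs)   # [(1, 6), (2, 4), (3, 5)]
```

The three pairs correspond to three different code words at distance 2 from 000111, which is the ambiguity that prevents correction of double errors.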
We summarize the results of Example 16.25 for the general situation. For m, n ∈ Z+
with m < n, the encoding function E: Z_2^m → Z_2^n is given by an m × n matrix G over Z_2.
This matrix G is called the generator matrix for the code and has the form [I_m | A], where
A is an m × (n − m) matrix. Here E(w) = wG for each message w ∈ Z_2^m, and the code
C = E(Z_2^m) ⊆ Z_2^n.
The associated parity-check matrix H is an (n − m) × n matrix of the form [A^tr | I_{n−m}].
This matrix can also be used to define the encoding function E, because if w = w_1 w_2 ··· w_m
∈ Z_2^m, then E(w) = w_1 w_2 ··· w_m w_{m+1} ··· w_n, where w_{m+1}, ..., w_n can be determined
from the set of n − m (parity-check) equations that arise from H · (E(w))^tr = 0, the column
vector of n − m 0's.
This unique parity-check matrix H also provides a decoding scheme that corrects single
errors in transmission if:
a) H does not contain a column of 0's. (If the ith column of H had all 0's and H · r^tr = 0
for a received word r, we couldn't decide whether r was a code word or a received
word whose ith component was incorrectly transmitted. We do not want to compare
r with all code words when C is large.)
b) No two columns of H are the same. (If the ith and jth columns of H are the same and
H · r^tr equals this repeated column, how would we decide which component of r to
change?)
When H satisfies these two conditions, we get the following decoding algorithm. For
each r ∈ Z_2^n, if T(c) = r, then:
1) With H · r^tr = 0, we feel that the transmission was correct and that r is the code word
that was transmitted. The decoded message then consists of the first m components
of r.
2) With H · r^tr equal to the ith column of H, we feel that there has been a single error
in transmission and change the ith component of r in order to get the code word c.
Here the first m components of c yield the original message.
3) If neither case 1 nor case 2 occurs, we feel that there has been more than one trans-
mission error and we cannot provide a reliable way to decode in this situation.
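The three cases of the decoding algorithm translate directly into code. The following Python sketch assumes that H satisfies conditions (a) and (b); the function name decode and the use of None for case 3 are choices made for this illustration.

```python
def decode(r, H, m):
    """Decode the received word r (a list of bits) with parity-check matrix H.

    Returns the first m components of the corrected word, or None when the
    syndrome is neither zero nor a column of H (case 3: no reliable decoding).
    """
    s = [sum(h * b for h, b in zip(row, r)) % 2 for row in H]
    if not any(s):                       # case 1: r is a code word
        return r[:m]
    columns = [list(col) for col in zip(*H)]
    if s in columns:                     # case 2: single error in component i
        i = columns.index(s)             # 0-based index of the matching column
        c = r[:i] + [1 - r[i]] + r[i + 1:]
        return c[:m]
    return None                          # case 3

# Parity-check matrix of Example 16.25, with m = 3 message bits.
H = [[1, 0, 1, 1, 0, 0],
     [1, 1, 0, 0, 1, 0],
     [0, 1, 1, 0, 0, 1]]

print(decode([1, 1, 0, 1, 1, 0], H, 3))  # [1, 0, 0]
print(decode([0, 0, 0, 1, 1, 1], H, 3))  # None
```

Because no two columns of H agree, the lookup `columns.index(s)` finds at most one candidate position, which is exactly what condition (b) guarantees.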
We close with one final comment on the matrix H. If we start with a parity-check matrix
H = [B | I_{n−m}] and use it, as described above, to define the function E, then we obtain
the same set of code words that is generated by the unique associated generator matrix
G = [I_m | B^tr].
EXERCISES 16.6 and 16.7

1. For Example 16.24, list the elements in S(101010, 1) and S(111111, 1).

2. Decode each of the following received words for Example 16.24.
   a) 110101   b) 101011   c) 001111   d) 110000

3. a) If x ∈ Z_2^10, determine |S(x, 1)|, |S(x, 2)|, |S(x, 3)|.
   b) For n, k ∈ Z+ with 1 ≤ k ≤ n, if x ∈ Z_2^n, what is |S(x, k)|?

4. Let E be an encoding function where the minimum distance between code words is 9.
   What is the largest value of k such that we can detect errors of weight ≤ k? If we
   wish to correct errors of weight ≤ n, what is the maximum value for n?

5. For each of the following encoding functions, find the minimum distance between
   the code words. Discuss the error-detecting and error-correcting capabilities of
   each code.
   a) E: Z_2^2 → Z_2^5
      00 → 00001   01 → 01010
      10 → 10100   11 → 11111
   b) E: Z_2^2 → Z_2^10
      00 → 0000000000   01 → 0000011111
      10 → 1111100000   11 → 1111111111
   c) E: Z_2^3 → Z_2^6
      000 → 000111   001 → 001001
      010 → 010010   011 → 011100
      100 → 100100   101 → 101010
      110 → 110001   111 → 111000
   d) E: Z_2^3 → Z_2^8
      000 → 00011111   001 → 00111010
      010 → 01010101   011 → 01110000
      100 → 10001101   101 → 10101000
      110 → 11000100   111 → 11100011

6. a) Use the parity-check matrix H of Example 16.25 to decode the following
      received words.
      i) 111101   ii) 110101   iii) 001111   iv) 100100
      v) 110001   vi) 111111   vii) 111100   viii) 010100
   b) Are all the results in part (a) uniquely determined?

7. The encoding function E: Z_2^2 → Z_2^5 is given by the generator matrix

       G = [ 1 0 1 1 0 ]
           [ 0 1 0 1 1 ]

   a) Determine all code words. What can we say about the error-detection capability
      of this code? What about its error-correction capability?
   b) Find the associated parity-check matrix H.
   c) Use H to decode each of the following received words.
      i) 11011   ii) 10101   iii) 11010
      iv) 00111   v) 11101   vi) 00110

8. Define the encoding function E: Z_2^3 → Z_2^6 by means of the parity-check matrix

           [ 1 0 1 1 0 0 ]
       H = [ 1 1 0 0 1 0 ]
           [ 1 0 1 0 0 1 ]

   a) Determine all code words.
   b) Does this code correct all single errors in transmission?

9. Find the generator and parity-check matrices for the (9, 8) single parity-check
   coding scheme of Example 16.20.

10. a) Show that the 1 × 9 matrix G = [1 1 1 ... 1] is the generator matrix for the
       (9, 1) nine-times repetition code.
    b) What is the associated parity-check matrix H in this case?

11. For an (n, m) code C with generator matrix G = [I_m | A] and parity-check matrix
    H = [A^tr | I_{n−m}], the (n, n − m) code C^d with generator matrix [I_{n−m} | A^tr] and
    parity-check matrix [A | I_m] is called the dual code of C. Show that the codes in
    each of Exercises 9 and 10 constitute a pair of dual codes.

12. Given n ∈ Z+, let the set M(n, k) ⊆ Z_2^n contain the maximum number of code words
    of length n, where the minimum distance between code words is 2k + 1. Prove that

        2^n / [C(n, 0) + C(n, 1) + ··· + C(n, 2k)] ≤ |M(n, k)| ≤ 2^n / [C(n, 0) + C(n, 1) + ··· + C(n, k)].

    (The upper bound on |M(n, k)| is called the Hamming bound; the lower bound is
    referred to as the Gilbert bound.)
16.8
Group Codes:
Decoding with Coset Leaders
Now that we’ve examined some introductory material on coding theory, it is time to see
how the group structure enters the picture.
Definition 16.11 Let E: Z_2^m → Z_2^n, for n > m, be an encoding function. The code C = E(Z_2^m) is called a
group code if C is a subgroup of Z_2^n.
Recall the encoding function E: Z_2^2 → Z_2^6 (of Example 16.24) where
E(00) = 000000   E(10) = 101010   E(01) = 010101   E(11) = 111111.
Here Z_2^2 and Z_2^6 are groups under componentwise addition modulo 2; the subset C =
E(Z_2^2) = {000000, 101010, 010101, 111111} is a subgroup of Z_2^6, and an example of a
group code. (Note that C contains 000000, the zero element of Z_2^6.)
In general when the code words form a group, we find that it is easier to compute the
minimum distance between code words.
THEOREM 16.14 In a group code, the minimum distance between distinct code words is the minimum of the
weights of the nonzero elements of the code.
Proof: Let a, b, c ∈ C where a ≠ b, d(a, b) is minimum, and c is nonzero with minimum
weight. By closure in the group C, a + b is a code word. Since d(a, b) = wt(a + b), by
the choice of c we have d(a, b) ≥ wt(c). Also, wt(c) = d(c, 0), where 0 is a code word
because C is a group. Then d(c, 0) ≥ d(a, b) by the choice of a, b, so wt(c) ≥ d(a, b).
Consequently, d(a, b) = wt(c).
If C is a set of code words and |C| = 1024, we have to compute C(1024, 2) = 523,776
distances to find the minimum distance between code words. But if we can recognize that
C possesses a group structure, we need only compute the weights of the 1023 nonzero
elements of C.
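Theorem 16.14 and the saving it brings can be checked on the code of Example 16.25. The Python sketch below is an illustration only; the helper names wt, d, and add are assumptions introduced here.

```python
from itertools import combinations
from math import comb

# The group code of Example 16.25.
C = ["000000", "100110", "010011", "001101",
     "110101", "101011", "011110", "111000"]

def wt(x):
    """Weight: the number of 1's in x."""
    return x.count("1")

def d(x, y):
    """Hamming distance between x and y."""
    return sum(a != b for a, b in zip(x, y))

def add(x, y):
    """Componentwise addition modulo 2."""
    return "".join(str(int(a) ^ int(b)) for a, b in zip(x, y))

is_closed = all(add(a, b) in C for a in C for b in C)        # closure, so C is a subgroup
min_distance = min(d(a, b) for a, b in combinations(C, 2))   # 28 pairwise distances
min_weight = min(wt(c) for c in C if wt(c) > 0)              # only 7 weights

print(is_closed, min_distance, min_weight)  # True 3 3
print(comb(1024, 2))                        # 523776
```

For this small code the saving is 28 distance computations against 7 weight computations; for |C| = 1024 it is 523,776 against 1023.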
Is there some way to guarantee that the code words form a group? By Theorem 16.5(d),
the homomorphic image of a subgroup is a subgroup, so if E: Z_2^m → Z_2^n is a group homo-
morphism, then C = E(Z_2^m) will be a subgroup of Z_2^n. Our next result will use this fact to
show that the codes we obtain when using a generator matrix G or a parity-check matrix H
are group codes. Furthermore, the proof of this result reconfirms the observation we made
(at the end of the previous section) about the code that arises from a generator matrix G or
its associated parity-check matrix H.
THEOREM 16.15 Let E: Z_2^m → Z_2^n be an encoding function given by a generator matrix G or the associated
parity-check matrix H. Then C = E(Z_2^m) is a group code.
Proof: We establish these results by proving that the function E arising from G or H is a
group homomorphism.
If x, y ∈ Z_2^m, then E(x + y) = (x + y)G = xG + yG = E(x) + E(y). Hence E is a
homomorphism and C = E(Z_2^m) is a group code [by virtue of part (d) of Theorem 16.5].
For the case of H, if x is a message, then E(x) = x_1 x_2 ··· x_m x_{m+1} ··· x_n, where x =
x_1 x_2 ··· x_m ∈ Z_2^m and H · (E(x))^tr = 0. In particular, E(x) is uniquely determined by
these two properties. If y is also a message, then x + y is likewise, and E(x + y) has
(x_1 + y_1), (x_2 + y_2), ..., (x_m + y_m) as its first m components, as does E(x) + E(y). Fur-
ther, H · (E(x) + E(y))^tr = H · (E(x)^tr + E(y)^tr) = H · E(x)^tr + H · E(y)^tr = 0 + 0 =
0. Since E(x + y) is the unique element of Z_2^n with (x_1 + y_1), (x_2 + y_2), ..., (x_m + y_m)
as its first m components and with H · (E(x + y))^tr = 0, it follows that E(x + y) =
E(x) + E(y). So E is a group homomorphism and, consequently, C = {c ∈ Z_2^n | H · c^tr
= 0} is a group code.
Now we use the group structure of C, together with its cosets in Z_2^n, to develop a scheme
for decoding. Our example uses the code developed in Example 16.25, but the procedure
applies for every group code.
EXAMPLE 16.26
We develop a table for decoding as follows.
1) First list in a row the elements of the group code C, starting with the identity.
000000 100110 010011 001101 110101 101011 011110 111000.
2) Next select an element x of Z_2^6 (Z_2^n, in general) where x does not appear anywhere
in the table developed so far and has minimum weight. Then list the elements of the
coset x + C, with x + c directly below c for each c ∈ C. For x = 100000 we have
000000 100110 010011 001101 110101 101011 011110 111000
100000 000110 110011 101101 010101 001011 111110 011000.
3) Repeat step (2) until the cosets provide a partition of Z_2^6 (Z_2^n, in general). This results
in the decoding table shown in Table 16.8.
4) Once the decoding table is constructed, for each received word r we find the column
containing r and use the first three components of the code word c at the top of the
column to decode r.
Table 16.8 Decoding Table for the Code of Example 16.25
000000 100110 010011 001101 110101 101011 011110 111000
100000 000110 110011 101101 010101 001011 111110 011000
010000 110110 000011 011101 100101 111011 001110 101000
001000 101110 011011 000101 111101 100011 010110 110000
000100 100010 010111 001001 110001 101111 011010 111100
000010 100100 010001 001111 110111 101001 011100 111010
000001 100111 010010 001100 110100 101010 011111 111001
010100 110010 000111 011001 100001 111111 001010 101100
From the table we find that the code words for the received words
r_1 = 101001   r_2 = 111010   r_3 = 001001   r_4 = 111011
are
c_1 = 101011   c_2 = 111000   c_3 = 001101   c_4 = 101011,
respectively. From these results the respective messages are
w_1 = 101   w_2 = 111   w_3 = 001   w_4 = 101.
The entries in the first column of Table 16.8 are called the coset leaders. For the first
seven rows, the coset leaders are the same in all tables, with some permutations of rows
possible. However, for the last row, either 100001 or 001010 could have been used in place
of 010100 because they also have minimum weight 2. So the table need not be unique.
[As a result, not all double errors can be corrected because there may not be a unique code
word at a minimum distance for each r in the last coset (the one with coset leader 010100).
For example, r = 001010 has three closest code words (at distance 2) — namely, 000000,
101011, and 011110.]
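Steps (1) through (4) of Example 16.26 can be sketched in Python as follows. The names decoding_table and decode are assumptions made here. Ties among words of equal weight are broken in lexicographic order, an arbitrary choice, so the rows may appear in a different order than in Table 16.8 and the weight-2 coset leader may differ, as the remarks above allow.

```python
from itertools import product

# The group code of Example 16.25, identity first.
C = ["000000", "100110", "010011", "001101",
     "110101", "101011", "011110", "111000"]

def add(x, y):
    """Componentwise addition modulo 2."""
    return "".join(str(int(a) ^ int(b)) for a, b in zip(x, y))

def decoding_table(C, n):
    """Rows are the cosets x + C; each leader x has minimum weight among unused words."""
    words = sorted(("".join(b) for b in product("01", repeat=n)),
                   key=lambda w: w.count("1"))
    table, seen = [], set()
    for x in words:
        if x not in seen:            # x has not yet appeared in the table
            row = [add(x, c) for c in C]
            table.append(row)
            seen.update(row)
    return table

def decode(r, table):
    """Return the code word at the top of the column containing r."""
    for row in table:
        if r in row:
            return table[0][row.index(r)]

table = decoding_table(C, 6)
print([decode(r, table) for r in ("101001", "111010", "001001", "111011")])
# ['101011', '111000', '001101', '101011']
```

The four received words lie in cosets whose leaders have weight 1, so the result does not depend on how the final row is chosen.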
How do the coset leaders really help us? It seems that the code words in the first row are
what we used to decode r_1, r_2, r_3, and r_4 above.
Consider the received words r_1 = 101001 and r_2 = 111010 in the sixth row, where the
coset leader is x = 000010. Computing syndromes, we find that

H · (r_1)^tr = [0 1 0]^tr = H · (r_2)^tr = H · x^tr.

This is not just a coincidence.
THEOREM 16.16 Let C ⊆ Z_2^n be a group code for a parity-check matrix H, and let r_1, r_2 ∈ Z_2^n. For the table
of cosets of C in Z_2^n, r_1 and r_2 are in the same coset of C if and only if H · (r_1)^tr = H · (r_2)^tr.
Proof: If r_1 and r_2 are in the same coset, then r_1 = x + c_1 and r_2 = x + c_2, where x is
the coset leader, and c_1 and c_2 are the code words at the tops of the respective columns
for r_1 and r_2. Then H · (r_1)^tr = H · (x + c_1)^tr = H · x^tr + H · (c_1)^tr = H · x^tr + 0 = H · x^tr
because c_1 is a code word. Likewise, H · (r_2)^tr = H · x^tr, so r_1, r_2 have the same syndrome.
Conversely, H · (r_1)^tr = H · (r_2)^tr ⇒ H · (r_1 + r_2)^tr = 0 ⇒ r_1 + r_2 is a code word c. Hence
r_1 + r_2 = c, so r_1 = r_2 + c and r_1 ∈ r_2 + C. Since r_2 ∈ r_2 + C, we have r_1, r_2 in the same
coset.
In decoding received words, when Table 16.8 is used we must search through 64 ele-
ments to find a given received word. For C ⊆ Z_2^12 there are 4096 strings, each with 12 bits.
Such a searching process is tedious, so perhaps we should be thinking about having a
computer do the searching. Presently it appears that this means storing the entire table:
6 × 64 = 384 bits of storage for Table 16.8; 12 × 4096 = 49,152 bits for C ⊆ Z_2^12. We
should like to improve this situation. Before things get better, however, they'll look worse
as we enlarge Table 16.8, as shown in Table 16.9. This new table includes to the left of the
coset leaders (the transposes of) the syndromes for each row.
Table 16.9 Decoding Table 16.8 with Syndromes
000  000000 100110 010011 001101 110101 101011 011110 111000
110  100000 000110 110011 101101 010101 001011 111110 011000
011  010000 110110 000011 011101 100101 111011 001110 101000
101  001000 101110 011011 000101 111101 100011 010110 110000
100  000100 100010 010111 001001 110001 101111 011010 111100
010  000010 100100 010001 001111 110111 101001 011100 111010
001  000001 100111 010010 001100 110100 101010 011111 111001
111  010100 110010 000111 011001 100001 111111 001010 101100
Now we can decode a received word r by the following procedure.
1) Compute the syndrome H · r^tr.
2) Find the coset leader x to the right of H · r^tr.
3) Add x to r to get c. (The code word c that we are seeking at the top of the column
containing r satisfies c + x = r, or c = x + r.)
Consequently, all that is needed from Table 16.9 are the first two columns, which will
require (3)(8) + (6)(8) = 72 storage bits. With 18 more storage bits for H we can store
what we need for this decoding process, called decoding by coset leaders, in 90 storage
bits, as opposed to the original estimate of 384 bits.
Applying this procedure to r = 110110, we find the syndrome

H · r^tr = [0 1 1]^tr.

Since 011 is to the left of the coset leader x = 010000, the code word c = x + r =
010000 + 110110 = 100110, from which we recapture the original message, 100.
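Decoding by coset leaders needs only H and the first two columns of Table 16.9. The sketch below (Python) stores those two columns as a dictionary; the names LEADERS, syndrome, add, and decode are assumptions made for this illustration.

```python
# Parity-check matrix of Example 16.25.
H = [[1, 0, 1, 1, 0, 0],
     [1, 1, 0, 0, 1, 0],
     [0, 1, 1, 0, 0, 1]]

# First two columns of Table 16.9: syndrome -> coset leader.
LEADERS = {"000": "000000", "110": "100000", "011": "010000", "101": "001000",
           "100": "000100", "010": "000010", "001": "000001", "111": "010100"}

def syndrome(r):
    """H . r^tr over Z_2, written as a bit string."""
    return "".join(str(sum(h * int(b) for h, b in zip(row, r)) % 2) for row in H)

def add(x, y):
    """Componentwise addition modulo 2."""
    return "".join(str(int(a) ^ int(b)) for a, b in zip(x, y))

def decode(r):
    """Steps 1 to 3: syndrome, coset leader x, then c = x + r; returns (c, message)."""
    x = LEADERS[syndrome(r)]
    c = add(x, r)
    return c, c[:3]

print(decode("110110"))  # ('100110', '100')
```

Unlike the column-matching decoder of Section 16.7, this lookup returns an answer for every received word, including those with syndrome 111.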
The code here is a group code where the minimum weight of the nonzero code words
is 3, so we expected to be able to find a decoding scheme that corrected single errors. Here
this is accomplished because the error patterns of weight 1 are all coset leaders. We cannot
correct all double errors; only one error pattern of weight 2 is a coset leader. All error pat-
terns of weight 1 or 2 would have to be coset leaders before our decoding scheme could
correct both single and double errors in transmission.
Unlike the situation in Example 16.25, where syndromes were also used for decoding,
things here are a bit different. Once we have a complete table listing all of the cosets of C in
Z_2^n, the process of decoding by coset leaders will give us an answer for all received words,
not just for those that are code words or have syndromes that appear among the columns of
the parity-check matrix H. However, we do realize that there is still a problem here because
the last row of our table is not unique. Nonetheless, as our last result will affirm, this method
provides a decoding scheme that is as good as any other.
THEOREM 16.17 When we are decoding by coset leaders, if r ∈ Z_2^n is a received word and r is decoded as
the code word c* (which we then decode to recapture the message), then d(c*, r) ≤ d(c, r)
for all code words c.
Proof: Let x be the coset leader for the coset containing r. Then r = c* + x, or r + c* = x,
so d(c*, r) = wt(r + c*) = wt(x). If c is any code word, then d(c, r) = wt(c + r), and
we have c + r = c + (c* + x) = (c + c*) + x. Since C is a group code, it follows that
c + c* ∈ C and so c + r is in the coset x + C. Among the elements in the coset x + C, the
coset leader x is chosen to have minimum weight, so wt(c + r) ≥ wt(x). Consequently,
d(c*, r) = wt(x) ≤ wt(c + r) = d(c, r).
16.9
Hamming Matrices
We found the parity-check matrix H helpful in correcting single errors in transmission
when (a) H had no column of 0's and (b) no two columns of H were the same. For the
matrix

    [ 1 1 0 1 1 0 0 ]
H = [ 1 0 1 1 0 1 0 ]
    [ 0 1 1 1 0 0 1 ]

we find that H satisfies these two conditions and that for the number of rows (r = 3) in H
we have the maximum number of columns possible. If an additional column is added, H
will no longer be useful for correcting single errors.
The generator matrix G associated with H is

    [ 1 0 0 0 1 1 0 ]
G = [ 0 1 0 0 1 0 1 ]
    [ 0 0 1 0 0 1 1 ]
    [ 0 0 0 1 1 1 1 ]

Consequently we have a (7, 4) group code. The encoding function E: Z_2^4 → Z_2^7 encodes
four-bit messages into seven-bit code words. We realize that because H is determined by
three parity-check equations, we have now maximized the number of bits we can have in
the messages (under our present coding scheme). In addition, the columns of H, read from
top to bottom, are the binary equivalents of the integers from 1 to 7.
In general, if we start with r parity-check equations, then the parity-check matrix H
can have as many as 2^r − 1 columns and still be used to correct single errors. Under these
circumstances H = [B | I_r], where B is an r × (2^r − 1 − r) matrix, and G = [I_m | B^tr] with
m = 2^r − 1 − r. The parity-check matrix H associated with a (2^r − 1, 2^r − 1 − r) group
code in this way is called a Hamming matrix, and the code is referred to as a Hamming
code.
EXAMPLE 16.27
If r = 4, then 2^r − 1 = 15 and 2^r − 1 − r = 11. The one (up to a permutation of the
columns) possible Hamming matrix H for r = 4 is

    [ 1 1 1 1 1 1 1 0 0 0 0 1 0 0 0 ]
H = [ 1 1 1 1 0 0 0 1 1 1 0 0 1 0 0 ]
    [ 1 1 0 0 1 1 0 1 1 0 1 0 0 1 0 ]
    [ 1 0 1 0 1 0 1 1 0 1 1 0 0 0 1 ]

Once again, the columns of H contain the binary equivalents of the integers from 1 to 15
(= 2^4 − 1).
This matrix H is the parity-check matrix of a Hamming (15, 11) code whose rate is
11/15.
With regard to the rate of these Hamming codes, for all r ≥ 2, the rate m/n of such a code
is given by m/n = (2^r − 1 − r)/(2^r − 1) = 1 − [r/(2^r − 1)]. As r increases, r/(2^r − 1)
goes to 0 and the rate approaches 1.
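A Hamming matrix in the form [B | I_r] can be generated for any r. In the Python sketch below the function names hamming_matrix and rate are assumptions made here, and the columns are listed in decreasing order of the integers they represent. That ordering is one convenient choice: it reproduces the matrix of Example 16.27 for r = 4, while for r = 3 it gives a column permutation of the matrix H at the start of this section.

```python
from fractions import Fraction

def hamming_matrix(r):
    """Return H = [B | I_r] as a list of r rows with 2^r - 1 columns.

    Columns are the binary forms of n, n - 1, ..., 1 (n = 2^r - 1); those of
    weight at least 2 form B, and those of weight 1 form the identity I_r.
    """
    n = 2**r - 1
    cols = [[(v >> (r - 1 - i)) & 1 for i in range(r)] for v in range(n, 0, -1)]
    B = [c for c in cols if sum(c) >= 2]
    I = [c for c in cols if sum(c) == 1]
    columns = B + I
    return [[col[i] for col in columns] for i in range(r)]

def rate(r):
    """Rate m/n = (2^r - 1 - r)/(2^r - 1) of the Hamming code with r parity checks."""
    return Fraction(2**r - 1 - r, 2**r - 1)

for row in hamming_matrix(4):
    print(*row)
print(rate(3), rate(4), float(rate(10)))
```

The printed rates begin 4/7 and 11/15, and the value for r = 10 is already above 0.99, in line with the remark that the rate approaches 1.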
We close our discussion on coding theory with one final observation. In Section 16.7 we
presented G (and H) in what is called the systematic form. Other arrangements of the rows
and columns of these matrices are also possible, and these yield equivalent codes. (More
on this can be found in the text by L. L. Dornhoff and F. E. Hohn [4].) We mention this here
because it is often common practice to list the columns in a Hamming matrix of r rows so
that the binary representations of the integers from 1 to 2^r − 1 appear as the columns of H
are read from left to right. For the Hamming (7, 4) code, the matrix H mentioned at the
start of this section would take the (equivalent) form

      [ 0 0 0 1 1 1 1 ]
H_1 = [ 0 1 1 0 0 1 1 ]
      [ 1 0 1 0 1 0 1 ]

Here the identity appears in the first, second, and fourth columns instead of in the last three.
Consequently, we would use these components for the parity checks and find that if we send
the message w = w_1 w_2 w_3 w_4, then the corresponding code word E(w) is c_1 c_2 w_1 c_3 w_2 w_3 w_4,
where
c_1 = w_1 + w_2 + w_4
c_2 = w_1 + w_3 + w_4
c_3 = w_2 + w_3 + w_4,
so that H_1 · (E(w))^tr = 0.
In particular, if we send the message w = w_1 w_2 w_3 w_4 = 1010, the corresponding code
word would be E(w) = c = c_1 c_2 w_1 c_3 w_2 w_3 w_4 = c_1 c_2 1 c_3 0 1 0, where c_1 = w_1 + w_2 +
w_4 = 1 + 0 + 0 = 1, c_2 = w_1 + w_3 + w_4 = 1 + 1 + 0 = 0, and c_3 = w_2 + w_3 + w_4 =
0 + 1 + 0 = 1. Then c = 1011010 and H_1 · (E(w))^tr = H_1 · (E(1010))^tr =
H_1 · (1011010)^tr = 0. (Verify this!) So if c = 1011010 is sent but r = 1001010 is received,
we have H_1 · r^tr = H_1 · (1001010)^tr = (011)^tr. (Verify this as well!) Since 011 is the binary
representation for 3 we know that the error is in position 3, and this time we did not have
to examine the columns of H_1. So using a parity-check matrix of the form H_1 simplifies
syndrome decoding. In general, for c = c_1 c_2 w_1 c_3 w_2 w_3 w_4, let r = c + e, where e is an error
pattern of weight 1. And suppose that the 1 in e is in position i, where 1 ≤ i ≤ 7. Then the
syndrome H_1 · r^tr provides the binary representation for i and we can determine c without
examining the columns of H_1. From the third, fifth, sixth, and seventh components of c we
can then recapture the original message w.
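The two "Verify this!" computations can be carried out with a short Python sketch. The function names encode, error_position, and decode are assumptions made for this illustration; the matrix H_1 and the parity-check equations are those given above.

```python
# Parity-check matrix H_1: column i is the binary representation of i.
H1 = [[0, 0, 0, 1, 1, 1, 1],
      [0, 1, 1, 0, 0, 1, 1],
      [1, 0, 1, 0, 1, 0, 1]]

def encode(w):
    """E(w) = c_1 c_2 w_1 c_3 w_2 w_3 w_4 for a message w of four bits."""
    w1, w2, w3, w4 = w
    c1 = (w1 + w2 + w4) % 2
    c2 = (w1 + w3 + w4) % 2
    c3 = (w2 + w3 + w4) % 2
    return [c1, c2, w1, c3, w2, w3, w4]

def error_position(r):
    """Read the syndrome H_1 . r^tr as a binary number (0 means no error detected)."""
    s = [sum(h * b for h, b in zip(row, r)) % 2 for row in H1]
    return 4 * s[0] + 2 * s[1] + s[2]

def decode(r):
    """Correct at most one error, then return components 3, 5, 6, and 7."""
    i = error_position(r)
    c = list(r)
    if i:
        c[i - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]

c = encode([1, 0, 1, 0])
print(c, error_position(c))                   # [1, 0, 1, 1, 0, 1, 0] 0
print(error_position([1, 0, 0, 1, 0, 1, 0]))  # 3
print(decode([1, 0, 0, 1, 0, 1, 0]))          # [1, 0, 1, 0]
```

No search through the columns of H_1 is needed: the syndrome itself is the position of the single error.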
EXERCISES 16.8 and 16.9

1. Let E: Z_2^8 → Z_2^12 be the encoding function for a code C. How many calculations are
   needed to find the minimum distance between code words? How many calculations
   are needed if E is a group homomorphism?

2. a) Use Table 16.9 to decode the following received words.
      000011   100011   111110   100001
      001100   011110   001111   111100
   b) Do any of the results in part (a) change if a different set of coset leaders is
      used?

3. a) Construct a decoding table (with syndromes) for the group code given by the
      generator matrix

          G = [ 1 0 1 1 0 ]
              [ 0 1 0 1 1 ]

   b) Use the table from part (a) to decode the following received words.
      11110   11101   11011   10100
      10011   10101   11111   01100
   c) Does this code correct single errors in transmission?

4. Let

           [ 1 1 0 1 1 0 0 ]
       H = [ 1 0 1 1 0 1 0 ]
           [ 0 1 1 1 0 0 1 ]

   be the parity-check matrix for a Hamming (7, 4) code.
   a) Encode the following messages:
      1000   1100   1011   1110   1001   1111.
   b) Decode the following received words:
      1100001   1110111   0010001   0011100.
   c) Construct a decoding table consisting of the syndromes and coset leaders for
      this code.
   d) Use the result in part (c) to decode the received words given in part (b).

5. a) What are the dimensions of the generator matrix for the Hamming (63, 57) code?
      What are the dimensions for the associated parity-check matrix H?
   b) What is the rate of this code?

6. Compare the rates of the Hamming (7, 4) code and the (3, 1) triple-repetition code.

7. a) Let p = 0.01 be the probability of incorrect transmission for a binary symmetric
      channel. If the message 1011 is sent via the Hamming (7, 4) code, what is the
      probability of correct decoding?
   b) Answer part (a) for a 20-bit message sent in five blocks of length 4.
16.10
Counting and Equivalence:
Burnside’s Theorem
In this section and the next two we shall develop a counting technique known as Polya’s
Method of Enumeration. Our development will not be very rigorous. Often we shall only
state the general results of the theory as seen in the solution of a specific problem. Our first
encounter with the type of problem to which this counting technique applies is presented
in the following example.
EXAMPLE 16.28
We have a set of sticks, all of the same length and color, and a second set of round plastic
disks. Each disk contains two holes, as shown in Fig. 16.4, into which the sticks can be
inserted in order to form different shapes, such as a square. (See Fig. 16.5.) If each disk is
either red or white, how many distinct squares can we form?
[Figure 16.4: a round disk with two holes.]
[Figure 16.5: the 16 configurations C_1, C_2, ..., C_16 of red and white disks on the square, grouped into the classes cl(1), cl(2), ..., cl(6).]
If the square is considered stationary, then the four disks are located at four distinct
locations; a red or white disk is used at each location. Thus there are 2^4 = 16 different
configurations, as shown in Fig. 16.5, where a dark circle indicates a red disk. The config-
urations have been split into six classes, cl(1), cl(2), ..., cl(6), according to the number
and relative location of the red disks.
Now suppose that the square is not fixed but can be moved about in space. Unless the ver-
tices (disks) are marked somehow, certain configurations in Fig. 16.5 are indistinguishable
when we move them about.
To place these notions in a more mathematical setting, we use the nonabelian group
of three-dimensional rigid motions of a square to define an equivalence relation on the
configurations in Fig. 16.5. Since this group will be used throughout this section and the
next two sections, we now give a detailed description of its elements.
In Fig. 16.6 we have the group G = {π_0, π_1, π_2, π_3, r_1, r_2, r_3, r_4} for the rigid mo-
tions of the square in part (a), where we have labeled the vertices with 1, 2, 3, and 4. Parts (b)
through (i) of the figure show how each element of G is applied. We have expressed each
group element as a permutation of {1, 2, 3, 4} and in a new form called a product of disjoint
cycles. For example, in part (b) we find π_1 = (1234). The cycle (1234) indicates that if we
start with the square in part (a), after applying π_1 we find that 1 has moved to the position
originally occupied by 2, 2 to that of 3, 3 to that of 4, and 4 to that of 1. In general, if xy
appears in a cycle, then x moves to the position originally occupied by y. Also, for a cycle
where x and y appear as (x ... y), y moves to the position originally occupied by x when the
motion described by this cycle is applied. Note that (1234) = (2341) = (3412) = (4123).
We say that each of these cycles has length 4, the number of elements in the cycle. In the
case of r_1 in part (f) of the figure, starting with 1 we find that r_1 sends 1 to 4, so we have
16.10 Counting and Equivalence: Burnside’s Theorem 781
1 2 4 1 3 4
»Y -)
4 3 3 2 2
Starting position of Clockwise rotation Clockwise rotation
the square through 90° through 180°
7 = 334) = (1234) T= 335) = (13)(24)
(a) (b) (c)
2 3 1 2 4 3
1 4 4 3 1 2
Clockwise rotation Clockwise rotation Reflection in the horizontal
through 270° through 360°
m3 = (1234)
4123
= (1432) mq = (1234)
1234 = (1)(2)(3)(4) r, = (1234)
4321 = (14)(23)
(d) (e) op)
2! 1 3 / . 4
| 7 2 TT.
| 7 NX
$ s *
7 XN
| Yo \,
Al N 3
3 4 é 1 2 ‘
Reflection in the vertical Reflection in the diagonal Reflection in the diagonal
through vertices 2 and 4 through vertices 1 and 3
ry == (1234) = (12)(34)
(1234) _ — (1234) _
ly — (5739?
(1234) _
(1}(24)(3)
(g) (h) (i)
Figure 16.6
(14...) as the start of our first cycle in this decomposition of r;. However, here r; sends
4 to 1, so we have completed a portion— namely, (14) — of the complete decomposition.
We then select a vertex that has not yet appeared— for example, vertex 2. Since r; sends
2 to 3 and 3, in turn, to 2, we get a second cycle (23). This exhausts all vertices and so
(14)(23) = r1, where the cycles (14) and (23) have no vertex in common. Here (14)(23) =
(23)(14) = (23)(41) = (32)(41) all provide a representation of r; as a product of disjoint
cycles, each of length 2. Last, for the group element r3 = (13)(2)(4), the cycle (2) indicates
that 2 is fixed, or invariant, under the permutation r3. When the number of vertices involved
is known, the permutation r3 may also be written as r; = (13), where the missing elements
are understood to be fixed. However, we shall write all of the cycles in our decompositions,
for this will be useful later in our discussion.
Before continuing with the main discussion concerning the disks and sticks, let us ex-
amine some further results on disjoint cycles.
In the group $S_6$ of all permutations of {1, 2, 3, 4, 5, 6} let
$$\pi = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 2 & 3 & 1 & 4 & 6 & 5 \end{pmatrix}.$$
As a product of disjoint cycles,
$$\pi = (123)(4)(56) = (56)(4)(123) = (4)(231)(65).$$
If $\sigma \in S_6$, with $\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 2 & 4 & 5 & 1 & 6 & 3 \end{pmatrix}$, then
$$\sigma = (124)(356) = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 2 & 4 & 3 & 1 & 5 & 6 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 1 & 2 & 5 & 4 & 6 & 3 \end{pmatrix},$$
so each cycle can be thought of as an element of $S_6$.
Finally, if $\alpha = (124)(3)(56)$ and $\beta = (13)(245)(6)$ are elements of $S_6$, then
$$\alpha\beta = (124)(3)(56)(13)(245)(6) = (143)(256),$$
whereas
$$\beta\alpha = (13)(245)(6)(124)(3)(56) = (132)(465).$$
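The cycle bookkeeping above can be checked mechanically. The following is a minimal sketch, not part of the original text: the helper names `cycles`, `from_cycles`, and `compose` are illustrative, and permutations are stored as dictionaries on {1, ..., n}. The product follows the convention used here, in which the left factor is applied first.

```python
def cycles(perm):
    """Disjoint-cycle decomposition of a permutation given as a dict x -> perm[x]."""
    seen, result = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = perm[x]
        result.append(tuple(cycle))
    return result


def from_cycles(cycle_list, n):
    """Build the mapping on {1, ..., n} from a list of disjoint cycles."""
    perm = {i: i for i in range(1, n + 1)}
    for cyc in cycle_list:
        for i, x in enumerate(cyc):
            perm[x] = cyc[(i + 1) % len(cyc)]
    return perm


def compose(first, second):
    """The product 'first second': apply `first`, then `second`."""
    return {x: second[first[x]] for x in first}


pi = {1: 2, 2: 3, 3: 1, 4: 4, 5: 6, 6: 5}
alpha = from_cycles([(1, 2, 4), (3,), (5, 6)], 6)
beta = from_cycles([(1, 3), (2, 4, 5), (6,)], 6)

print(cycles(pi))                     # [(1, 2, 3), (4,), (5, 6)]
print(cycles(compose(alpha, beta)))   # [(1, 4, 3), (2, 5, 6)]
print(cycles(compose(beta, alpha)))   # [(1, 3, 2), (4, 6, 5)]
```

Since the two products differ, the sketch also illustrates that multiplication in $S_6$ is not commutative.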
Returning to the 16 configurations, or colorings, in Fig. 16.5, we now examine how
each element in the group G, in Fig. 16.6, acts upon these configurations. For example,
$\pi_1 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 4 & 1 \end{pmatrix}$ permutes the numbers {1, 2, 3, 4} according to a 90° clockwise
rotation for the square in Fig. 16.6(a), yielding the result in Fig. 16.6(b). How does such
a rotation act on $S = \{C_1, C_2, \ldots, C_{16}\}$, our set of colorings? We use $\pi_1^*$ to distinguish
between the 90° clockwise rotation for {1, 2, 3, 4} and the same rotation when applied to
$S = \{C_1, C_2, \ldots, C_{16}\}$. We find that
$$\pi_1^* = \begin{pmatrix} C_1 & C_2 & C_3 & C_4 & C_5 & C_6 & C_7 & C_8 & C_9 & C_{10} & C_{11} & C_{12} & C_{13} & C_{14} & C_{15} & C_{16} \\ C_1 & C_3 & C_4 & C_5 & C_2 & C_7 & C_8 & C_9 & C_6 & C_{11} & C_{10} & C_{13} & C_{14} & C_{15} & C_{12} & C_{16} \end{pmatrix}.$$
As a product of disjoint cycles,
$$\pi_1^* = (C_1)(C_2C_3C_4C_5)(C_6C_7C_8C_9)(C_{10}C_{11})(C_{12}C_{13}C_{14}C_{15})(C_{16}).$$
We note that under the action of $\pi_1^*$, no configuration is changed into one that is in another
class.
As a second example, consider the reflection $r_3$ in Fig. 16.6(h). The action of this rigid
motion on S is given by
$$r_3^* = \begin{pmatrix} C_1 & C_2 & C_3 & C_4 & C_5 & C_6 & C_7 & C_8 & C_9 & C_{10} & C_{11} & C_{12} & C_{13} & C_{14} & C_{15} & C_{16} \\ C_1 & C_2 & C_5 & C_4 & C_3 & C_7 & C_6 & C_9 & C_8 & C_{10} & C_{11} & C_{14} & C_{13} & C_{12} & C_{15} & C_{16} \end{pmatrix}$$
$$= (C_1)(C_2)(C_3C_5)(C_4)(C_6C_7)(C_8C_9)(C_{10})(C_{11})(C_{12}C_{14})(C_{13})(C_{15})(C_{16}).$$
Once again, no configuration is taken by $r_3^*$ into one that is outside the class that it was in
originally.
Using the idea of the group G acting on the set S, we define a relation $\mathcal{R}$ on S as follows.
For colorings $C_i, C_j \in S$, where $1 \le i, j \le 16$, we write $C_i \mathrel{\mathcal{R}} C_j$ if there is a permutation
$\sigma \in G$ such that $\sigma^*(C_i) = C_j$. That is, as $\sigma^*$ acts on the 16 configurations in S, $C_i$ is
transformed into $C_j$. This relation $\mathcal{R}$ is an equivalence relation, as we now verify.
a) (Reflexive Property) For all $C_i \in S$, where $1 \le i \le 16$, it follows that $C_i \mathrel{\mathcal{R}} C_i$ because
G contains the identity permutation. [$\pi_0^*(C_i) = C_i$ for all $1 \le i \le 16$.]
b) (Symmetric Property) If $C_i \mathrel{\mathcal{R}} C_j$ for $C_i, C_j \in S$, then $\sigma^*(C_i) = C_j$, for some $\sigma \in G$.
G is a group, so $\sigma^{-1} \in G$, and we find that $(\sigma^*)^{-1} = (\sigma^{-1})^*$. (Verify this for two
choices of $\sigma \in G$.) Hence $C_i = (\sigma^{-1})^*(C_j)$, and $C_j \mathrel{\mathcal{R}} C_i$.
c) (Transitive Property) Let $C_i, C_j, C_k \in S$ with $C_i \mathrel{\mathcal{R}} C_j$ and $C_j \mathrel{\mathcal{R}} C_k$. Then $C_j =
\sigma^*(C_i)$ and $C_k = \tau^*(C_j)$, for some $\sigma, \tau \in G$. By closure in G, $\sigma\tau \in G$, and we find
that $(\sigma\tau)^* = \sigma^*\tau^*$, where $\sigma$ is applied first in $\sigma\tau$ and $\sigma^*$ first in $\sigma^*\tau^*$. (Verify this for
two specific permutations $\sigma, \tau \in G$.) Then $C_k = (\sigma\tau)^*(C_i)$ and $\mathcal{R}$ is transitive. [The
reader may have noticed that $C_k = \tau^*(C_j) = \tau^*(\sigma^*(C_i))$ and felt that we should have
written $(\sigma\tau)^* = \tau^*\sigma^*$. Once again, there has been a change in the notation for the
composite function as we first defined it in Chapter 5. Here we write $\sigma^*\tau^*$ for $(\sigma\tau)^*$,
and $\sigma^*$ is applied first.]
Since $\mathcal{R}$ is an equivalence relation on S, $\mathcal{R}$ partitions S into equivalence classes, which
are precisely the classes $c\ell(1), c\ell(2), \ldots, c\ell(6)$ of Fig. 16.5. Consequently, there are six
nonequivalent configurations under the group action. So among the original 16 colorings
only 6 are really distinct.
What has happened in this example generalizes as follows. With S a set of configurations,
let G be a group (of permutations) that acts on S. If the relation $\mathcal{R}$ is defined on S by $x \mathrel{\mathcal{R}} y$
if $\pi^*(x) = y$, for some $\pi \in G$, then $\mathcal{R}$ is an equivalence relation.
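The partition into equivalence classes can be reproduced by brute force. The sketch below is an illustration under stated assumptions rather than the text's own procedure: a coloring is stored as a tuple of four letters (the colors at positions 1 through 4), the dictionary `G` copies the two-row forms of Fig. 16.6, and the helper `act` moves the disk at position x to position `perm[x]`. The numbering $C_1, \ldots, C_{16}$ of Fig. 16.5 is not reproduced; only the classes are.

```python
from itertools import product

# Two-row forms from Fig. 16.6: the vertex at position x moves to position G[name][x].
G = {
    "pi0": {1: 1, 2: 2, 3: 3, 4: 4},
    "pi1": {1: 2, 2: 3, 3: 4, 4: 1},
    "pi2": {1: 3, 2: 4, 3: 1, 4: 2},
    "pi3": {1: 4, 2: 1, 3: 2, 4: 3},
    "r1": {1: 4, 2: 3, 3: 2, 4: 1},
    "r2": {1: 2, 2: 1, 3: 4, 4: 3},
    "r3": {1: 3, 2: 2, 3: 1, 4: 4},
    "r4": {1: 1, 2: 4, 3: 3, 4: 2},
}


def act(perm, coloring):
    """Induced action on a coloring: the disk at position x ends up at perm[x]."""
    new = [None] * 4
    for x in range(1, 5):
        new[perm[x] - 1] = coloring[x - 1]
    return tuple(new)


S = list(product("rw", repeat=4))  # the 16 colorings (r = red, w = white)

# The class of a coloring c is the set of all images of c under the group.
classes = {frozenset(act(p, c) for p in G.values()) for c in S}

print(len(S), len(classes))              # 16 6
print(sorted(len(c) for c in classes))   # [1, 1, 2, 4, 4, 4]
```

The class sizes 1, 4, 4, 2, 4, 1 agree with the six classes of Fig. 16.5.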
With only red and white disks to connect the sticks, the answer to this example could
have been determined from the results in Fig. 16.5. However, we developed quite a bit of
mathematical overkill to answer the question. Referring to S as the set of 2-colorings of
the vertices of a square, we start to wonder about the role of 2 and seek the number of
nonequivalent configurations if the disks come in three or more colors.
In addition, we might notice that the function $f(r, w) = r^4 + r^3w + 2r^2w^2 + rw^3 + w^4$
is the generating function (of two variables) for the number of nonequivalent configurations
from S. Here the coefficient of $r^i w^{4-i}$, for $0 \le i \le 4$, yields the number of distinct
2-colorings that have $i$ red disks and $(4 - i)$ white ones. The coefficient of $r^2w^2$ is 2
because of the two equivalence classes $c\ell(3)$ and $c\ell(4)$. Finally, $f(1, 1) = 6$, the number of
equivalence classes. This generating function $f(r, w)$ is called the pattern inventory for the
configurations. We shall examine it in more detail in the next two sections.
For now we record an extended version of our present results in the following theorem.
(A proof of this result is given on pages 136-137 of C. L. Liu [17].)
THEOREM 16.18 Burnside’s Theorem. Let S be a set of configurations on which a finite group G of
permutations acts. The number of equivalence classes into which S is partitioned by the action of
G is then given by
$$\frac{1}{|G|} \sum_{\pi \in G} \psi(\pi^*),$$
where $\psi(\pi^*)$ is the number of configurations in S fixed under $\pi^*$.
To better accept the validity of this theorem, we first examine two examples where we
already know the answers.
EXAMPLE 16.29
In Example 16.28 we find that $\psi(\pi_1^*) = 2$ because only $C_1$ and $C_{16}$ are fixed, or invariant,
under $\pi_1^*$. For $r_3 \in G$, however, $\psi(r_3^*) = 8$ because $C_1$, $C_2$, $C_4$, $C_{10}$, $C_{11}$, $C_{13}$, $C_{15}$, and $C_{16}$
remain fixed under this group action. In like manner $\psi(\pi_2^*) = 4$, $\psi(\pi_3^*) = 2$, $\psi(\pi_0^*) = 16$,
$\psi(r_1^*) = \psi(r_2^*) = 4$, and $\psi(r_4^*) = 8$. With $|G| = 8$, Burnside’s Theorem implies that the
number of equivalence classes, or nonequivalent configurations, is
$$(1/8)(16 + 2 + 4 + 2 + 4 + 4 + 8 + 8) = (1/8)(48) = 6,$$
the original answer.
EXAMPLE 16.30
In how many ways can six people be arranged around a circular table if two arrangements
are considered equivalent when one can be obtained from the other by means of a clockwise
rotation through $i \cdot 60°$, for $0 \le i \le 5$?
Here the six distinct people are to be placed in six chairs located at a table, as shown in
Fig. 16.7. Our permutation group G consists of the clockwise rotations $\pi_i$ through $i \cdot 60°$,
where $0 \le i \le 5$. Here reflections are not meaningful. The situation is two-dimensional, for
we can rotate the circle (representing the table) only in the plane; the circle never lifts off
the plane. The total number of possible configurations is 6!. We find that $\psi(\pi_0^*) = 6!$ and
that $\psi(\pi_i^*) = 0$, for $1 \le i \le 5$. (It’s impossible to move different people and simultaneously
have them stay in a fixed location.)
Figure 16.7 (six chairs, numbered 1 through 6, around a circular table)
Consequently, the total number of nonequivalent seating arrangements is
$$\frac{1}{|G|} \sum_{\pi \in G} \psi(\pi^*) = \frac{1}{6}(6! + 0 + 0 + 0 + 0 + 0) = 5!,$$
as we found in Example 1.16 of Chapter 1.
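The count for the circular table is small enough to confirm by enumeration. In this sketch (an illustration only; the names `people`, `rotate`, and `fixed` are assumptions, and chair k of the table is taken to be index k of a tuple), each seating is a tuple listing who sits in each chair, and a rotation through $i \cdot 60°$ shifts everyone $i$ chairs.

```python
from itertools import permutations
from math import factorial


def rotate(seating, i):
    """Shift every person i chairs around the table."""
    n = len(seating)
    return tuple(seating[(k - i) % n] for k in range(n))


people = "ABCDEF"
S = list(permutations(people))  # all 6! seatings

# psi for each rotation: the number of seatings it leaves unchanged.
fixed = [sum(1 for s in S if rotate(s, i) == s) for i in range(6)]
print(fixed)                              # [720, 0, 0, 0, 0, 0]
print(sum(fixed) // 6 == factorial(5))    # True

# Direct check: collect the equivalence classes themselves.
classes = {frozenset(rotate(s, i) for i in range(6)) for s in S}
print(len(classes))                       # 120
```

Both routes, the averaged fixed-point count and the explicit list of classes, give 5! = 120.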
We now examine a situation where the power of this theorem is made apparent.
EXAMPLE 16.31
In how many ways can the vertices of a square be 3-colored, if the square can be moved
about in three dimensions?
Now we have the sticks of Example 16.28, along with red, white, and blue disks.
Considering the group in Fig. 16.6, we find the following:
$\psi(\pi_0^*) = 3^4$, because the identity fixes all 81 configurations in the set S of possible
configurations.
$\psi(\pi_1^*) = \psi(\pi_3^*) = 3$, for each of $\pi_1^*$, $\pi_3^*$ leaves invariant only those configurations with
all vertices the same color.
$\psi(\pi_2^*) = 9$, for $\pi_2^*$ can fix only those configurations where the opposite (diagonally)
vertices have the same color. Consider a square like the one shown in Fig. 16.8. There
are three choices for placing a colored disk at vertex 1 and then one choice for matching
it at vertex 3. Likewise, there are three choices for colors at vertex 2 and then one for
vertex 4. Consequently, there are nine configurations invariant under $\pi_2^*$.
$\psi(r_1^*) = \psi(r_2^*) = 9$. In the case of $r_1^*$, for the square shown in Fig. 16.8 we have three
choices for coloring each of the vertices 1 and 2, and then we must match the color of
vertex 4 with the color of vertex 1, and the color of vertex 3 with that of vertex 2.
Finally, $\psi(r_3^*) = \psi(r_4^*) = 27$. For $r_3^*$, we have nine choices for coloring the two vertices
at 2 and 4, and three choices for vertex 1. Then there is only one choice for vertex 3
because we must match the color of vertex 1.
Figure 16.8 (a square with vertices 1, 2 across the top and 4, 3 across the bottom)
By Burnside’s Theorem, the number of nonequivalent configurations is
$$(1/8)(3^4 + 3 + 3^2 + 3 + 3^2 + 3^2 + 3^3 + 3^3) = 21.$$
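The values of $\psi$ used in Examples 16.29 and 16.31 can be recomputed directly from the definition. The sketch below is illustrative: `ROWS` copies the bottom rows of the two-row forms in Fig. 16.6, and `act`, `psi`, and `burnside` are hypothetical helper names. A coloring is counted as fixed when applying the motion returns the same tuple.

```python
from itertools import product

# Bottom rows of the two-row forms in Fig. 16.6: vertex x moves to ROWS[name][x - 1].
ROWS = {
    "pi0": (1, 2, 3, 4), "pi1": (2, 3, 4, 1), "pi2": (3, 4, 1, 2), "pi3": (4, 1, 2, 3),
    "r1": (4, 3, 2, 1), "r2": (2, 1, 4, 3), "r3": (3, 2, 1, 4), "r4": (1, 4, 3, 2),
}


def act(row, coloring):
    """Move the color at position x to position row[x - 1]."""
    new = [None] * 4
    for x, image in enumerate(row, start=1):
        new[image - 1] = coloring[x - 1]
    return tuple(new)


def psi(row, colors):
    """Number of colorings of the four vertices left unchanged by the motion."""
    return sum(1 for c in product(colors, repeat=4) if act(row, c) == c)


def burnside(colors):
    """Average the fixed-point counts over the group (Theorem 16.18)."""
    return sum(psi(row, colors) for row in ROWS.values()) // len(ROWS)


print({name: psi(row, "rw") for name, row in ROWS.items()})
print({name: psi(row, "rwb") for name, row in ROWS.items()})
print(burnside("rw"), burnside("rwb"))   # 6 21
```

Passing a longer string of colors to `burnside` gives the corresponding count for four or more colors without any further case analysis.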
EXERCISES 16.10
1. Consider the configurations shown in Fig. 16.5.
a) Determine 73*, 773°, r*, and r>.
b) Verify that (7')* = (jt)! and (ry ')* = OF)!
c) Verify that (yry)* = aftr and (rarq)* = ayrf.
2. Express each of the following elements of $S_7$ as a product of disjoint cycles.
$\alpha = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 2 & 4 & 6 & 7 & 1 & 5 & 3 \end{pmatrix}$
$\beta = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 3 & 6 & 5 & 2 & 1 & 7 & 4 \end{pmatrix}$
$\gamma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 2 & 3 & 1 & 7 & 5 & 4 & 6 \end{pmatrix}$
$\delta = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 4 & 2 & 7 & 1 & 3 & 6 & 5 \end{pmatrix}$
3. a) Determine the order of each of the elements in Exercise 2.
b) State a general result about the order of an element in $S_n$ in terms of the lengths of
the cycles in its decomposition as a product of disjoint cycles.
4. a) Determine the number of distinct ways one can color the vertices of an equilateral
triangle using the colors red and white, if the triangle is free to move in three dimensions.
b) Answer part (a) if the color blue is also available.
5. Answer the questions in Exercise 4 for a regular pentagon.
6. a) How many distinct ways are there to paint the edges of a square with three different colors?
b) Answer part (a) for the edges of a regular pentagon.
7. We make a child’s bracelet by symmetrically placing four beads about a circular wire. The
colors of the beads are red, white, blue, and green, and there are at least four beads of each
color. (a) How many distinct bracelets can we make in this way, if the bracelets can be rotated
but not reflected? (b) Answer part (a) if the bracelets can be rotated and reflected.
8. A baton is painted with three cylindrical bands of color (not necessarily distinct), with
each band of the same length.
a) How many distinct paintings can be made if there are three colors of paint available?
How many for four colors?
b) Answer part (a) for batons with four cylindrical bands.
c) Answer part (a) for batons with n cylindrical bands.
d) Answer parts (a) and (b) if adjacent cylindrical bands are to have different colors.
9. In how many ways can we 2-color the vertices of the configurations shown in Fig. 16.9 if
they are free to move in (a) two dimensions? (b) three dimensions?
Figure 16.9
10. A pyramid has a square base and four faces that are equilateral triangles. If we can move
the pyramid about (in three dimensions), how many nonequivalent ways are there to paint
its five faces if we have paint of four different colors? How many if the color of the base
must be different from the color(s) of the triangular faces?
11. a) In how many ways can we paint the cells of a 3 × 3 chessboard using red and blue
paint? (The back of the chessboard is black.)
b) In how many ways can we construct a 3 × 3 chessboard by joining (with paste) the edges
of nine 1 × 1 plastic squares that are transparent and tinted red or blue? (There are nine
squares of each color available.)
12. Answer Exercise 11 for a 4 × 4 chessboard. [Replace each “nine” in part (b) with “sixteen.”]
13. In how many ways can we paint the seven (identical) horses on a carousel using black,
brown, and white paint?
14. a) Let S be a set of configurations and G a group of permutations that acts on S. If
$x \in S$, prove that $\{\pi \in G \mid \pi^*(x) = x\}$ is a subgroup of G (called the stabilizer of x).
b) Determine the respective stabilizer subgroups in part (a) for each of the configurations
C7 and Cs in Fig. 16.5.
16.11
The Cycle Index
In applying Burnside’s Theorem we have been faced with computing $\psi(\pi^*)$ for each
$\pi \in G$, where G is a permutation group acting on a set S of configurations. As the number
of available colors increases and the configurations get more complex, such computations
can get a bit involved. In addition, it seems that if we can determine the number of
2-colorings for a set S of configurations, we should be able to use some of the work in this
case to determine the number of 3-colorings, 4-colorings, and so on. We shall now find
some assistance as we return to the solution of Example 16.28. This time more attention
will be paid to the representation of each permutation $\pi \in G$ as a product of disjoint cycles.
Our results are summarized in Table 16.10.
Table 16.10

| Rigid Motions $\pi$ (Elements of G) | Configurations in S that Are Invariant under $\pi^*$ | Cycle Structure Representation of $\pi$ | Inventory of Configurations that Are Invariant under $\pi^*$ |
|---|---|---|---|
| $\pi_0 = (1)(2)(3)(4)$ | $2^4$: All configurations in S | $x_1^4$ | $(r + w)^4 = r^4 + 4r^3w + 6r^2w^2 + 4rw^3 + w^4$ |
| $\pi_1 = (1234)$ | $2$: $C_1, C_{16}$ | $x_4$ | $r^4 + w^4$ |
| $\pi_2 = (13)(24)$ | $2^2$: $C_1, C_{10}, C_{11}, C_{16}$ | $x_2^2$ | $(r^2 + w^2)^2 = r^4 + 2r^2w^2 + w^4$ |
| $\pi_3 = (1432)$ | $2$: $C_1, C_{16}$ | $x_4$ | $r^4 + w^4$ |
| $r_1 = (14)(23)$ | $2^2$: $C_1, C_7, C_9, C_{16}$ | $x_2^2$ | $(r^2 + w^2)^2 = r^4 + 2r^2w^2 + w^4$ |
| $r_2 = (12)(34)$ | $2^2$: $C_1, C_6, C_8, C_{16}$ | $x_2^2$ | $(r^2 + w^2)^2 = r^4 + 2r^2w^2 + w^4$ |
| $r_3 = (13)(2)(4)$ | $2^3$: $C_1, C_2, C_4, C_{10}, C_{11}, C_{13}, C_{15}, C_{16}$ | $x_2x_1^2$ | $(r^2 + w^2)(r + w)^2 = r^4 + 2r^3w + 2r^2w^2 + 2rw^3 + w^4$ |
| $r_4 = (1)(24)(3)$ | $2^3$: $C_1, C_3, C_5, C_{10}, C_{11}, C_{12}, C_{14}, C_{16}$ | $x_2x_1^2$ | $(r^2 + w^2)(r + w)^2 = r^4 + 2r^3w + 2r^2w^2 + 2rw^3 + w^4$ |

$P_G(x_1, x_2, x_3, x_4) = \frac{1}{8}(x_1^4 + 2x_4 + 3x_2^2 + 2x_2x_1^2)$
Complete Inventory $= 8r^4 + 8r^3w + 16r^2w^2 + 8rw^3 + 8w^4$
For $\pi_0$, the identity of G, we write $\pi_0 = (1)(2)(3)(4)$, a product of four disjoint cycles.
We shall represent this cycle structure algebraically by $x_1^4$, where $x_1$ indicates a cycle of
length 1. The term $x_1^4$ is called the cycle structure representation of $\pi_0$. Here we interpret
“disjoint” as “independent,” in the sense that whatever color is used to paint the vertices in
one cycle has no bearing on the choice of color for the vertices in another cycle. As long
as all the vertices in a given cycle have the same color, we shall find configurations that
are invariant under $\pi_0^*$. (Admittedly, this seems like mathematical overkill again, inasmuch
as $\pi_0^*$ fixes all 2-colorings of the square.) In addition, since we can paint the vertices in
each cycle either red or white, we have $2^4$ configurations, and we find that $(r + w)^4 =
r^4 + 4r^3w + 6r^2w^2 + 4rw^3 + w^4$ generates these 16 configurations. For example, from
the term $6r^2w^2$ we find that there are six configurations with two red and two white vertices,
as found in classes $c\ell(3)$ and $c\ell(4)$ of Fig. 16.5.
Turning to $\pi_1$, we find $\pi_1 = (1234)$, a cycle of length 4. This cycle structure is represented
by $x_4$, and here there are only two invariant configurations. The fact that the cycle structure
for $\pi_1$ has only one cycle tells us that for a configuration to be invariant under $\pi_1^*$, every
vertex in this cycle must be painted the same color. With two colors to choose from, there
are only two possible configurations, $C_1$ and $C_{16}$. In this case the term $r^4 + w^4$ generates
these configurations.
Continuing with $r_1$, we have $r_1 = (14)(23)$, a product of two disjoint cycles of length
2; the term $x_2^2$ represents this cycle structure. For a configuration to be invariant under $r_1^*$,
the vertices at 2 and 3 must be the same color; that is, we have two choices for coloring the
vertices in (23). We also have two choices for coloring the vertices in (14). Consequently, we
get $2^2$ invariant configurations: $C_1$ ($r^4$), $C_7$ ($r^2w^2$), $C_9$ ($r^2w^2$), and $C_{16}$ ($w^4$). [$(r^2 + w^2)^2 =
r^4 + 2r^2w^2 + w^4$.]
Finally, in the case of $r_3 = (13)(2)(4)$, we find that $x_2x_1^2$ indicates its decomposition into
one cycle of length 2 and two of length 1. The vertices at 1 and 3 must be painted the same
color if the configuration is to be invariant under $r_3^*$. With three cycles and two choices
of color for each cycle, we find $2^3$ invariant configurations. They are $C_1$ ($r^4$), $C_2$ ($r^3w$),
$C_4$ ($r^3w$), $C_{10}$ ($r^2w^2$), $C_{11}$ ($r^2w^2$), $C_{13}$ ($rw^3$), $C_{15}$ ($rw^3$), and $C_{16}$ ($w^4$). These configurations
are generated by $(r^2 + w^2)(r + w)^2$, for when we consider the cycle (13) we have two
choices: both vertices red ($r^2$) or both vertices white ($w^2$). This gives us $r^2 + w^2$. For
each single vertex in the two cycles of length 1, $r + w$ provides the choices for each cycle,
$(r + w)^2$ the choices for the two. By the independence of choice of colors as we go from
one cycle to another, $(r^2 + w^2)(r + w)^2$ generates the $2^3$ configurations that are invariant
under $r_3^*$.
Similar arguments provide the information in Table 16.10 for the permutations $\pi_2$, $\pi_3$,
$r_2$, and $r_4$.
At this point we see that the number of configurations that are invariant under $\pi^*$,
for $\pi \in G$, is determined by the cycle structure of $\pi$. Within each cycle the same color
must be used, but that color can be selected from the two or more choices made available.
For $r_1$, we had two cycles (of length 2) and $2^2$ configurations. If three colors had been
available, the number of invariant configurations would have been $3^2$. For m colors, the
number is $m^2$. Adding these terms for all the cycle structures that arise gives $\sum_{\pi \in G} \psi(\pi^*)$.
We now wish to place more emphasis on cycle structures, so we define the cycle index,
$P_G$, for the group G (of permutations) as
$$P_G(x_1, x_2, x_3, x_4) = \frac{1}{|G|} \sum_{\pi \in G} (\text{cycle structure representation of } \pi).$$
In this example,
$$P_G(x_1, x_2, x_3, x_4) = (1/8)(x_1^4 + 2x_4 + 3x_2^2 + 2x_2x_1^2).$$
When each occurrence of $x_1, x_2, x_3, x_4$ is replaced by 2, we find that the number of
nonequivalent 2-colorings is equal to
$$P_G(2, 2, 2, 2) = (1/8)(2^4 + 2(2) + 3(2^2) + 2(2)(2^2)) = 6.$$
We summarize our present findings in the following result.
THEOREM 16.19 Let S be a set of configurations that are acted upon by a permutation group G. [G is a
subgroup of $S_n$, the group of all permutations of {1, 2, 3, ..., n}, and the cycle index
$P_G(x_1, x_2, x_3, \ldots, x_n)$ of G is
$$\frac{1}{|G|} \sum_{\pi \in G} (\text{cycle structure representation of } \pi).]$$
The number of nonequivalent m-colorings of S is then $P_G(m, m, m, \ldots, m)$.
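Theorem 16.19 reduces the count to bookkeeping on cycle lengths, which the following sketch carries out for the square. It is an illustration under assumptions: each motion is given by the bottom row of its two-row form from Fig. 16.6, a cycle structure such as $x_2x_1^2$ is stored as the sorted tuple of cycle lengths `(1, 1, 2)`, and the names `cycle_lengths`, `cycle_index`, and `count_colorings` are not from the text.

```python
from collections import Counter
from fractions import Fraction


def cycle_lengths(row):
    """Sorted cycle lengths of the permutation x -> row[x - 1] on {1, ..., n}."""
    seen, lengths = set(), []
    for start in range(1, len(row) + 1):
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = row[x - 1]
            length += 1
        if length:
            lengths.append(length)
    return tuple(sorted(lengths))


def cycle_index(rows):
    """Map each cycle structure to its coefficient in P_G."""
    counts = Counter(cycle_lengths(row) for row in rows)
    return {structure: Fraction(k, len(rows)) for structure, k in counts.items()}


def count_colorings(rows, m):
    """Evaluate P_G(m, m, ..., m): every cycle contributes one factor of m."""
    return sum(coef * m ** len(s) for s, coef in cycle_index(rows).items())


SQUARE = [(1, 2, 3, 4), (2, 3, 4, 1), (3, 4, 1, 2), (4, 1, 2, 3),
          (4, 3, 2, 1), (2, 1, 4, 3), (3, 2, 1, 4), (1, 4, 3, 2)]

print(cycle_index(SQUARE))
print(count_colorings(SQUARE, 2), count_colorings(SQUARE, 3))   # 6 21
```

The dictionary returned by `cycle_index` has coefficients 1/8, 1/4, 3/8, and 1/4 for the structures $x_1^4$, $x_4$, $x_2^2$, and $x_2x_1^2$, which is the cycle index given above with the fractions reduced.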
We close this section with an example that uses this theorem.
EXAMPLE 16.32
In how many distinct ways can we 4-color the vertices of a regular hexagon that is free to
move in space?
For a regular hexagon there are twelve rigid motions: (a) the six clockwise rotations
through 0°, 60°, 120°, 180°, 240°, and 300°; (b) the three reflections in diagonals through
opposite vertices; and (c) the three reflections about lines passing through the midpoints of
opposite edges.
(1) (1)(2)(3)(4)(5)(6): $x_1^6$
(2) (123456): $x_6$
(3) (135)(246): $x_3^2$
(4) (14)(25)(36): $x_2^3$
(5) (153)(264): $x_3^2$
(6) (165432): $x_6$
(7) (1)(26)(35)(4): $x_1^2x_2^2$
(8) (13)(46)(2)(5): $x_1^2x_2^2$
(9) (15)(24)(3)(6): $x_1^2x_2^2$
(10) (12)(36)(45): $x_2^3$
(11) (14)(23)(56): $x_2^3$
(12) (16)(25)(34): $x_2^3$
Figure 16.10 (a regular hexagon with vertices labeled 1 through 6, and its twelve rigid motions)
In Fig. 16.10 we have listed each group element as a product of disjoint cycles, together
with its cycle structure representation. Here
$$P_G(x_1, x_2, x_3, x_4, x_5, x_6) = (1/12)(x_1^6 + 2x_6 + 2x_3^2 + 4x_2^3 + 3x_1^2x_2^2),$$
and there are
$$P_G(4, 4, 4, 4, 4, 4) = (1/12)(4^6 + 2(4) + 2(4^2) + 4(4^3) + 3(4^2)(4^2)) = 430$$
nonequivalent 4-colorings of a regular hexagon. (Note: Even though neither $x_4$ nor $x_5$ occurs
in a cycle structure representation, we may list these variables among the arguments of $P_G$.)
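The twelve motions of Fig. 16.10 need not be typed in by hand. In the sketch below (an illustration, with the vertices relabeled 0 through 5 as an assumption so that modular arithmetic can be used), the rotations are the maps $k \mapsto k + s$ and the reflections the maps $k \mapsto s - k$, both taken modulo 6; `cycle_type` and `count_colorings` are hypothetical helper names.

```python
from collections import Counter


def cycle_type(perm):
    """Sorted cycle lengths of the permutation k -> perm[k] on {0, ..., n-1}."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = perm[x]
            length += 1
        if length:
            lengths.append(length)
    return tuple(sorted(lengths))


n = 6
rotations = [tuple((k + s) % n for k in range(n)) for s in range(n)]
reflections = [tuple((s - k) % n for k in range(n)) for s in range(n)]
motions = rotations + reflections

# How many motions have each cycle structure, e.g. (2, 2, 2) for x_2^3.
types = Counter(cycle_type(p) for p in motions)
print(types)


def count_colorings(m):
    """P_G(m, ..., m): each cycle of a motion contributes one factor of m."""
    return sum(count * m ** len(t) for t, count in types.items()) // len(motions)


print(count_colorings(4))   # 430
```

The counter reports one motion of type $x_1^6$, two each of types $x_6$ and $x_3^2$, four of type $x_2^3$, and three of type $x_1^2x_2^2$, matching the coefficients of the cycle index above.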
EXERCISES 16.11
1. In how many ways can we 5-color the vertices of a square that is free to move in
(a) two dimensions? (b) three dimensions?
2. Answer Exercise 1 for a regular pentagon.
3. Find the number of nonequivalent 4-colorings of the vertices in the configurations
shown in Fig. 16.11 when they are free to move in (a) two dimensions; (b) three dimensions.
Figure 16.11
4. a) In how many ways can we 3-color the vertices of a regular hexagon that is free to
move in space?
b) Give a combinatorial argument to show that for all $m \in \mathbf{Z}^+$,
$(m^6 + 2m + 2m^2 + 4m^3 + 3m^4)$ is divisible by 12.
5. a) In how many ways can we 5-color the vertices of a regular hexagon that is free to
move in two dimensions?
b) Answer part (a) if the hexagon is free to move in three dimensions.
c) Find two 5-colorings that are equivalent for case (b) but distinct for case (a).
6. In how many distinct ways can we 3-color the edges in the configurations shown in
Fig. 16.11 if they are free to move in (a) two dimensions; (b) three dimensions?
7. a) In how many distinct ways can we 3-color the edges of a square that is free to move
in three dimensions?
b) In how many distinct ways can we 3-color both the vertices and the edges of such a square?
c) For a square that can move in three dimensions, let k, m, and n denote the number of
distinct ways in which we can 3-color its vertices (alone), its edges (alone), and both
its vertices and edges, respectively. Does n = km? (Give a geometric explanation.)
16.12
The Pattern Inventory:
Polya’s Method of Enumeration
In this final section we return to Example 16.28 and its continued analysis in Section 16.11.
At this time we introduce the pattern inventory and how it is derived from the cycle index.
For $\pi_0 \in G$, every configuration in S is invariant. The cycle structure (representation)
for $\pi_0$ is given by $x_1^4$, where for each cycle of length 1 we have a choice of coloring the
vertex in that cycle red (r) or white (w). Using + to represent exclusive or, we write $r + w$
to denote the two choices for that vertex (cycle of length 1). With four such cycles, $(r + w)^4$
generates the patterns of the 16 configurations.
In the case of $\pi_1 = (1234)$, $x_4$ denotes the cycle structure, and here all four vertices must
be the same color for the configuration to remain fixed under $\pi_1^*$. Consequently, we have
all four vertices red or all four vertices white, and we express this algebraically by $r^4 + w^4$.
At this point we notice that for each of the permutations we have considered, the number
of factors in the expression used to generate the patterns fixed under a certain permutation
equals the number of factors in the cycle structure (representation) of that permutation. Is
this just a coincidence?
Continue now with $r_1 = (14)(23)$, whose cycle structure is $x_2^2$. For the cycle (14) we
must color both of the vertices 1 and 4 either red or white. These choices are represented by
$r^2 + w^2$. Since there are two such cycles of length 2, we find that $(r^2 + w^2)^2$ will generate
the patterns of the configurations in S fixed under $r_1^*$. Once again the number of factors in
the cycle structure equals the number of factors in the corresponding term used to generate
the patterns.
Last, for $r_3 = (13)(2)(4)$, the cycle structure is $x_2x_1^2 = x_1^2x_2$. For each of the cycles (2)
and (4), $r + w$ represents the choices for each of these vertices, so that $(r + w)^2$ accounts
for all four colorings of the pair. The cycle (13) indicates that vertices 1 and 3 must have
the same color; $r^2 + w^2$ accounts for the two possibilities. Therefore, $(r + w)^2(r^2 + w^2)$
generates the patterns of the configurations in S fixed under $r_3^*$, and we find three factors in
both the cycle structure and the product $(r + w)^2(r^2 + w^2)$. But even more comes to light
here.
Looking at the terms in the cycle structures, we see that, for $1 \le i \le n$, the factor $x_i$ in
the cycle structure corresponds with the term $r^i + w^i$ in the expression used to generate the
patterns.
Continuing with the cycle structures for $\pi_2$, $\pi_3$, $r_2$, and $r_4$, we find that the pattern
inventory can be obtained by replacing each $x_i$ in $P_G(x_1, x_2, x_3, x_4)$ with $r^i + w^i$, for
$1 \le i \le 4$. Consequently,
$$P_G(r + w, r^2 + w^2, r^3 + w^3, r^4 + w^4) = r^4 + r^3w + 2r^2w^2 + rw^3 + w^4.$$
(This result is (1/8)-th of the complete inventory listed in Table 16.10.)
If we had three colors (red, white, and blue), the replacement for $x_i$ would be $r^i + w^i +
b^i$, where $1 \le i \le 4$.
We generalize these observations in the following theorem.
THEOREM 16.20 Polya’s Method of Enumeration. Let S be a set of configurations that are acted upon by a
permutation group G, where G is a subgroup of $S_n$ and G has cycle index $P_G(x_1, x_2, \ldots, x_n)$.
Then the pattern inventory of nonequivalent m-colorings of S is given by
$$P_G\left(\sum_{i=1}^{m} c_i,\ \sum_{i=1}^{m} c_i^2,\ \ldots,\ \sum_{i=1}^{m} c_i^n\right),$$
where $c_1, c_2, \ldots, c_m$ denote the m colors that are available.
One important point should be reiterated here before applying Theorem 16.20 — namely,
the pattern inventory is another example of a generating function. Having made that point,
we now apply this theorem in the following examples.
EXAMPLE 16.33
A child’s bracelet is formed by placing three beads (red, white, and blue) on a circular
piece of wire. Bracelets are considered equivalent if one can be obtained from the other by
a (planar) rotation. Find the pattern inventory for these bracelets.
Here G is the group of rotations of an equilateral triangle, so G = {(1)(2)(3), (123),
(132)}, where 1, 2, 3 denote the vertices of the triangle. Then $P_G(x_1, x_2, x_3) = (1/3)(x_1^3 + 2x_3)$,
and the pattern inventory is given by
$$(1/3)[(r + w + b)^3 + 2(r^3 + w^3 + b^3)]$$
$$= (1/3)[3r^3 + 3r^2w + 3r^2b + 3rw^2 + 6rwb + 3rb^2 + 3w^3 + 3w^2b + 3wb^2 + 3b^3]$$
$$= r^3 + r^2w + r^2b + rw^2 + 2rwb + rb^2 + w^3 + w^2b + wb^2 + b^3.$$
We interpret this result as follows:
1) For each summand, other than $2rwb$, the coefficient is 1 because there is only one
(distinct) bracelet of that type. That is, there is one bracelet with three red beads (for
$r^3$), one with two red beads and one white bead (for $r^2w$), and so on for the other
seven summands with coefficient 1.
2) The summand $2rwb$ has coefficient 2 because there are two nonequivalent bracelets
with one red, one white, and one blue bead, as shown in Fig. 16.12.
Figure 16.12 (the two nonequivalent bracelets with one red, one white, and one blue bead)
If the bracelets can also be reflected, then G becomes {(1)(2)(3), (123), (132), (1)(23),
(2)(13), (3)(12)}, and the pattern inventory here is the same as the one above, with one
exception. Here we have $rwb$, instead of $2rwb$, because the nonequivalent (for rotations)
patterns in Fig. 16.12 become equivalent when reflections are allowed.
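The bracelet inventory can be confirmed without any algebra by listing the equivalence classes and recording which colors each class uses. In this sketch (illustrative only; bead positions are numbered 0, 1, 2 as an assumption, and `inventory` is a hypothetical helper), each pattern is keyed by its beads in sorted order, so the key `"brw"` stands for the summand $rwb$.

```python
from collections import Counter
from itertools import product


def inventory(group, colors, n):
    """Count equivalence classes of n-bead colorings, keyed by the colors used."""
    seen, tally = set(), Counter()
    for c in product(colors, repeat=n):
        if c in seen:
            continue
        # All bracelets obtainable from c by a motion in the group.
        orbit = {tuple(c[g[k]] for k in range(n)) for g in group}
        seen |= orbit
        tally["".join(sorted(c))] += 1
    return tally


ROTATIONS = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]
REFLECTIONS = [(0, 2, 1), (2, 1, 0), (1, 0, 2)]

rot = inventory(ROTATIONS, "rwb", 3)
full = inventory(ROTATIONS + REFLECTIONS, "rwb", 3)

print(rot["brw"], sum(rot.values()))     # 2 11
print(full["brw"], sum(full.values()))   # 1 10
```

With rotations only there are 11 bracelets, two of which use all three colors; allowing reflections merges those two and leaves 10.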
EXAMPLE 16.34
Consider the 3-colorings of the configurations in Example 16.28. If the three colors are red,
white, and blue, how many nonequivalent configurations have exactly two red vertices?
Given that $P_G(x_1, x_2, x_3, x_4) = (1/8)(x_1^4 + 2x_4 + 3x_2^2 + 2x_2x_1^2)$, the answer is the sum
of the coefficients of $r^2w^2$, $r^2b^2$, and $r^2wb$ in
$$(1/8)[(r + w + b)^4 + 2(r^4 + w^4 + b^4) + 3(r^2 + w^2 + b^2)^2 + 2(r^2 + w^2 + b^2)(r + w + b)^2].$$
In $(r + w + b)^4$, we find the term $6r^2w^2 + 6r^2b^2 + 12r^2wb$. For $3(r^2 + w^2 + b^2)^2$,
we are interested in the term $6r^2w^2 + 6r^2b^2$, whereas $4r^2w^2 + 4r^2b^2 + 4r^2bw$ arises in
$2(r^2 + w^2 + b^2)(r + w + b)^2$.
Then
$$(1/8)[6r^2w^2 + 6r^2b^2 + 12r^2wb + 6r^2w^2 + 6r^2b^2 + 4r^2w^2 + 4r^2b^2 + 4r^2bw] = 2r^2w^2 + 2r^2b^2 + 2r^2bw,$$
the inventory of the six nonequivalent configurations that contain exactly two red vertices.
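The substitution $x_i \to c_1^i + \cdots + c_m^i$ can be carried out with a few lines of polynomial arithmetic. The sketch below is one possible way to do this and is not taken from the text: a polynomial is a `Counter` mapping exponent tuples to coefficients, the cycle index of the square is entered by hand from the formula above, and the names `poly_mul`, `power_sum`, and `pattern_inventory` are illustrative.

```python
from collections import Counter
from fractions import Fraction


def poly_mul(p, q):
    """Multiply two polynomials stored as {exponent tuple: coefficient}."""
    out = Counter()
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            out[tuple(a + b for a, b in zip(e1, e2))] += c1 * c2
    return out


def power_sum(i, m):
    """The polynomial c_1^i + ... + c_m^i in m color variables."""
    return Counter({tuple(i if k == j else 0 for k in range(m)): 1 for j in range(m)})


def pattern_inventory(cycle_index, m):
    """Replace each x_i in the cycle index by the i-th power sum of the colors."""
    total = Counter()
    for lengths, coef in cycle_index.items():
        term = Counter({(0,) * m: 1})
        for i in lengths:
            term = poly_mul(term, power_sum(i, m))
        for e, c in term.items():
            total[e] += coef * c
    return {e: c for e, c in total.items() if c}


# (1/8)(x1^4 + 2 x4 + 3 x2^2 + 2 x2 x1^2), one entry per cycle structure.
SQUARE = {(1, 1, 1, 1): Fraction(1, 8), (4,): Fraction(2, 8),
          (2, 2): Fraction(3, 8), (2, 1, 1): Fraction(2, 8)}

two = pattern_inventory(SQUARE, 2)     # exponents are (red, white)
three = pattern_inventory(SQUARE, 3)   # exponents are (red, white, blue)

print(sorted(two.items(), reverse=True))
print(three[(2, 2, 0)], three[(2, 0, 2)], three[(2, 1, 1)])   # 2 2 2
```

For two colors the result has coefficients 1, 1, 2, 1, 1, as in Section 16.10; for three colors the coefficients of $r^2w^2$, $r^2b^2$, and $r^2wb$ are each 2, so six configurations have exactly two red vertices.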
Our next example deals with the pattern inventory for the 2-colorings of the vertices of
a cube. (The colors are red and white.)
EXAMPLE 16.35
For the cube in Fig. 16.13, we find that its group G of rigid motions consists of the following:
1) The identity transformation with cycle structure $x_1^8$.
2) Rotations through 90°, 180°, and 270° about an axis through the centers of two
opposite faces: From Fig. 16.13(a) we have
90° rotation: (1234)(5678) Cycle structure: $x_4^2$
180° rotation: (13)(24)(57)(68) Cycle structure: $x_2^4$
270° rotation: (1432)(5876) Cycle structure: $x_4^2$
Since there are two other pairs of opposite faces, these nine rotations account for
the term $3x_2^4 + 6x_4^2$ in the cycle index.
3) Rotations through 180° about an axis through the midpoints of two opposite edges:
As in Fig. 16.13(b), we have the permutation (17)(28)(34)(56), whose cycle structure
is given by $x_2^4$. With six pairs of opposite edges, these rotations contribute the term
$6x_2^4$ to the cycle index.
4) Rotations through 120° and 240° about an axis through two diagonally opposite
vertices: From part (c) of the figure we have
120° rotation: (168)(274)(3)(5) Cycle structure: $x_1^2x_3^2$
240° rotation: (186)(247)(3)(5) Cycle structure: $x_1^2x_3^2$
Here there are four such pairs of vertices, and these give rise to the term $8x_1^2x_3^2$ in the
cycle index.
Figure 16.13 (a cube with vertices labeled 1 through 8: part (a) shows the axis for the 90°, 180°, and 270° rotations, part (b) the axis for a 180° rotation, and part (c) the axis for the 120° and 240° rotations)
Therefore, $P_G(x_1, x_2, \ldots, x_8) = (1/24)(x_1^8 + 9x_2^4 + 6x_4^2 + 8x_1^2x_3^2)$, and the pattern
inventory for these configurations is given by the generating function
$$f(r, w) = (1/24)[(r + w)^8 + 9(r^2 + w^2)^4 + 6(r^4 + w^4)^2 + 8(r + w)^2(r^3 + w^3)^2]$$
$$= r^8 + r^7w + 3r^6w^2 + 3r^5w^3 + 7r^4w^4 + 3r^3w^5 + 3r^2w^6 + rw^7 + w^8.$$
Replacing r and w by 1, we find 23 nonequivalent configurations here.
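The cube's inventory can be cross-checked by generating the whole group from two of the rotations listed above and then sorting the 256 colorings into classes. This is a sketch under assumptions: it takes the 90° face rotation (1234)(5678) and the 120° vertex rotation (168)(274)(3)(5) as generators, closes them under composition, and relies on the resulting set being the full group of 24 rotations. The helper names are illustrative.

```python
from itertools import product


def from_cycles(cycle_list, n):
    """Tuple p with p[x] the image of x; index 0 is unused and maps to itself."""
    perm = list(range(n + 1))
    for cyc in cycle_list:
        for i, x in enumerate(cyc):
            perm[x] = cyc[(i + 1) % len(cyc)]
    return tuple(perm)


def generate(generators):
    """Close a list of permutations under composition, starting from the identity."""
    identity = tuple(range(len(generators[0])))
    group, pending = {identity}, [identity]
    while pending:
        g = pending.pop()
        for h in generators:
            gh = tuple(h[g[x]] for x in range(len(g)))   # apply g, then h
            if gh not in group:
                group.add(gh)
                pending.append(gh)
    return group


def classes_by_red(group, n):
    """Entry k is the number of equivalence classes whose colorings have k red vertices."""
    seen, counts = set(), [0] * (n + 1)
    for c in product("rw", repeat=n):
        if c in seen:
            continue
        seen |= {tuple(c[g[x] - 1] for x in range(1, n + 1)) for g in group}
        counts[c.count("r")] += 1
    return counts


face = from_cycles([(1, 2, 3, 4), (5, 6, 7, 8)], 8)
vertex = from_cycles([(1, 6, 8), (2, 7, 4)], 8)
G = generate([face, vertex])

print(len(G))                 # 24
print(classes_by_red(G, 8))   # [1, 1, 3, 3, 7, 3, 3, 1, 1]
```

The list of class counts reproduces the coefficients of $f(r, w)$, and its entries sum to 23.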
Since Polya’s Method of Enumeration was first developed in order to count isomers of
organic compounds, we close this section with an application that deals with a certain class
792 Chapter 16 Groups, Coding Theory, and Polya’s Method of Enumeration
of organic compounds. This is based on an example by C. L. Liu. (See pp. 152-154 of
reference [17].)
Here we are concerned with organic molecules of the form shown in Fig. 16.14, where
EXAMPLE 16.36
C is a carbon atom and X denotes any of the following components: Br (bromine), H
(hydrogen), CH3 (methyl), or C2Hs (ethyl). For example, if each X is replaced by H, the
compound CH, (methane) results. Figure 16.14 should not be allowed to mislead us. The
structure of these organic compounds is three-dimensional. Consequently, we turn to the
regular tetrahedron in order to model this structure. We would place the carbon atom at the
center of the tetrahedron and then place our selections for X at vertices 1, 2, 3, and 4 as
shown in Fig. 16.15.
| 4 2 4 2
x—— c ——x 3 3 OX
| <a! 180°
120°, 240°
x (a) ‘ (b)b
Figure 16.14 Figure 16.15
The group G acting on these configurations is given as follows:
1) The identity transformation (1)(2)(3)(4) with cycle structure $x_1^4$.
2) Rotations through 120° or 240° about an axis through a vertex and the center of the opposite face: As Fig. 16.15(a) shows, we have
120° rotation: (1)(243) with cycle structure $x_1x_3$
240° rotation: (1)(234) with cycle structure $x_1x_3$
By symmetry there are three other pairs of vertices and opposite faces, so these rigid motions account for the term $8x_1x_3$ in $P_G(x_1, x_2, x_3, x_4)$.
3) Rotations of 180° about an axis through the midpoints of two opposite edges: The case shown in part (b) of the figure is given by the permutation (14)(23) whose cycle structure is $x_2^2$. With three pairs of opposite edges, we get the term $3x_2^2$ in $P_G(x_1, x_2, x_3, x_4)$.
Hence $P_G(x_1, x_2, x_3, x_4) = (1/12)[x_1^4 + 8x_1x_3 + 3x_2^2]$ and $P_G(4, 4, 4, 4) = (1/12)[4^4 + 8(4^2) + 3(4^2)] = 36$, so there are 36 distinct organic compounds that can be formed in this way.
Last, if we wish to know how many of these compounds have exactly two bromine atoms, we let w, x, y, and z represent the “colors” Br, H, CH3, and C2H5, respectively, and find the sum of the coefficients of $w^2x^2$, $w^2y^2$, $w^2z^2$, $w^2xy$, $w^2xz$, and $w^2yz$ in the pattern inventory
$$(1/12)[(w + x + y + z)^4 + 8(w + x + y + z)(w^3 + x^3 + y^3 + z^3) + 3(w^2 + x^2 + y^2 + z^2)^2].$$
For $(w + x + y + z)^4$ the relevant terms are $6w^2x^2 + 6w^2y^2 + 6w^2z^2 + 12w^2xy + 12w^2xz + 12w^2yz$. The middle summand of the pattern inventory does not give rise to any of the desired configurations, whereas in $3(w^2 + x^2 + y^2 + z^2)^2$ we find $6w^2x^2 + 6w^2y^2 + 6w^2z^2$.
Consequently that part of the pattern inventory for the compounds containing exactly two bromine atoms is
$$(1/12)[12w^2x^2 + 12w^2y^2 + 12w^2z^2 + 12w^2xy + 12w^2xz + 12w^2yz]$$
and there are six such organic compounds.
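Both counts in this example can be confirmed by brute force. The sketch below is our own illustration, not the method of the text: it uses the fact that the twelve permutations listed for G (the identity, the eight 3-cycles, and the three products of two disjoint 2-cycles) are exactly the even permutations of the four vertices, and it represents each orbit of colorings by its smallest image under G. The names is_even, ROTATIONS, and canonical are hypothetical.

```python
from itertools import permutations, product

def is_even(p):
    """True when the permutation p (a tuple on 0..3) has an even number of inversions."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return inversions % 2 == 0

# The 12 rotations of the tetrahedron, acting on vertices 0..3 (vertices 1..4 in the text).
ROTATIONS = [p for p in permutations(range(4)) if is_even(p)]

def canonical(coloring):
    """Smallest image of a coloring under all rotations; equal values mean the same orbit."""
    return min(tuple(coloring[g[i]] for i in range(4)) for g in ROTATIONS)

COLORS = ("Br", "H", "CH3", "C2H5")
orbits = {canonical(c) for c in product(COLORS, repeat=4)}
two_br = {o for o in orbits if o.count("Br") == 2}
print(len(ROTATIONS), len(orbits), len(two_br))
```

The three printed values are the order of G, the number of distinct compounds, and the number of compounds with exactly two bromine atoms; they should agree with 12, 36, and 6 from the example.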
EXERCISES 16.12
1. a) Find the pattern inventory for the 2-colorings of the edges of a square that is free to move in (i) two dimensions; (ii) three dimensions. (Let the colors be red and white.)
b) Answer part (a) for 3-colorings, where the colors are red, white, and blue.
2. If a regular pentagon is free to move in space and we can color its vertices with red, white, and blue paint, how many nonequivalent configurations have exactly three red vertices? How many have two red, one white, and two blue vertices?
3. Suppose that in Example 16.35 we 2-color the faces of the cube, which is free to move in space.
a) How many distinct 2-colorings are there for this situation?
b) If the available colors are red and white, determine the pattern inventory.
c) How many nonequivalent colorings have three red and three white faces?
4. For the organic compounds in Example 16.36, how many have at least one bromine atom? How many have exactly three hydrogen atoms?
5. Find the pattern inventories for the 2-colorings of the vertices in the configurations in Fig. 16.11, when they are free to move in space. (Let the colors be green and gold.)
6. a) In how many ways can the seven (identical) horses on a carousel be painted with black, brown, and white paint in such a way that there are three black, two brown, and two white horses?
b) In how many ways would there be equal numbers of black and brown horses?
c) Give a combinatorial argument to verify that for all $n \in \mathbf{Z}^+$, $n^7 + 6n$ is divisible by 7.
7. a) In how many ways can we paint the eight squares of a 2 × 4 chessboard, using the colors red and white? (The back of the chessboard is black cardboard.)
b) Find the pattern inventory for the colorings in part (a).
c) How many of the colorings in part (a) have four red and four white squares? How many have six red and two white squares?
8. a) In how many ways can we 2-color the eight regions of the pinwheel shown in Fig. 16.16, using the colors black and gold, if the back of each region remains grey?
b) Answer part (a) for the possible 3-colorings, using black, gold, and blue paints to color the regions.
c) For the colorings in part (b), how many have four black, two gold, and two blue regions?
Figure 16.16 [pinwheel with eight regions; figure not reproduced]
9. Let $m, n \in \mathbf{Z}^+$ with $n \ge 3$. How many distinct summands appear in the pattern inventory for the m-colorings of the vertices of a regular polygon of n sides?
16.13
Summary and Historical Review
Although the notion of a group of transformations evolved gradually in the study of ge-
ometry, the major thrust in the development of the group concept came from the study of
polynomial equations.
Methods for solving quadratic equations were known to the ancient Greeks. Then in
the sixteenth century, advances were made toward solving cubic and quartic polynomial
equations where the coefficients were rational numbers. Continuing with polynomials of
fifth and higher degree, both Leonhard Euler (1707-1783) and Joseph-Louis Lagrange
(1736-1813) attempted to solve the general quintic. Lagrange realized there had to be
a connection between the degree n of a polynomial equation and the permutation group $S_n$. However, it was Niels Henrik Abel (1802-1829) who finally proved that it was not possible to find a formula for solving the general quintic using only addition, subtraction, multiplication, division, and root extraction. During this same period, the existence of a necessary and sufficient condition for when a polynomial of degree $n \ge 5$ with rational coefficients can be solved by radicals was investigated and solved by the illustrious French mathematician Evariste Galois (1811-1832). Since the work of Galois utilizes the structures
of both groups and fields, we shall say more about him in the summary of Chapter 17.
[Photograph: Niels Henrik Abel (1802-1829)]
Examining pages 278-280 of J. Stillwell [28], one finds that the group concept, and
in fact the actual word “group,” first appears in Galois’ work Mémoire sur les conditions
de résolubilité des équations par radicaux, published in 1831. Associativity, the group
identity, and inverses were consequences of Galois’ assumptions, for he only dealt with
a group of permutations of a finite set and his definition of a group required only the
closure property. It was Arthur Cayley (1821-1895) (in 1854, in his paper On the Theory of Groups, as Depending on the Symbolic Equation $\theta^n = 1$) who first found the need to
state the associative property for group elements. The first actual mention of inverses in the
definition of a group occurs in the 1883 article Gruppentheoretischen Studien II by Walther
Franz Anton von Dyck (1856-1934).
The concept of the coset, which we introduced in Section 16.3, was also developed by
Evariste Galois (in 1832). The actual term was coined (in 1910) by George Abram Miller
(1863-1951).
Following the accomplishments of Galois, group theory affected many areas of mathe-
matics. During the late nineteenth century, for example, the German mathematician Felix
Klein (1849-1929), in what has come to be known as the Erlanger Programm, attempted
to codify all existing geometries according to the group of transformations under which the
properties of the geometry were invariant.
Many other mathematicians, such as Augustin-Louis Cauchy (1789-1857), Arthur Cay-
ley (1821-1895), Ludwig Sylow (1832-1918), Richard Dedekind (1831-1916), and
Leopold Kronecker (1823-1891), contributed to the further development of certain types
of groups. However, it was not until 1900 that lists of defining conditions were given for
the general abstract group.
During the twentieth century a great deal of research took place in the attempt to analyze
the structure of finite groups. For finite abelian groups, it is known that any such group is
isomorphic to a direct product of cyclic groups of prime power order. However, the case
of the finite nonabelian groups has turned out to be considerably more complex. Starting
with the work of Galois, one finds particular attention paid to a special type of subgroup
called a normal subgroup. For any group G, a subgroup H (of G) is called normal if, for all $g \in G$ and all $h \in H$, we have $ghg^{-1} \in H$. In an abelian group every subgroup is
normal, but this is not the case for nonabelian groups. In every group G, both {e} and G are
normal subgroups, but if G has no other normal subgroups it is called simple. During the
past six decades mathematicians have sought and determined all the finite simple groups
and examined their role in the structure of all finite groups. Among the prime movers in the
classification of the finite simple groups are Professors Walter Feit, John Thompson, Daniel
Gorenstein, Michael Aschbacher, and Robert Griess, Jr. For more on the history and impact
of this monumental work we refer the reader to the articles by J. A. Gallian [5], A. Gardiner
[7], M. Gardner [9], R. Silvestri [27], and, especially, the one by D. Gorenstein [13].
There are many texts one can turn to for further study in the theory of groups. At the
introductory level, the texts by J. A. Gallian [6] and V. H. Larney [16] provide further
coverage beyond the introduction given in this chapter. The text by I. N. Herstein [15] is an
excellent source and includes material on Galois theory.
More on the RSA public-key cryptosystem of Section 16.4 can be found in the references
by T. H. Barr [2], P. Garrett [10], and W. Trappe and L. C. Washington [31]. An early
description of the system is given in the article by M. Gardner [8], where a message is
encrypted using, as the modulus n, the product of a 64-digit prime and a 65-digit prime.
The article by G. Taubes [30] relates the effort set forth by Arjen Lenstra, Paul Leyland,
Michael Graff, and Derek Atkins, along with 600 volunteers, in factoring n.
The beginnings of algebraic coding theory can be traced to 1941, when Claude Elwood
Shannon began his investigations of problems in communications. These problems were
prompted by the needs of the war effort. His research resulted in many new ideas and
principles that were later published in 1948 in the journal article [26]. As a result of this
work, Shannon is acknowledged as the founder of information theory. After this publication,
results by M. J. E. Golay [11] and R. W. Hamming [14] soon followed, giving further impetus
to research in this area. The 1478 references listed in the bibliography at the end of Volume
II of the texts by F. J. MacWilliams and N. J. A. Sloane [18] should convey some idea of
the activity in this area between 1950 and 1975.
Our coverage of coding theory followed the development in Chapter 5 of the text
by L. L. Dornhoff and F. E. Hohn [4]. The texts by E. F. Assmus, Jr., and J. D. Key
[1], S. W. Golomb, R. A. Scholtz, and R. E. Peile [12], V. Pless [20], and S. Roman [24]
provide a nice coverage of topics at a fairly intermediate level. More advanced work in
coding can be found in the books by F. J. MacWilliams and N. J. A. Sloane [18], S. Roman
[25], and A. P. Street and W. D. Wallis [29]. An interesting application on the use of the
pigeonhole principle in coding theory is given in Chapter XI of [29].
In Sections 10, 11, and 12 of the chapter, we came upon an enumeration technique whose
development is attributed to the Hungarian mathematician George Polya (1887-1985). His
article [21] provided the fundamental techniques for counting equivalence classes of chem-
ical isomers, graphs, and trees. (To some extent, the ideas in this work were anticipated by
J. H. Redfield [23].) Since then these techniques have been found invaluable for counting
problems in such areas as the electronic realizations of Boolean functions. Polya’s fun-
damental theorem was first generalized in the article by N. G. DeBruijn [3], and other
extensions of these ideas can be found in the literature. The article by R. C. Read [22]
relates the profound influence that Polya’s Theorem has had on developments in combina-
torial analysis. (The issue of the journal that contains this article also includes several other
articles dealing with the life and work of George Polya.)
Our coverage of this topic follows the presentation given in the article by A. Tucker
[32]. A more rigorous presentation of this method can be found in Chapter 5 of the text by
C. L. Liu [17].
In dealing with Burnside’s Theorem we have another instance of an inaccurate attribution.
As we learn in the article by P. M. Neumann [19], the result appears in a paper by Georg
Frobenius (1848-1917) that was published in 1887, as well as in some of Cauchy’s work
from 1845.
REFERENCES
1. Assmus, E. F., Jr., and Key, J. D. Designs and Their Codes. New York: Cambridge University
Press, 1992.
2. Barr, Thomas H. Invitation to Cryptology. Upper Saddle River, N. J.: Prentice-Hall, 2002.
3. DeBruijn, Nicolaas Govert. “Polya’s Theory of Counting.” Chapter 5 in Applied Combinatorial
Mathematics, ed. by Edwin F. Beckenbach. New York: Wiley, 1964.
4. Dornhoff, Larry L., and Hohn, Franz E. Applied Modern Algebra. New York: Macmillan, 1978.
5. Gallian, Joseph A. “The Search for Finite Simple Groups.” Mathematics Magazine 49, 1976,
pp. 163-179.
6. Gallian, Joseph A. Contemporary Abstract Algebra, 5th ed. Boston, Mass.: Houghton Mifflin,
2002.
7. Gardiner, Anthony. “Groups of Monsters.” New Scientist, April 5, 1979, p. 34.
8. Gardner, Martin. “A New Kind of Cipher That Would Take Millions of Years to Break.”
Scientific American (August 1977): pp. 120-124.
9. Gardner, Martin. “The Capture of the Monster: A Mathematical Group with a Ridiculous
Number of Elements.” Scientific American 242 (6), 1980, pp. 20-32.
10. Garrett, Paul. Making, Breaking Codes: An Introduction to Cryptology. Upper Saddle River,
N. J.: Prentice-Hall, 2001.
11. Golay, Marcel J. E. “Notes on Digital Coding.” Proceedings of the IRE 37, 1949, p. 657.
12. Golomb, Solomon W., Scholtz, Robert A., and Peile, Robert E. Basic Concepts in Information
Theory and Coding. New York: Plenum, 1994.
13. Gorenstein, Daniel. “The Enormous Theorem.” Scientific American 253 (6), 1985, pp. 104-115.
14. Hamming, Richard Wesley. “Error Detecting and Error Correcting Codes.” Bell System
Technical Journal 29, 1950, pp. 147-160.
15. Herstein, Israel Nathan. Topics in Algebra, 2nd ed. Lexington, Mass.: Xerox College Publishing, 1975.
16. Larney, Violet H. Abstract Algebra: A First Course. Boston: Prindle, Weber & Schmidt, 1975.
17. Liu, C. L. Introduction to Combinatorial Mathematics. New York: McGraw-Hill, 1968.
18. MacWilliams, F. Jessie, and Sloane, Neil J. A. The Theory of Error-Correcting Codes, Volumes I and II. Amsterdam: North-Holland, 1977.
19. Neumann, Peter M. “A Lemma That Is Not Burnside’s.” The Mathematical Scientist, Vol. 4,
1979, pp. 133-141.
20. Pless, Vera. Introduction to the Theory of Error-Correcting Codes, 2nd ed. New York: Wiley,
1989.
21. Polya, George. “Kombinatorische Anzahlbestimmungen für Gruppen, Graphen und Chemische Verbindungen.” Acta Mathematica 68, 1937, pp. 145-254.
22. Read, R. C. “Polya’s Theorem and Its Progeny.” Mathematics Magazine 60, 1987, pp. 275-282.
23. Redfield, J. Howard. “The Theory of Group Reduced Distributions.” American Journal of Mathematics 49, 1927, pp. 433-455.
24. Roman, Steven. Introduction to Coding and Information Theory. New York: Springer-Verlag,
1997.
25. Roman, Steven. Coding and Information Theory. New York: Springer-Verlag, 1992.
26. Shannon, Claude E. “The Mathematical Theory of Communication.” Bell System Technical
Journal 27, 1948, pp. 379-423, 623-656. Reprinted in C. E. Shannon and W. Weaver, The
Mathematical Theory of Communication (Urbana: University of Illinois Press, 1949).
27. Silvestri, Richard. “Simple Groups of Finite Order.” Archive for the History of Exact Sciences
20, 1979, pp. 313-356.
28. Stillwell, John. Mathematics and Its History. New York: Springer-Verlag, 1989.
29. Street, Anne Penfold, and Wallis, W. D. Combinatorial Theory: An Introduction. Winnipeg,
Canada: The Charles Babbage Research Center, 1977.
30. Taubes, G. “Small Army of Code-breakers Conquers a 129-digit Giant.” Science 264, 1994,
pp. 776-777.
31. Trappe, Wade, and Washington, Lawrence C. Introduction to Cryptography with Coding
Theory. Upper Saddle River, N. J.: Prentice-Hall, 2002.
32. Tucker, Alan. “Polya’s Enumeration Formula by Example.” Mathematics Magazine 47, 1974,
pp. 248-256.
SUPPLEMENTARY EXERCISES
1. Let $f: G \to H$ be a group homomorphism with $e_H$ the identity in H. Prove that
a) $K = \{x \in G \mid f(x) = e_H\}$ is a subgroup of G. (K is called the kernel of the homomorphism.)
b) if $g \in G$ and $x \in K$, then $gxg^{-1} \in K$.
2. If G, H, and K are groups and $G = H \times K$, prove that G contains subgroups that are isomorphic to H and K.
3. Let G be a group where $a^2 = e$ for all $a \in G$. Prove that G is abelian.
4. If G is a group of even order, prove that there is an element $a \in G$ with $a \ne e$ and $a = a^{-1}$.
5. Let $f: G \to H$ be a group homomorphism onto H. If G is a cyclic group, prove that H is also cyclic.
6. a) Consider the group $(\mathbf{Z}_2 \times \mathbf{Z}_2, \oplus)$ where, for $a, b, c, d \in \mathbf{Z}_2$, $(a, b) \oplus (c, d) = (a + c, b + d)$; the sums a + c and b + d are computed using addition modulo 2. What is the value of $(1, 0) \oplus (0, 1) \oplus (1, 1)$ in this group?
b) Now consider the group $(\mathbf{Z}_2 \times \mathbf{Z}_2 \times \mathbf{Z}_2, \oplus)$ where $(a, b, c) \oplus (d, e, f) = (a + d, b + e, c + f)$. (Here the sums a + d, b + e, c + f are computed using addition modulo 2.) What do we obtain when we add the seven nonzero (or nonidentity) elements of this group?
c) State and prove a generalization that includes the results in parts (a) and (b).
7. Let $(G, \circ)$ be a group where
$$x \circ a \circ y = b \circ a \circ c \Rightarrow x \circ y = b \circ c,$$
for all $a, b, c, x, y \in G$. Prove that $(G, \circ)$ is an abelian group.
8. For $k, n \in \mathbf{Z}^+$ with $n \ge k \ge 1$, let Q(n, k) count the number of permutations $\pi \in S_n$ where any representation of $\pi$, as a product of disjoint cycles, contains no cycle of length greater than k. Verify that
$$Q(n + 1, k) = \sum_{i=0}^{k-1} \binom{n}{i}(i!)\,Q(n - i, k).$$
9. For $k, n \in \mathbf{Z}^+$ where $n \ge 2$ and $1 \le k \le n$, let P(n, k) denote the number of permutations $\pi \in S_n$ that have k cycles. [For example, (1)(23) is counted in P(3, 2), (12)(34) is counted in P(4, 2), and (1)(23)(4) is counted in P(4, 3).]
a) Verify that P(n + 1, k) = P(n, k − 1) + nP(n, k).
b) Determine $\sum_{k=1}^{n} P(n, k)$.
10. For $n \ge 1$, if $\sigma, \tau \in S_n$, define the distance $d(\sigma, \tau)$ between $\sigma$ and $\tau$ by
$$d(\sigma, \tau) = \max\{|\sigma(i) - \tau(i)| : 1 \le i \le n\}.$$
a) Prove that the following properties hold for d.
i) $d(\sigma, \tau) \ge 0$ for all $\sigma, \tau \in S_n$
ii) $d(\sigma, \tau) = 0$ if and only if $\sigma = \tau$
iii) $d(\sigma, \tau) = d(\tau, \sigma)$ for all $\sigma, \tau \in S_n$
iv) $d(\rho, \tau) \le d(\rho, \sigma) + d(\sigma, \tau)$, for all $\rho, \sigma, \tau \in S_n$
b) Let $\epsilon$ denote the identity element of $S_n$ (that is, $\epsilon(i) = i$ for all $1 \le i \le n$). If $\pi \in S_n$ and $d(\pi, \epsilon) \le 1$, what can we say about $\pi(n)$?
c) For $n \ge 1$ let $a_n$ count the number of permutations $\pi$ in $S_n$ where $d(\pi, \epsilon) \le 1$. Find and solve a recurrence relation for $a_n$.
11. Wilson’s Theorem [in part (d) of Exercise 19 of Section 16.1] tells us that $(p - 1)! \equiv -1 \pmod{p}$, for p a prime.
a) Is the converse of this theorem true or false, that is, if $n \in \mathbf{Z}^+$ and $n \ge 2$, does $(n - 1)! \equiv -1 \pmod{n} \Rightarrow n$ is prime?
b) For p an odd prime, prove that $2(p - 3)! \equiv -1 \pmod{p}$.
12. In how many ways can Nicole paint the eight regions of the square shown in Fig. 16.17 if
a) five colors are available?
b) she actually uses exactly four of the five available colors?
Figure 16.17 [square divided into eight regions; figure not reproduced]
17
Finite Fields and Combinatorial Designs
It is time now to recall the ring structure of Chapter 14 as we examine rings of polynomials and their role in the construction of finite fields. We know that for every prime p, $(\mathbf{Z}_p, +, \cdot)$ is a finite field, but here we shall find other finite fields. Just as the order of a finite Boolean algebra is restricted to powers of 2, for finite fields the possible orders are $p^n$, where p is a prime and $n \in \mathbf{Z}^+$. Applications of these finite fields will include a discussion of such combinatorial designs as Latin squares. Finally, we shall investigate the structure of a finite geometry and discover how these geometries and combinatorial designs are interrelated.
17.1
Polynomial Rings
We recall that a ring $(R, +, \cdot)$ consists of a nonempty set R, where (R, +) is an abelian group, $(R, \cdot)$ is closed under the associative operation $\cdot$, and the two operations are related by the distributive laws: a(b + c) = ab + ac and (b + c)a = ba + ca, for all $a, b, c \in R$. (We write ab for $a \cdot b$.)
In order to introduce the formal concept of a polynomial with coefficients in R we let x
denote an indeterminate — that is, a formal symbol that is not an element of the ring R. We
then use this symbol x to define the following.
Definition 17.1 Given a ring $(R, +, \cdot)$, an expression of the form $f(x) = a_nx^n + a_{n-1}x^{n-1} + \cdots + a_1x^1 + a_0x^0$, where $a_i \in R$ for all $0 \le i \le n$, is called a polynomial in the indeterminate x with coefficients from R.
If $a_n$ is not the zero element of R, then $a_n$ is called the leading coefficient of f(x) and we say that f(x) has degree n. Hence the degree of a polynomial is the highest power of x that occurs in a summand of the polynomial. The term $a_0x^0$ is called the constant, or constant term, of f(x).
If $g(x) = b_mx^m + b_{m-1}x^{m-1} + \cdots + b_1x^1 + b_0x^0$ is also a polynomial in x over R, then f(x) = g(x) if m = n and $a_i = b_i$ for all $0 \le i \le n$.
Finally, we use the notation R[x] to represent the set of all polynomials in the indeterminate x with coefficients from R.
EXAMPLE 17.1
a) Over the ring $R = (\mathbf{Z}_6, +, \cdot)$, the expression $5x^2 + 3x^1 - 2x^0$ is a polynomial of degree 2, with leading coefficient 5 and constant term $-2x^0$. As before, here we are using a to denote [a] in $\mathbf{Z}_6$. This polynomial may also be written as $5x^2 + 3x^1 + 4x^0$ since [4] = [−2] in $\mathbf{Z}_6$.
b) If z is the zero element of ring R, then the zero polynomial $zx^0 = z$ is also the zero element of R[x] and is said to have no degree and no leading coefficient. A polynomial over R that is the zero element or is of degree 0 is called a constant polynomial. For example, the polynomial $5x^0$ over $\mathbf{Z}_7$ has degree 0 and leading coefficient 5 and is a constant polynomial.
For a ring of coefficients $(R, +, \cdot)$, let
$$f(x) = a_nx^n + a_{n-1}x^{n-1} + \cdots + a_1x^1 + a_0x^0$$
$$g(x) = b_mx^m + b_{m-1}x^{m-1} + \cdots + b_1x^1 + b_0x^0,$$
where $a_i, b_j \in R$ for all $0 \le i \le n$, $0 \le j \le m$. We introduce (closed binary) operations of addition and multiplication for these polynomials in order to obtain a new ring.
Assume that $n \ge m$. We define
$$f(x) + g(x) = \sum_{i=0}^{n} (a_i + b_i)x^i, \qquad (1)$$
where $b_i = z$ for i > m, and
$$f(x)g(x) = (a_nb_m)x^{n+m} + (a_nb_{m-1} + a_{n-1}b_m)x^{n+m-1} + \cdots + (a_1b_0 + a_0b_1)x^1 + (a_0b_0)x^0. \qquad (2)$$
In the definition of f(x) + g(x), the coefficient $(a_i + b_i)$, for each $0 \le i \le n$, is obtained from the addition of elements in R. For f(x)g(x), the coefficient of $x^t$ is $\sum_{k=0}^{t} a_{t-k}b_k$, where all additions and multiplications occur within R, and $0 \le t \le n + m$. Here is one such example to demonstrate the types of calculations that are involved.
Let $f(x) = 4x^3 + 2x^2 + 3x^1 + 1x^0$ and $g(x) = 3x^2 + x^1 + 2x^0$ be polynomials from $\mathbf{Z}_5[x]$. Here
$$a_3 = 4, \quad a_2 = 2, \quad a_1 = 3, \quad a_0 = 1,$$
and
$$b_2 = 3, \quad b_1 = 1, \quad b_0 = 2.$$
For all $n \ge 4$ we find that $a_n = 0$. When $m \ge 3$ we have $b_m = 0$. Using the definitions in Eqs. (1) and (2), where the addition and multiplication of the coefficients are now performed modulo 5, we obtain
$$f(x) + g(x) = (4 + 0)x^3 + (2 + 3)x^2 + (3 + 1)x^1 + (1 + 2)x^0$$
$$= 4x^3 + 0x^2 + 4x^1 + 3x^0 = 4x^3 + 4x^1 + 3x^0$$
and
$$f(x)g(x) = \Big(\sum_{k=0}^{5} a_{5-k}b_k\Big)x^5 + \Big(\sum_{k=0}^{4} a_{4-k}b_k\Big)x^4 + \cdots + \Big(\sum_{k=0}^{0} a_{0-k}b_k\Big)x^0$$
$$= (0 \cdot 2 + 0 \cdot 1 + 4 \cdot 3 + 2 \cdot 0 + 3 \cdot 0 + 1 \cdot 0)x^5$$
$$+ (0 \cdot 2 + 4 \cdot 1 + 2 \cdot 3 + 3 \cdot 0 + 1 \cdot 0)x^4$$
$$+ (4 \cdot 2 + 2 \cdot 1 + 3 \cdot 3 + 1 \cdot 0)x^3$$
$$+ (2 \cdot 2 + 3 \cdot 1 + 1 \cdot 3)x^2 + (3 \cdot 2 + 1 \cdot 1)x^1 + (1 \cdot 2)x^0$$
$$= 2x^5 + 0x^4 + 4x^3 + 0x^2 + 2x^1 + 2x^0 = 2x^5 + 4x^3 + 2x^1 + 2x^0.$$
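The calculations in Eqs. (1) and (2) translate directly into code. The following Python sketch is our own illustration, restricted to coefficient rings of the form $\mathbf{Z}_n$; it stores a polynomial as a list whose entry k is the coefficient of $x^k$ (lowest power first, an assumption of this sketch), and the names trim, poly_add, and poly_mul are hypothetical.

```python
def trim(p):
    """Drop zero coefficients of the highest powers; the zero polynomial becomes []."""
    while p and p[-1] == 0:
        p = p[:-1]
    return p

def poly_add(f, g, n):
    """Eq. (1): add coefficientwise in Z_n, padding the shorter list with zeros."""
    size = max(len(f), len(g))
    f = f + [0] * (size - len(f))
    g = g + [0] * (size - len(g))
    return trim([(a + b) % n for a, b in zip(f, g)])

def poly_mul(f, g, n):
    """Eq. (2): the coefficient of x^t is the sum of all a_i * b_j with i + j = t, in Z_n."""
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % n
    return trim(out)

f = [1, 3, 2, 4]   # 4x^3 + 2x^2 + 3x + 1
g = [2, 1, 3]      # 3x^2 + x + 2
print(poly_add(f, g, 5))   # coefficients of the sum, lowest power first
print(poly_mul(f, g, 5))   # coefficients of the product, lowest power first
```

Read from the lowest power up, the two printed lists should reproduce the sum $4x^3 + 4x + 3$ and the product $2x^5 + 4x^3 + 2x + 2$ obtained above.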
The closed binary operations defined in Eqs. (1) and (2) were designed to give us the
following result.
THEOREM 17.1 If R is a ring, then under the operations of addition and multiplication given in Eqs. (1) and (2), $(R[x], +, \cdot)$ is a ring, called the polynomial ring, or ring of polynomials, over R.
Proof: The ring properties for R[x] hinge upon those of R. Consequently, we shall prove the associative law of multiplication here, as an example, and shall then leave the proofs of the other properties to the reader. Let $h(x) = \sum_{k=0}^{p} c_kx^k$, with f(x), g(x) as defined earlier. A typical summand in (f(x)g(x))h(x) has the form $Ax^t$, where $0 \le t \le (m + n) + p$ and A is the sum of all products of the form $(a_ib_j)c_k$, with $0 \le i \le n$, $0 \le j \le m$, $0 \le k \le p$, and i + j + k = t. In f(x)(g(x)h(x)) the coefficient of $x^t$ is the sum of all products of the form $a_i(b_jc_k)$, again with $0 \le i \le n$, $0 \le j \le m$, $0 \le k \le p$, and i + j + k = t. Since R is associative under multiplication, $(a_ib_j)c_k = a_i(b_jc_k)$ for each of these terms, and so the coefficient of $x^t$ in (f(x)g(x))h(x) is the same as it is in f(x)(g(x)h(x)). Hence (f(x)g(x))h(x) = f(x)(g(x)h(x)).
COROLLARY 17.1 Let R[x] be a polynomial ring.
a) If R is commutative, then R[x] is commutative.
b) If R is a ring with unity, then R[x] is a ring with unity.
c) R[x] is an integral domain if and only if R is an integral domain.
Proof: The proof of this corollary is left for the reader.
From this point on, we shall write x instead of $x^1$. If R has unity u, we define $x^0 = u$, and for all $r \in R$ we write $rx^0$ as r.
EXAMPLE 17.2
Let $f(x), g(x) \in \mathbf{Z}_8[x]$ with $f(x) = 4x^2 + 1$ and g(x) = 2x + 3. Then f(x) has degree 2 and g(x) has degree 1. From our past experiences with polynomials, we expect the degree of f(x)g(x) to be 3, the sum of the degrees of f(x) and g(x). Here, however, $f(x)g(x) = (4x^2 + 1)(2x + 3) = 8x^3 + 12x^2 + 2x + 3 = 4x^2 + 2x + 3$ because [8] = [0] and [12] = [4] in $\mathbf{Z}_8$. So degree f(x)g(x) = 2 < 3 = degree f(x) + degree g(x).
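The drop in degree can be observed numerically. In the sketch below, which is our own illustration, the same two coefficient lists are multiplied first with arithmetic modulo 8 and then, for contrast, modulo 7; the comparison with $\mathbf{Z}_7$ is an addition of ours and does not appear in the example. Coefficients are stored lowest power first, and the helper names are hypothetical.

```python
def poly_mul_mod(f, g, n):
    """Product of coefficient lists f and g (index = power of x), reduced modulo n."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % n
    return out

def degree(p):
    """Highest power with a nonzero coefficient, or None for the zero polynomial."""
    nonzero = [k for k, c in enumerate(p) if c != 0]
    return max(nonzero) if nonzero else None

f = [1, 0, 4]   # 4x^2 + 1
g = [3, 2]      # 2x + 3
print(degree(poly_mul_mod(f, g, 8)))   # degree of the product in Z_8[x]
print(degree(poly_mul_mod(f, g, 7)))   # degree of the product in Z_7[x]
```

Over $\mathbf{Z}_8$ the leading coefficient 4 · 2 vanishes and the degree is 2, whereas over the field $\mathbf{Z}_7$ the degrees add to give 3.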
The cause of the phenomenon in Example 17.2 is the existence of proper divisors of zero in the ring $\mathbf{Z}_8$. This observation leads us to the following theorem.
THEOREM 17.2 Let $(R, +, \cdot)$ be a commutative ring with unity u. Then R is an integral domain if and only if for all $f(x), g(x) \in R[x]$, if neither f(x) nor g(x) is the zero polynomial, then
degree f(x)g(x) = degree f(x) + degree g(x).
Proof: Let $f(x) = \sum_{i=0}^{n} a_ix^i$, $g(x) = \sum_{j=0}^{m} b_jx^j$, with $a_n \ne z$, $b_m \ne z$. If R is an integral domain, then $a_nb_m \ne z$, so degree f(x)g(x) = n + m = degree f(x) + degree g(x). Conversely, if R is not an integral domain, let $a, b \in R$ with $a \ne z$, $b \ne z$, but ab = z. The polynomials f(x) = ax + u, g(x) = bx + u each have degree 1, but f(x)g(x) = (a + b)x + u and degree $f(x)g(x) \le 1 < 2$ = degree f(x) + degree g(x).
Before we can proceed we need to recall an idea that was introduced in Section 14.2, in Exercise 21. If R is a ring with unity u and $r \in R$, we define $r^0 = u$, $r^1 = r$, and $r^{n+1} = r^nr$ for all $n \in \mathbf{Z}^+$. [From these definitions one can show, for example, that for all $m, n \in \mathbf{Z}^+$, $(r^m)(r^n) = r^{m+n}$ and $(r^m)^n = r^{mn}$.] So now we continue as follows.
Let R be a ring with unity u and let $f(x) = a_nx^n + \cdots + a_1x + a_0 \in R[x]$. If $r \in R$, then $f(r) = a_nr^n + \cdots + a_1r + a_0 \in R$. We are especially interested in those values of r for which f(r) = z, and this interest leads us to the following concept.
Definition 17.2 Let R be a ring with unity u and let $f(x) \in R[x]$, with degree $f(x) \ge 1$. If $r \in R$ and f(r) = z, then r is called a root of the polynomial f(x).
EXAMPLE 17.3
a) If $f(x) = x^2 - 2 \in \mathbf{R}[x]$, then f(x) has $\sqrt{2}$ and $-\sqrt{2}$ as roots because $(\sqrt{2})^2 - 2 = 0 = (-\sqrt{2})^2 - 2$. In addition, we can write $f(x) = (x - \sqrt{2})(x + \sqrt{2})$, with $x - \sqrt{2}, x + \sqrt{2} \in \mathbf{R}[x]$. However, if we regard f(x) as an element of $\mathbf{Q}[x]$, then f(x) has no roots because $\sqrt{2}$ and $-\sqrt{2}$ are irrational numbers. Consequently, the existence of roots for a polynomial is dependent on the underlying ring of coefficients.
b) For $f(x) = x^2 + 3x + 2 \in \mathbf{Z}_6[x]$, we find that
$f(0) = (0)^2 + 3(0) + 2 = 2$
$f(1) = (1)^2 + 3(1) + 2 = 6 = 0$
$f(2) = (2)^2 + 3(2) + 2 = 12 = 0$
$f(3) = (3)^2 + 3(3) + 2 = 20 = 2$
$f(4) = (4)^2 + 3(4) + 2 = 30 = 0$
$f(5) = (5)^2 + 3(5) + 2 = 42 = 0$
Consequently, f(x) has four roots: 1, 2, 4, and 5. This is more than we expected. In our prior experiences, a polynomial of degree 2 had at most two roots.
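The root search in part (b) is a finite check, so it can be carried out exhaustively. The following Python sketch is our own illustration; roots_mod is a hypothetical helper, coefficients are listed lowest power first, and the comparison with $\mathbf{Z}_7$ is added by us for contrast.

```python
def roots_mod(coeffs, n):
    """All r in Z_n with f(r) = 0, where coeffs[k] is the coefficient of x^k."""
    return [r for r in range(n)
            if sum(c * pow(r, k, n) for k, c in enumerate(coeffs)) % n == 0]

f = [2, 3, 1]            # x^2 + 3x + 2
print(roots_mod(f, 6))   # roots in Z_6, which has proper divisors of zero
print(roots_mod(f, 7))   # roots in Z_7, which is a field
```

Over $\mathbf{Z}_6$ the search returns the four roots 1, 2, 4, 5 found above, while over $\mathbf{Z}_7$ the same polynomial has only the two roots 5 and 6.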
In this chapter we shall be primarily concerned with polynomial rings F [x], where F
is a field (and F[x] is an integral domain). Consequently, we shall not dwell any further
on situations where degree f(x)g(x) < degree f(x) + degree g(x). In addition, unless it
is stated otherwise, we shall denote the zero element of a field by 0 and use 1 to denote its
unity.
As a result of Example 17.3(b), we shall now develop the concepts needed to find out
when a polynomial of degree n has at most n roots.
Definition 17.3 Let F be a field. For $f(x), g(x) \in F[x]$, where f(x) is not the zero polynomial, we call f(x) a divisor (or factor) of g(x) if there exists $h(x) \in F[x]$ with f(x)h(x) = g(x). In this situation we also say that f(x) divides g(x) and that g(x) is a multiple of f(x).
This leads to the division algorithm for polynomials. Before proving the general result,
however, we shall examine two particular examples.
EXAMPLE 17.4
Early in algebra we were taught how to perform the long division of polynomials with real coefficients. Given two polynomials f(x), g(x) with degree $f(x) \le$ degree g(x), we organized our work in the form
$$\begin{array}{rl}
 & q_1(x) + q_2(x) + \cdots + q_t(x) \quad (= q(x)) \\
f(x)\ \big) & g(x) \\
 & f(x)q_1(x) \\
 & g(x) - f(x)q_1(x) \\
 & \quad \vdots \\
 & r(x)
\end{array}$$
where we continued to divide until we found either
r(x) = 0 or degree r(x) < degree f(x).
It then followed that g(x) = q(x)f(x) + r(x).
For example, if f(x) = x − 3 and $g(x) = 7x^3 - 2x^2 + 5x - 2$, then $f(x), g(x) \in \mathbf{Q}[x]$ (or $\mathbf{R}[x]$, or $\mathbf{C}[x]$), and we find
$$\begin{array}{rl}
 & 7x^2 + 19x + 62 \quad (= q(x)) \\
x - 3\ \big) & 7x^3 - 2x^2 + 5x - 2 \\
 & 7x^3 - 21x^2 \\
 & 19x^2 + 5x - 2 \\
 & 19x^2 - 57x \\
 & 62x - 2 \\
 & 62x - 186 \\
 & 184 \quad (= r(x))
\end{array}$$
Checking these results, we have
$$q(x)f(x) + r(x) = (7x^2 + 19x + 62)(x - 3) + 184 = 7x^3 - 2x^2 + 5x - 2 = g(x).$$
EXAMPLE 17.5
The technique illustrated in Example 17.4 also applies when the coefficients of our polynomials are taken from a finite field.
If $f(x) = 3x^2 + 4x + 2$ and $g(x) = 6x^4 + 4x^3 + 5x^2 + 3x + 1$ are polynomials in $\mathbf{Z}_7[x]$, then the process of long division provides the following calculations:
$$\begin{array}{rl}
 & 2x^2 + x + 6 \quad (= q(x)) \\
3x^2 + 4x + 2\ \big) & 6x^4 + 4x^3 + 5x^2 + 3x + 1 \\
 & 6x^4 + x^3 + 4x^2 \\
 & 3x^3 + x^2 + 3x + 1 \\
 & 3x^3 + 4x^2 + 2x \\
 & 4x^2 + x + 1 \\
 & 4x^2 + 3x + 5 \\
 & 5x + 3 \quad (= r(x))
\end{array}$$
Performing all arithmetic in $\mathbf{Z}_7$, we find (as in Example 17.4) that
$$q(x)f(x) + r(x) = (2x^2 + x + 6)(3x^2 + 4x + 2) + (5x + 3)$$
$$= 6x^4 + 4x^3 + 5x^2 + 3x + 1 = g(x).$$
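Long division over $\mathbf{Z}_p$ follows the same steps as over the rationals, except that dividing by the leading coefficient of f(x) becomes multiplying by its inverse modulo p. The Python sketch below is our own illustration and assumes p is prime; poly_divmod is a hypothetical name, coefficients are stored lowest power first, and the modular inverse is computed with pow(a, -1, p), which requires Python 3.8 or later.

```python
def poly_divmod(g, f, p):
    """Divide g by f in Z_p[x], p prime; return (q, r) with g = q*f + r.

    Polynomials are coefficient lists with index = power of x.
    """
    f = [c % p for c in f]
    while f and f[-1] == 0:
        f.pop()
    if not f:
        raise ZeroDivisionError("division by the zero polynomial")
    r = [c % p for c in g]
    q = [0] * max(len(r) - len(f) + 1, 1)
    lead_inv = pow(f[-1], -1, p)            # inverse of the leading coefficient of f
    for shift in range(len(r) - len(f), -1, -1):
        coef = (r[shift + len(f) - 1] * lead_inv) % p
        q[shift] = coef                     # next term of the quotient: coef * x^shift
        for k, c in enumerate(f):           # subtract coef * x^shift * f(x) from r
            r[shift + k] = (r[shift + k] - coef * c) % p
    while r and r[-1] == 0:
        r.pop()
    return q, r

g = [1, 3, 5, 4, 6]   # 6x^4 + 4x^3 + 5x^2 + 3x + 1
f = [2, 4, 3]         # 3x^2 + 4x + 2
print(poly_divmod(g, f, 7))
```

For the polynomials of Example 17.5 the returned pair should correspond to the quotient $2x^2 + x + 6$ and the remainder $5x + 3$; an empty remainder list stands for r(x) = 0.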
We turn now to the general situation.
THEOREM 17.3 Division Algorithm. Let $f(x), g(x) \in F[x]$ with f(x) not the zero polynomial. There exist unique polynomials $q(x), r(x) \in F[x]$ such that g(x) = q(x)f(x) + r(x), where r(x) = 0 or degree r(x) < degree f(x).
Proof: Let $S = \{g(x) - t(x)f(x) \mid t(x) \in F[x]\}$.
If $0 \in S$, then 0 = g(x) − t(x)f(x) for some $t(x) \in F[x]$. Then with q(x) = t(x) and r(x) = 0, we have g(x) = q(x)f(x) + r(x).
If $0 \notin S$, consider the degrees of the elements of S, and let r(x) = g(x) − q(x)f(x) be an element in S of minimum degree. Since $r(x) \ne 0$, the result follows if degree r(x) < degree f(x). If not, let
$$r(x) = a_nx^n + a_{n-1}x^{n-1} + \cdots + a_2x^2 + a_1x + a_0, \quad a_n \ne 0,$$
$$f(x) = b_mx^m + b_{m-1}x^{m-1} + \cdots + b_2x^2 + b_1x + b_0, \quad b_m \ne 0,$$
with $n \ge m$. Define
$$h(x) = r(x) - [a_nb_m^{-1}x^{n-m}]f(x) = (a_n - a_nb_m^{-1}b_m)x^n + (a_{n-1} - a_nb_m^{-1}b_{m-1})x^{n-1} + \cdots + (a_{n-m} - a_nb_m^{-1}b_0)x^{n-m} + a_{n-m-1}x^{n-m-1} + \cdots + a_1x + a_0.$$
Then h(x) has degree less than n, the degree of r(x). More important, $h(x) = [g(x) - q(x)f(x)] - [a_nb_m^{-1}x^{n-m}]f(x) = g(x) - [q(x) + a_nb_m^{-1}x^{n-m}]f(x)$, so $h(x) \in S$ and this contradicts the choice of r(x) as having minimum degree. Consequently, degree r(x) < degree f(x) and we have the existence part of the theorem.
For uniqueness, let $g(x) = q_1(x)f(x) + r_1(x) = q_2(x)f(x) + r_2(x)$ where $r_1(x) = 0$ or degree $r_1(x)$ < degree f(x), and $r_2(x) = 0$ or degree $r_2(x)$ < degree f(x). Then $[q_2(x) - q_1(x)]f(x) = r_1(x) - r_2(x)$, and if $q_2(x) - q_1(x) \ne 0$, then degree $([q_2(x) - q_1(x)]f(x)) \ge$ degree f(x), whereas $r_1(x) - r_2(x) = 0$ or degree $[r_1(x) - r_2(x)] \le \max\{\text{degree } r_1(x), \text{degree } r_2(x)\}$ < degree f(x). Consequently, $q_1(x) = q_2(x)$, and $r_1(x) = r_2(x)$.
The division algorithm provides the following results on roots and factors.
THEOREM 17.4 The Remainder Theorem. For $f(x) \in F[x]$ and $a \in F$, the remainder in the division of f(x) by x − a is f(a).
Proof: From the division algorithm, f(x) = q(x)(x − a) + r(x), with r(x) = 0 or degree r(x) < degree (x − a) = 1. Hence r(x) = r is an element of F. Substituting a for x, we find f(a) = q(a)(a − a) + r = 0 + r = r.
THEOREM 17.5 The Factor Theorem. If $f(x) \in F[x]$ and $a \in F$, then x − a is a factor of f(x) if and only if a is a root of f(x).
Proof: If x − a is a factor of f(x), then f(x) = q(x)(x − a). With f(a) = q(a)(a − a) = 0, it follows that a is a root of f(x). Conversely, suppose that a is a root of f(x). By the division algorithm, f(x) = q(x)(x − a) + r, where $r \in F$. Since f(a) = 0 we have r = 0, so f(x) = q(x)(x − a), and x − a is a factor of f(x).
EXAMPLE 17.6 a) Let $f(x) = x^7 - 6x^5 + 4x^4 - x^2 + 3x - 7 \in \mathbf{Q}[x]$. From the remainder theorem it follows that when $f(x)$ is divided by $x - 2$, the remainder is
$$f(2) = 2^7 - 6(2^5) + 4(2^4) - 2^2 + 3(2) - 7 = -5.$$
If we were to divide $f(x)$ by $x + 1$, then the remainder would be $f(-1) = -2$.
b) If $g(x) = x^5 + 3x^4 + x^3 + x^2 + 2x + 2 \in \mathbf{Z}_5[x]$ is divided by $x - 1$, then the remainder here is $g(1) = 1 + 3 + 1 + 1 + 2 + 2 = 0$ (in $\mathbf{Z}_5$). Consequently, $x - 1$ divides $g(x)$, and by the factor theorem,
$$g(x) = q(x)(x - 1) \quad (\text{where degree } q(x) = 4).$$
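The remainders in Example 17.6 can be checked by direct evaluation. The sketch below uses Horner's rule; the function name `evaluate` and the convention of listing coefficients from the highest power down are assumptions made for this illustration.

```python
def evaluate(coeffs, a, modulus=None):
    """Evaluate a polynomial at a by Horner's rule.

    coeffs lists the coefficients from the highest power down to the
    constant term. If modulus is given, arithmetic is done in Z_modulus.
    """
    value = 0
    for c in coeffs:
        value = value * a + c
        if modulus is not None:
            value %= modulus
    return value


# f(x) = x^7 - 6x^5 + 4x^4 - x^2 + 3x - 7 in Q[x]
f = [1, 0, -6, 4, 0, -1, 3, -7]
# g(x) = x^5 + 3x^4 + x^3 + x^2 + 2x + 2 in Z_5[x]
g = [1, 3, 1, 1, 2, 2]

remainder_f_at_2 = evaluate(f, 2)        # remainder on division by x - 2
remainder_f_at_minus_1 = evaluate(f, -1) # remainder on division by x + 1
remainder_g_at_1 = evaluate(g, 1, 5)     # remainder on division by x - 1 in Z_5[x]
```

By the remainder theorem, each computed value is the remainder for the corresponding linear divisor.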
Using the results of Theorems 17.4 and 17.5, we now establish the last major idea for
this section.
THEOREM 17.6 If $f(x) \in F[x]$ has degree $n \geq 1$, then $f(x)$ has at most $n$ roots in $F$.
Proof: The proof is by mathematical induction on the degree of $f(x)$. If $f(x)$ has degree 1, then $f(x) = ax + b$, for $a, b \in F$, $a \neq 0$. With $f(-a^{-1}b) = 0$, $f(x)$ has at least one root in $F$. If $c_1$ and $c_2$ are both roots, then $f(c_1) = ac_1 + b = 0 = ac_2 + b = f(c_2)$. By cancellation in a ring, $ac_1 + b = ac_2 + b \Rightarrow ac_1 = ac_2$. Since $F$ is a field and $a \neq 0$, we have $ac_1 = ac_2 \Rightarrow c_1 = c_2$, so $f(x)$ has only one root in $F$.
Now assume the result of the theorem is true for all polynomials of degree $k$ $(\geq 1)$ in $F[x]$. Consider a polynomial $f(x)$ of degree $k + 1$. If $f(x)$ has no roots in $F$, the theorem follows. Otherwise, let $r \in F$ with $f(r) = 0$. By the factor theorem, $f(x) = (x - r)g(x)$ where $g(x)$ has degree $k$. Consequently, by the induction hypothesis, $g(x)$ has at most $k$ roots in $F$, and $f(x)$, in turn, has at most $k + 1$ roots in $F$.
EXAMPLE 17.7 a) Let $f(x) = x^2 - 6x + 9 \in \mathbf{R}[x]$. Then $f(x)$ has at most two roots in $\mathbf{R}$, namely the roots 3, 3. So here we say that 3 is a root of multiplicity 2. In addition $f(x) = (x - 3)(x - 3)$, a factorization into two first-degree, or linear, factors.
b) For $g(x) = x^2 + 4 \in \mathbf{R}[x]$, $g(x)$ has no real roots, but Theorem 17.6 is not contradicted. (Why?) In $\mathbf{C}[x]$, $g(x)$ has the roots $2i$, $-2i$ and can be factored as $g(x) = (x - 2i)(x + 2i)$.
c) If $h(x) = x^2 + 2x + 6 \in \mathbf{Z}_7[x]$, then $h(2) = 0$, $h(3) = 0$ and these are the only roots. Also, $h(x) = (x - 2)(x - 3) = x^2 - 5x + 6 = x^2 + 2x + 6$, because $[-5] = [2]$ in $\mathbf{Z}_7$.
d) As we saw in Example 17.3(b), the polynomial $x^2 + 3x + 2$ has four roots. This is not a contradiction to Theorem 17.6 because $\mathbf{Z}_6$ is not a field. Also, $x^2 + 3x + 2 = (x + 1)(x + 2) = (x + 4)(x + 5)$, two distinct factorizations.
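Parts (c) and (d) of Example 17.7 can be confirmed by trying every element of the coefficient ring. A small Python sketch follows; the helper name `roots_mod` and the highest-power-first coefficient order are assumptions of this illustration.

```python
def roots_mod(coeffs, n):
    """Return every a in Z_n with f(a) = 0, for f given by coeffs.

    coeffs lists the coefficients from the highest power down to the
    constant term; evaluation uses Horner's rule modulo n.
    """
    roots = []
    for a in range(n):
        value = 0
        for c in coeffs:
            value = (value * a + c) % n
        if value == 0:
            roots.append(a)
    return roots


roots_in_Z7 = roots_mod([1, 2, 6], 7)   # x^2 + 2x + 6 over the field Z_7
roots_in_Z6 = roots_mod([1, 3, 2], 6)   # x^2 + 3x + 2 over the ring Z_6
```

Over the field $\mathbf{Z}_7$ the quadratic has two roots, as Theorem 17.6 allows; over $\mathbf{Z}_6$, which is not a field, the same search finds four.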
We close with one final remark, without proof, on the idea of factorization in $F[x]$. If $f(x) \in F[x]$ has degree $n$, and $r_1, r_2, \ldots, r_n$ are the roots of $f(x)$ in $F$ (where it is possible for a root to be repeated, that is, $r_i = r_j$ for some $1 \leq i < j \leq n$), then $f(x) = a_n(x - r_1)(x - r_2) \cdots (x - r_n)$, where $a_n$ is the leading coefficient of $f(x)$. This representation of $f(x)$ is unique up to the order of the first-degree factors.
a) f(x), g(x) € QLx], f(x) = x8 + 7x? — 4x4 4 3x3 +
EXERCISES 17.1 5x* — 4, g(x)
=x —3
1. Let f(x), g(x) € Z7[x] where f(x) = 2x4 + 2x9 + 3x74 b) f(x), g(x) € Zale], fr) = 01 + x90 $8 pO +
x+4 and g(x) =3x°+5x?+6x+4+1. Determine f(x) + le@)=x-1
g(x), f(x) — g(x), and f(x)g (x). ce) f(x), g(x) © Zu lx], f(x) = 3x° — 8x4 txt — x? +
2. Determine all of the polynomials of degree 2 in Z2[x]. 4x —7T, g(x) =x4+9
3. How many polynomials are there of degree 2 in Z,,[x]? 10. For each of the following polynomials f(x) € Z,[x], de-
How many have degree 3? degree 4? degree n, for n € N? termine all of the roots in Z7 and then write f(x) as a product
of first-degree polynomials.
4, a) Find two nonzero polynomials f(x), g(x) in Zy.[x]
where f(x)g(x) = 0. a) f(x) = x2 +5x?4+2x +6
b) Find polynomials A(x), k(x) € Z)2[x] such that degree b) f(x) =x’ —x
h(x) = 5, degree k(x) = 2, and degree h(x)k(x) = 3. 11. How many units are there in the ring Zs[x]? How many in
5. Complete the proofs of Theorem 17.1 and Corollary 17.1. Z;|x]? How many in Z,[x], p a prime?
6. For each of the following pairs f(x), g(x), find g(x), 12. Given a field F, let f(x) € F[x] where f(x) = a,x" +
r(x) so that g(x) = q(x) f(x) + r(x), where r(x) = 0 or de- yx") + +--+ aox? +.a,x + ay. Prove that x — 1 is a fac-
gree r(x) < degree f(x). tor of f(x) if and only if
a) f(x), g(x) € Qia], f(x) =x*—5x9 47x, g(x) = Gn + Qn) +--+ +42 +a) +a = 0.
x? — 2x24 5x —3 13. Let R, S be rings, and let g: R > S be a ring homomor-
b) f(x), 9) € Z[x], fe) =P +h g@)axttxe+ phism. Prove that the function G: R[x] — S[x] defined by
x +x 41
ce) f(x), g(x) € Zs[x], f(x) = x? +3x + Logix) = at + G (s r'] = 3 g(r, )x'
2x>+x4+4 1=0 i=0
is aring homomorphism.
7. a) If f(x) = x* — 16, find its roots and factorization in
QLx]. 14, If R is an integral domain, prove that if f(x) is a unit in
R[x], then f(x) is a constant and is a unit in R.
b) Answer part (a) for f(x) € R[x].
15. Verify that f(x) = 2x + lisaunitin Z,4[x]. Does this con-
c) Answer part (a) for f(x) € C[x].
tradict the result of Exercise 14?
d) Answer parts (a), (b), and (c) for f(x) = x* — 25.
16. Forn € Z*,n > 2, let f(x) € Z,,[x]. Prove that if a, b ¢Z
8. a) Find all roots of f(x) = x? + 4x if f(x) € Zp[x]. and a = b (mod), then f(a) = f(b) (mod n).
b) Find four distinct linear polynomials g(x), h(x), s(x), 17. If F is a field, let SC Fl[x] where f(x) =a,x"+
t(x) € Z;2[x] so that f(x) = g(QX)A(x) = s(x)t (x). Gy x"! 40+) fax? +ayx +a9 eS if and only if a,+
c) Do the results in part (b) contradict the statements made Gn-1 +°++ +4) +4, + a9 = 0. Prove that S is an ideal
of F [x].
in the paragraph following Example 17.7? 18. Let (R, +, +) be a ring. If / is an ideal of R, prove that
9. In each of the following, find the remainder when f(x) is i[x], the set of all polynomials in the indeterminate x with
divided by g(x). coefficients in J, is an ideal in R[x].
17.2
Irreducible Polynomials: Finite Fields
We now wish to construct finite fields other than those of the type $(\mathbf{Z}_p, +, \cdot)$, where $p$ is a prime. The construction will use the following special polynomials.
Definition 17.4 Let $f(x) \in F[x]$, with $F$ a field and degree $f(x) \geq 2$. We call $f(x)$ reducible (over $F$) if there exist $g(x), h(x) \in F[x]$, where $f(x) = g(x)h(x)$ and each of $g(x)$, $h(x)$ has degree $\geq 1$. If $f(x)$ is not reducible it is called irreducible, or prime.
Theorem 17.7 contains some useful observations about irreducible polynomials.
THEOREM 17.7 For polynomials in $F[x]$,
a) every nonzero polynomial of degree $\leq 1$ is irreducible.
b) if $f(x) \in F[x]$ with degree $f(x) = 2$ or 3, then $f(x)$ is reducible if and only if $f(x)$ has a root in the field $F$.
Proof: The proof is left for the reader.
EXAMPLE 17.8 a) The polynomial $x^2 + 1$ is irreducible in $\mathbf{Q}[x]$ and $\mathbf{R}[x]$, but in $\mathbf{C}[x]$ we find $x^2 + 1 = (x + i)(x - i)$.
b) Let $f(x) = x^4 + 2x^2 + 1 \in \mathbf{R}[x]$. Although $f(x)$ has no real roots, it is reducible because $(x^2 + 1)^2 = x^4 + 2x^2 + 1$. Hence part (b) of Theorem 17.7 is not applicable for polynomials of degree $> 3$.
c) In $\mathbf{Z}_2[x]$, $f(x) = x^3 + x^2 + x + 1$ is reducible because $f(1) = 0$. But $g(x) = x^3 + x + 1$ is irreducible because $g(0) = g(1) = 1$.
d) Let $h(x) = x^4 + x^3 + x^2 + x + 1 \in \mathbf{Z}_2[x]$. Is $h(x)$ reducible in $\mathbf{Z}_2[x]$? Since $h(0) = h(1) = 1$, $h(x)$ has no first-degree factors, but perhaps we can find $a, b, c, d \in \mathbf{Z}_2$ such that $(x^2 + ax + b)(x^2 + cx + d) = x^4 + x^3 + x^2 + x + 1$.
By expanding $(x^2 + ax + b)(x^2 + cx + d)$ and comparing coefficients of like powers of $x$, we find $a + c = 1$, $ac + b + d = 1$, $ad + bc = 1$, and $bd = 1$. With $bd = 1$, it follows that $b = 1$ and $d = 1$, so $ac + b + d = 1 \Rightarrow ac = 1 \Rightarrow a = c = 1 \Rightarrow a + c = 0$. This contradicts $a + c = 1$. Consequently, $h(x)$ is irreducible in $\mathbf{Z}_2[x]$.
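The coefficient comparison in part (d) amounts to ruling out every possible factor of degree 1 or 2. For small primes this can be done exhaustively. The sketch below is a brute-force check, assuming coefficients in $\mathbf{Z}_p$ stored with the constant term first; `poly_rem` and `is_irreducible` are hypothetical helper names, and the method is trial division, not anything prescribed by the text.

```python
from itertools import product


def poly_rem(g, f, p):
    """Remainder of g on division by f in Z_p[x] (constant term first)."""
    r = [c % p for c in g]
    inv_lead = pow(f[-1], -1, p)            # inverse of f's leading coefficient
    while len(r) >= len(f):
        c = (r[-1] * inv_lead) % p
        shift = len(r) - len(f)
        for i, b in enumerate(f):
            r[i + shift] = (r[i + shift] - c * b) % p
        while r and r[-1] == 0:             # drop the cancelled leading terms
            r.pop()
    return r


def is_irreducible(f, p):
    """Trial-divide f by every monic polynomial of degree 1..deg(f)//2."""
    degree = len(f) - 1
    for d in range(1, degree // 2 + 1):
        for lower in product(range(p), repeat=d):
            candidate = list(lower) + [1]   # monic of degree d
            if poly_rem(f, candidate, p) == []:
                return False
    return True


h_is_irreducible = is_irreducible([1, 1, 1, 1, 1], 2)   # x^4 + x^3 + x^2 + x + 1
f_is_irreducible = is_irreducible([1, 1, 1, 1], 2)      # x^3 + x^2 + x + 1
```

It suffices to test monic candidates up to half the degree, because any factorization into two factors of positive degree has one factor of degree at most half, and a factor can be scaled to be monic over a field.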
All of the polynomials in Example 17.8 share a common property, which we shall now
define.
Definition 17.5 A polynomial f(x) € F[x] is called monic if its leading coefficient is 1, the unity of F.
Some of our next results (up to and including the discussion in Example 17.11) awaken
memories of Chapters 4 and 14.
Definition 17.6 If $f(x), g(x) \in F[x]$, then $h(x) \in F[x]$ is a greatest common divisor of $f(x)$ and $g(x)$
a) if $h(x)$ divides each of $f(x)$ and $g(x)$, and
b) if $k(x) \in F[x]$ and $k(x)$ divides both $f(x)$, $g(x)$, then $k(x)$ divides $h(x)$.
We now state the following results on the existence and uniqueness of what we shall
call the greatest common divisor, which we shall abbreviate as gcd. Furthermore, there is a
method for finding this gcd that is called the Euclidean algorithm for polynomials. A proof
for the first result is outlined in the Section Exercises.
THEOREM 17.8 Let $f(x), g(x) \in F[x]$, with at least one of $f(x)$, $g(x)$ not the zero polynomial. Then each polynomial of minimum degree that can be written as a linear combination of $f(x)$ and $g(x)$ (that is, in the form $s(x)f(x) + t(x)g(x)$, for $s(x), t(x) \in F[x]$) will be a greatest common divisor of $f(x)$, $g(x)$. If we require a gcd to be monic, then it will be unique.
THEOREM 17.9 Euclidean Algorithm for Polynomials. Let $f(x), g(x) \in F[x]$ with degree $f(x) \leq$ degree $g(x)$ and $f(x) \neq 0$. Applying the division algorithm, we write
$$\begin{aligned}
g(x) &= q(x)f(x) + r(x), & \text{degree } r(x) &< \text{degree } f(x) \\
f(x) &= q_1(x)r(x) + r_1(x), & \text{degree } r_1(x) &< \text{degree } r(x) \\
r(x) &= q_2(x)r_1(x) + r_2(x), & \text{degree } r_2(x) &< \text{degree } r_1(x) \\
&\;\;\vdots \\
r_{k-2}(x) &= q_k(x)r_{k-1}(x) + r_k(x), & \text{degree } r_k(x) &< \text{degree } r_{k-1}(x) \\
r_{k-1}(x) &= q_{k+1}(x)r_k(x) + r_{k+1}(x), & r_{k+1}(x) &= 0.
\end{aligned}$$
Then $r_k(x)$, the last nonzero remainder, is a greatest common divisor of $f(x)$, $g(x)$, and is a constant multiple of the monic greatest common divisor of $f(x)$, $g(x)$. [Multiplying $r_k(x)$ by the inverse of its leading coefficient allows us to obtain the unique monic polynomial we call the greatest common divisor.]
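The chain of divisions in Theorem 17.9 translates directly into a loop. The following Python sketch assumes $F = \mathbf{Z}_p$ for a prime $p$, with polynomials stored as coefficient lists (constant term first); the names `poly_rem` and `poly_gcd` are illustrative and not from the text.

```python
def poly_rem(g, f, p):
    """Remainder of g on division by f in Z_p[x] (constant term first)."""
    r = [c % p for c in g]
    inv_lead = pow(f[-1], -1, p)
    while len(r) >= len(f):
        c = (r[-1] * inv_lead) % p
        shift = len(r) - len(f)
        for i, b in enumerate(f):
            r[i + shift] = (r[i + shift] - c * b) % p
        while r and r[-1] == 0:
            r.pop()
    return r


def poly_gcd(f, g, p):
    """Monic gcd of f and g in Z_p[x] via repeated division."""
    a = poly_rem(f, [1], p)[:] or [c % p for c in f]   # placeholder, replaced below
    a = [c % p for c in f]
    b = [c % p for c in g]
    while a and a[-1] == 0:
        a.pop()
    while b and b[-1] == 0:
        b.pop()
    while b:                       # stop when the remainder is zero
        a, b = b, poly_rem(a, b, p)
    if not a:
        raise ValueError("gcd of two zero polynomials is undefined")
    inv_lead = pow(a[-1], -1, p)   # scale the last nonzero remainder to be monic
    return [(c * inv_lead) % p for c in a]


# (x + 1)(x + 2) and (x + 1)(x + 3) in Z_5[x]
common = poly_gcd([2, 3, 1], [3, 4, 1], 5)
```

The final scaling step corresponds to the bracketed remark in the theorem: the last nonzero remainder is multiplied by the inverse of its leading coefficient.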
Definition 17.7 If $f(x), g(x) \in F[x]$ and their gcd is 1, then $f(x)$ and $g(x)$ are called relatively prime.
The last results we need to construct our new finite fields provide the analog of a construction we developed in Section 14.3.
THEOREM 17.10 Let $s(x) \in F[x]$, $s(x) \neq 0$. Define relation $\mathcal{R}$ on $F[x]$ by $f(x) \,\mathcal{R}\, g(x)$ if $f(x) - g(x) = t(x)s(x)$, for some $t(x) \in F[x]$ (that is, $s(x)$ divides $f(x) - g(x)$). Then $\mathcal{R}$ is an equivalence relation on $F[x]$.
Proof: The verification of the reflexive, symmetric, and transitive properties of $\mathcal{R}$ is left for the reader.
When the situation in Theorem 17.10 occurs, we say that $f(x)$ is congruent to $g(x)$ modulo $s(x)$ and write $f(x) \equiv g(x) \pmod{s(x)}$. The relation $\mathcal{R}$ is referred to as congruence modulo $s(x)$.
Let us examine the equivalence classes for one such relation.
EXAMPLE 17.9 Let $s(x) = x^2 + x + 1 \in \mathbf{Z}_2[x]$. Then
a) $[0] = [x^2 + x + 1] = \{0,\ x^2 + x + 1,\ x^3 + x^2 + x,\ (x + 1)(x^2 + x + 1),\ \ldots\} = \{t(x)(x^2 + x + 1) \mid t(x) \in \mathbf{Z}_2[x]\}$
b) $[1] = \{1,\ x^2 + x,\ x(x^2 + x + 1) + 1,\ (x + 1)(x^2 + x + 1) + 1,\ \ldots\} = \{t(x)(x^2 + x + 1) + 1 \mid t(x) \in \mathbf{Z}_2[x]\}$
c) $[x] = \{x,\ x^2 + 1,\ x(x^2 + x + 1) + x,\ (x + 1)(x^2 + x + 1) + x,\ \ldots\} = \{t(x)(x^2 + x + 1) + x \mid t(x) \in \mathbf{Z}_2[x]\}$
d) $[x + 1] = \{x + 1,\ x^2,\ x(x^2 + x + 1) + (x + 1),\ (x + 1)(x^2 + x + 1) + (x + 1),\ \ldots\} = \{t(x)(x^2 + x + 1) + (x + 1) \mid t(x) \in \mathbf{Z}_2[x]\}$
Are these all of the equivalence classes? If $f(x) \in \mathbf{Z}_2[x]$, then by the division algorithm $f(x) = q(x)s(x) + r(x)$, where $r(x) = 0$ or degree $r(x) <$ degree $s(x)$. Since $f(x) - r(x) = q(x)s(x)$, it follows that $f(x) \equiv r(x) \pmod{s(x)}$, so $f(x) \in [r(x)]$. Consequently, to determine all the equivalence classes, we consider the possibilities for $r(x)$. Here $r(x) = 0$ or degree $r(x) < 2$, so $r(x) = ax + b$, where $a, b \in \mathbf{Z}_2$. With only two choices for each of $a$, $b$, there are four possible choices for $r(x)$: $0$, $1$, $x$, $x + 1$.
We now place a ring structure on the equivalence classes of Example 17.9. Recalling how this was accomplished in Chapter 14 for $\mathbf{Z}_n$, we define addition by $[f(x)] + [g(x)] = [f(x) + g(x)]$. Since degree $(f(x) + g(x)) \leq \max\{\text{degree } f(x), \text{degree } g(x)\}$, we can find the equivalence class for $[f(x) + g(x)]$ without too much trouble. Here, for example, $[x] + [x + 1] = [x + (x + 1)] = [2x + 1] = [1]$ because $2 = 0$ in $\mathbf{Z}_2$.
In defining the multiplication of these equivalence classes, we run into a little more difficulty. For instance, what is $[x][x]$ in Example 17.9? If, in general, we define $[f(x)][g(x)] = [f(x)g(x)]$, it is possible that degree $f(x)g(x) \geq$ degree $s(x)$, so we may not readily find $[f(x)g(x)]$ in the list of equivalence classes. However, if degree $f(x)g(x) \geq$ degree $s(x)$, then using the division algorithm, we can write $f(x)g(x) = q(x)s(x) + r(x)$, where $r(x) = 0$ or degree $r(x) <$ degree $s(x)$. With $f(x)g(x) = q(x)s(x) + r(x)$, it follows that $f(x)g(x) \equiv r(x) \pmod{s(x)}$, and we define $[f(x)g(x)] = [r(x)]$, where $[r(x)]$ does occur in the list of equivalence classes.
From these observations we construct Tables 17.1 and 17.2 for the addition and multiplication, respectively, of $\{[0], [1], [x], [x + 1]\}$. (In these tables we write $a$ for $[a]$.)
Table 17.1
  +   |  0     1     x     x+1
 -----+------------------------
  0   |  0     1     x     x+1
  1   |  1     0     x+1   x
  x   |  x     x+1   0     1
  x+1 |  x+1   x     1     0

Table 17.2
  ·   |  0     1     x     x+1
 -----+------------------------
  0   |  0     0     0     0
  1   |  0     1     x     x+1
  x   |  0     x     x+1   1
  x+1 |  0     x+1   1     x
From the multiplication table (Table 17.2), we find that these equivalence classes form not only a ring but also a field, where $[1]^{-1} = [1]$, $[x]^{-1} = [x + 1]$, and $[x + 1]^{-1} = [x]$. This field of order 4 is denoted by $\mathbf{Z}_2[x]/(x^2 + x + 1)$, and we observe that it contains (an isomorphic copy of) the subfield $\mathbf{Z}_2$. [In general, a subring $(R, +, \cdot)$ of a field $(F, +, \cdot)$ is called a subfield when $(R, +, \cdot)$ is a field.] In addition, for the nonzero elements of this field we find that $[x]^1 = [x]$, $[x]^2 = [x + 1]$, $[x]^3 = [1]$, so we have a cyclic group of order 3. But the nonzero elements of any field form a group under multiplication, and any group of order 3 is cyclic, so why bother with this observation? In general, the nonzero elements of any finite field form a cyclic group under multiplication. (A proof for this can be found in Chapter 12 of reference [10].)
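The entries of Table 17.2 can be regenerated mechanically: multiply two representatives, then reduce modulo $s(x)$. Here is a minimal Python sketch under the assumption that each class is represented by its remainder, stored as a coefficient list with the constant term first (`[]` for $[0]$); the names `mul_mod`, `S`, and `ELEMENTS` are illustrative.

```python
P = 2
S = [1, 1, 1]                          # s(x) = x^2 + x + 1, constant term first
ELEMENTS = [[], [1], [0, 1], [1, 1]]   # [0], [1], [x], [x + 1]


def mul_mod(f, g, s=S, p=P):
    """Return the remainder representing [f][g] in Z_p[x]/(s(x))."""
    prod = [0] * (len(f) + len(g) - 1) if f and g else []
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            prod[i + j] = (prod[i + j] + a * b) % p
    inv_lead = pow(s[-1], -1, p)
    while len(prod) >= len(s):             # reduce until degree < degree s
        c = (prod[-1] * inv_lead) % p
        shift = len(prod) - len(s)
        for i, b in enumerate(s):
            prod[i + shift] = (prod[i + shift] - c * b) % p
        while prod and prod[-1] == 0:
            prod.pop()
    while prod and prod[-1] == 0:
        prod.pop()
    return prod


# Full multiplication table, keyed by pairs of representatives (as tuples).
table = {(tuple(f), tuple(g)): mul_mod(f, g) for f in ELEMENTS for g in ELEMENTS}
```

Looking up the key `((0, 1), (1, 1))` gives the product $[x][x + 1]$, and the nonzero rows each contain the representative `[1]`, which is the field property read off from the table.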
The preceding construction is summarized in the following theorem. An outline of the
proof is given in the Section Exercises.
THEOREM 17.11 Let $s(x)$ be a nonzero polynomial in $F[x]$.
a) The equivalence classes of $F[x]$ for the relation of congruence modulo $s(x)$ form a commutative ring with unity under the closed binary operations
$$[f(x)] + [g(x)] = [f(x) + g(x)], \qquad [f(x)][g(x)] = [f(x)g(x)] = [r(x)],$$
where $r(x)$ is the remainder obtained upon dividing $f(x)g(x)$ by $s(x)$. This ring is denoted by $F[x]/(s(x))$.
b) If $s(x)$ is irreducible in $F[x]$, then $F[x]/(s(x))$ is a field.
c) If $|F| = q$ and degree $s(x) = n$, then $F[x]/(s(x))$ contains $q^n$ elements.
Before we continue we wish to emphasize that for $s(x)$ irreducible in $F[x]$ the elements in the field $F[x]/(s(x))$ are not simply polynomials (in $x$). But how can this be, considering the presence of the symbol $x$ in each of the elements $[x]$ and $[x + 1]$ in the field $\mathbf{Z}_2[x]/(x^2 + x + 1)$ of Example 17.9? In order to make our point more apparent we consider an infinite example that is somewhat familiar to us.
EXAMPLE 17.10 Here we let $F = (\mathbf{R}, +, \cdot)$, the field of real numbers, and we consider the irreducible polynomial $s(x) = x^2 + 1$ in $\mathbf{R}[x]$. From part (b) of Theorem 17.11 we learn that $\mathbf{R}[x]/(s(x)) = \mathbf{R}[x]/(x^2 + 1)$ is a field.
For all $f(x) \in \mathbf{R}[x]$ it follows by the division algorithm that
$$f(x) = q(x)(x^2 + 1) + r(x), \quad \text{where } r(x) = 0 \text{ or } 0 \leq \deg r(x) \leq 1.$$
Therefore,
$$\mathbf{R}[x]/(x^2 + 1) = \{[a + bx] \mid a, b \in \mathbf{R}\},$$
where it can be shown that $[a + bx] = [a] + [bx] = [a] + [b][x]$.
Among the (infinitely many) elements of $\mathbf{R}[x]/(x^2 + 1)$ are the following:
1) $[1] = \{1 + t(x)(x^2 + 1) \mid t(x) \in \mathbf{R}[x]\}$, where we find the elements $x^2 + 2$ and $3x^3 + 3x + 1$ (from $\mathbf{R}[x]$);
2) $[r] = \{r + t(x)(x^2 + 1) \mid t(x) \in \mathbf{R}[x]\}$, where $r$ is any (fixed) real number;
3) $[-1] = \{-1 + t(x)(x^2 + 1) \mid t(x) \in \mathbf{R}[x]\}$, where we find the polynomial $-1 + (1)(x^2 + 1) = x^2$, so $[x][x] = [x^2] = [-1]$; and
4) $[\sqrt{2}x - 3] = \{(\sqrt{2}x - 3) + t(x)(x^2 + 1) \mid t(x) \in \mathbf{R}[x]\}$.
Now let us consider the field $(\mathbf{C}, +, \cdot)$ of complex numbers and the correspondence
$$h\colon \mathbf{R}[x]/(x^2 + 1) \to \mathbf{C},$$
where $h([a + bx]) = a + bi$.
For all $[a + bx], [c + dx] \in \mathbf{R}[x]/(x^2 + 1)$, we have $[a + bx] = [c + dx] \iff (a + bx) - (c + dx) = t(x)(x^2 + 1)$, for some $t(x) \in \mathbf{R}[x] \iff (a - c) + (b - d)x = t(x)(x^2 + 1)$. If $t(x)$ is not the zero polynomial, then we have $(a - c) + (b - d)x$, a polynomial of degree less than 2, equal to $t(x)(x^2 + 1)$, a polynomial of degree at least 2. Consequently, $t(x) = 0$, so $a + bx = c + dx$ and $a = c$, $b = d$. This guarantees that the correspondence given by $h$ is actually a function. In fact, $h$ is an isomorphism of fields. (See Exercise 24 in the exercises at the end of this section.) To establish that $h$ preserves the operation of multiplication, for example, we observe that
$$\begin{aligned}
h([a + bx][c + dx]) &= h([ac + adx + bcx + bdx^2]) \\
&= h([ac + (ad + bc)x] + [bd][x^2]) \\
&= h([ac + (ad + bc)x] + [bd][-1]) \\
&= h([(ac - bd) + (ad + bc)x]) \\
&= (ac - bd) + (ad + bc)i = (a + bi)(c + di) \\
&= h([a + bx])\,h([c + dx]).
\end{aligned}$$
Since $\mathbf{R}[x]/(x^2 + 1)$ is isomorphic to $\mathbf{C}$, the correspondence $h([x]) = i$ makes us think of $[x]$ as a number in $\mathbf{R}[x]/(x^2 + 1)$ and not as a polynomial in $x$ (in $\mathbf{R}[x]$). The number $[x]$ represents an equivalence class of polynomials in $\mathbf{R}[x]$, and this number $[x]$ behaves like the complex number $i$ in the field $(\mathbf{C}, +, \cdot)$. We should also note that for each real number $r$, $h([r]) = r$, and $\{[r] \mid r \in \mathbf{R}\}$ is a subfield of $\mathbf{R}[x]/(x^2 + 1)$, which is isomorphic to the subfield $\mathbf{R}$ of $\mathbf{C}$.
Finally, if we identify the field $\mathbf{R}[x]/(x^2 + 1)$ with the field $(\mathbf{C}, +, \cdot)$, we can summarize what has happened above as follows: We started with the irreducible polynomial $s(x) = x^2 + 1$ in $\mathbf{R}[x]$, which had no root in the field $(\mathbf{R}, +, \cdot)$. We then enlarged $(\mathbf{R}, +, \cdot)$ to $(\mathbf{C}, +, \cdot)$ and in $\mathbf{C}$ we found the root $i$ (and the root $-i$) for $s(x)$, which can now be factored as $(x + i)(x - i)$ in $\mathbf{C}[x]$.
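The claim that $h$ preserves multiplication can be spot-checked numerically. In the sketch below a class $[a + bx]$ is stored as the pair `(a, b)`; the function names `mul_classes` and `h` (mirroring the map in the example) and the pair representation are assumptions of this illustration, and integer test values are used so that the comparison with Python's built-in `complex` type is exact.

```python
def mul_classes(u, v):
    """Multiply [a + bx][c + dx] in R[x]/(x^2 + 1), with u = (a, b), v = (c, d)."""
    a, b = u
    c, d = v
    const, lin, quad = a * c, a * d + b * c, b * d
    # x^2 is congruent to -1 modulo x^2 + 1, so the x^2 term folds into the constant.
    return (const - quad, lin)


def h(u):
    """The correspondence [a + bx] -> a + bi."""
    return complex(u[0], u[1])


u, v = (2, 3), (4, -5)
left = h(mul_classes(u, v))    # h([2 + 3x][4 - 5x])
right = h(u) * h(v)            # (2 + 3i)(4 - 5i)
x_squared = mul_classes((0, 1), (0, 1))   # [x][x]
```

Here `left` and `right` agree, and `x_squared` is the pair for $[-1]$, matching $[x][x] = [-1]$ in the example.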
Since our major concern in the chapter is with finite fields, we now examine another
example of a finite field that arises by virtue of Theorem 17.11.
EXAMPLE 17.11 In $\mathbf{Z}_3[x]$ the polynomial $s(x) = x^2 + x + 2$ is irreducible because $s(0) = 2$, $s(1) = 1$, and $s(2) = 2$. Consequently, $\mathbf{Z}_3[x]/(s(x))$ is a field containing all equivalence classes of the form $[ax + b]$, where $a, b \in \mathbf{Z}_3$. These arise from the possible remainders when a polynomial $f(x) \in \mathbf{Z}_3[x]$ is divided by $s(x)$. The nine equivalence classes are $[0]$, $[1]$, $[2]$, $[x]$, $[x + 1]$, $[x + 2]$, $[2x]$, $[2x + 1]$, and $[2x + 2]$.
Instead of constructing a complete multiplication table, we examine four sample multiplications and then make two observations.
a) $[2x][x] = [2x^2] = [2x^2 + 0] = [2x^2 + (x^2 + x + 2)] = [3x^2 + x + 2] = [x + 2]$ because $3 = 0$ in $\mathbf{Z}_3$.
b) $[x + 1][x + 2] = [x^2 + 3x + 2] = [x^2 + 2] = [x^2 + 2 + 2(x^2 + x + 2)] = [2x]$.
c) $[2x + 2]^2 = [4x^2 + 8x + 4] = [x^2 + 2x + 1] = [(-x - 2) + 2x + 1]$ since $x^2 \equiv (-x - 2) \pmod{s(x)}$. Consequently, $[2x + 2]^2 = [x - 1] = [x + 2]$.
d) Often we write the equivalence classes without brackets and concentrate on the coefficients of the powers of $x$. For example, 11 is written for $[x + 1]$ and 21 represents $[2x + 1]$. Consequently, $(21) \cdot (12) = [2x + 1][x + 2] = [2x^2 + 5x + 2] = [2x^2 + 2x + 2] = [2(-x - 2) + 2x + 2] = [-4 + 2] = [-2] = [1]$, so $(21)^{-1} = (12)$.
e) We also observe that
$$\begin{array}{llll}
[x]^1 = [x] & [x]^3 = [2x + 2] & [x]^5 = [2x] & [x]^7 = [x + 1] \\
{[x]^2} = [2x + 1] & [x]^4 = [2] & [x]^6 = [x + 2] & [x]^8 = [1]
\end{array}$$
Therefore the nonzero elements of $\mathbf{Z}_3[x]/(s(x))$ form a cyclic group under multiplication.
f) Finally, when we consider the equivalence classes $[0]$, $[1]$, $[2]$, we realize that they provide us with a subfield of $\mathbf{Z}_3[x]/(s(x))$, a subfield we identify with the field $(\mathbf{Z}_3, +, \cdot)$.
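The list of powers in part (e) can be reproduced with a few lines of arithmetic. The Python sketch below stores the class $[ax + b]$ as the pair `(b, a)` and uses the reduction $x^2 \equiv -x - 2 \pmod{s(x)}$ from part (c); the names `mul` and `powers_of_x` and the pair encoding are assumptions of this illustration.

```python
P = 3


def mul(u, v):
    """Multiply two classes of Z_3[x]/(x^2 + x + 2).

    A class [a*x + b] is stored as the pair (b, a).
    """
    b1, a1 = u
    b2, a2 = v
    const = b1 * b2
    lin = a1 * b2 + a2 * b1
    quad = a1 * a2
    # x^2 = -x - 2 in this field: each x^2 contributes -2 to the constant
    # term and -1 to the coefficient of x.
    return ((const - 2 * quad) % P, (lin - quad) % P)


def powers_of_x():
    """Return [x]^1, [x]^2, ..., [x]^8 as (b, a) pairs."""
    x = (0, 1)
    current = (1, 0)          # the unity [1]
    result = []
    for _ in range(8):
        current = mul(current, x)
        result.append(current)
    return result


powers = powers_of_x()
```

The eight pairs in `powers` are distinct and end at `(1, 0)`, the class $[1]$, which is the computational form of the statement that $[x]$ generates the multiplicative group of nonzero elements.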
In Example 17.9 (and in the discussion that follows it) and in Example 17.11, we constructed finite fields of orders 4 $(= 2^2)$ and 9 $(= 3^2)$, respectively. Now we shall close this section as we investigate other possibilities for the order of a finite field. To accomplish this we need the following idea.
Definition 17.8 Let $(R, +, \cdot)$ be a ring. If there is a least positive integer $n$ such that $nr = z$ (the zero of $R$) for all $r \in R$, then we say that $R$ has characteristic $n$ and write $\text{char}(R) = n$. When no such integer exists, $R$ is said to have characteristic 0.
EXAMPLE 17.12 a) The ring $(\mathbf{Z}_3, +, \cdot)$ has characteristic 3; $(\mathbf{Z}_4, +, \cdot)$ has characteristic 4; in general, $(\mathbf{Z}_n, +, \cdot)$ has characteristic $n$.
b) The rings $(\mathbf{Z}, +, \cdot)$ and $(\mathbf{Q}, +, \cdot)$ both have characteristic 0.
c) A ring can be infinite and still have positive characteristic. For example, $\mathbf{Z}_3[x]$ is an infinite ring but it has characteristic 3.
d) The ring in Example 17.9 has characteristic 2. In Example 17.11 the characteristic of the ring is 3. Unlike the examples in part (a), the order of a finite ring can be different from its characteristic.
Examples 17.9 and 17.11, however, are more than just rings. They are fields with prime characteristic. Could this property be true for all finite fields?
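For a finite ring, Definition 17.8 can be applied literally: keep adding each element to itself until every running sum is the zero element at the same time. The sketch below does this for a ring given by its element list and addition function; the name `characteristic` and this interface are assumptions made for the illustration, and the loop is only guaranteed to stop when the ring is finite.

```python
def characteristic(elements, add, zero):
    """Least positive k with k*r == zero for every r in a finite ring."""
    k = 1
    multiples = list(elements)            # multiples[i] holds k * elements[i]
    while any(m != zero for m in multiples):
        multiples = [add(m, r) for m, r in zip(multiples, elements)]
        k += 1
    return k


# Z_4 under addition modulo 4
char_Z4 = characteristic(list(range(4)), lambda a, b: (a + b) % 4, 0)

# The field of order 9 from Example 17.11, with [a*x + b] stored as (b, a)
gf9 = [(b, a) for a in range(3) for b in range(3)]
char_gf9 = characteristic(
    gf9, lambda u, v: ((u[0] + v[0]) % 3, (u[1] + v[1]) % 3), (0, 0)
)
```

The second call illustrates part (d) of Example 17.12: a ring of order 9 whose characteristic is 3.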
THEOREM 17.12 Let $(F, +, \cdot)$ be a field. If $\text{char}(F) > 0$, then $\text{char}(F)$ must be prime.
Proof: In this proof we write the unity of $F$ as $u$ so that it is distinct from the positive integer 1. Let $\text{char}(F) = n > 0$. If $n$ is not prime, we write $n = mk$, where $m, k \in \mathbf{Z}^+$ and $1 < m < n$, $1 < k < n$. By the definition of characteristic, $nu = z$, the zero of $F$. Hence $(mk)u = z$. But
$$(mk)(u) = \underbrace{(u + u + \cdots + u)}_{mk \text{ summands}} = \underbrace{(u + u + \cdots + u)}_{m \text{ summands}}\,\underbrace{(u + u + \cdots + u)}_{k \text{ summands}} = (mu)(ku).$$
With $F$ a field, $(mu)(ku) = z \Rightarrow (mu) = z$ or $(ku) = z$. Assume without loss of generality that $ku = z$. Then for each $r \in F$, $kr = k(ur) = (ku)r = zr = z$, contradicting the choice of $n$ as the characteristic of $F$. Consequently, $\text{char}(F)$ is prime.
(The proof of Theorem 17.12 actually requires that $F$ only be an integral domain.)
If $F$ is a finite field and $m = |F|$, then $ma = z$ for all $a \in F$ because $(F, +)$ is an additive group of order $m$. (See Exercise 8 of Section 16.3.) Consequently, $F$ has positive characteristic and by Theorem 17.12 this characteristic is prime. This leads us to the following theorem.
THEOREM 17.13 A finite field $F$ has order $p^t$, where $p$ is a prime and $t \in \mathbf{Z}^+$.
Proof: Since $F$ is a finite field, let $\text{char}(F) = p$, a prime, and let $u$ denote the unity and $z$ the zero element. Then $S_0 = \{u, 2u, 3u, \ldots, pu = z\}$ is a set of $p$ distinct elements in $F$. If not, $mu = nu$ for $1 \leq m < n \leq p$ and $(n - m)u = z$, with $0 < n - m < p$. So for all $x \in F$ we now find that $(n - m)x = (n - m)(ux) = [(n - m)u]x = zx = z$, and this contradicts $\text{char}(F) = p$. If $F = S_0$, then $|F| = p^1$ and the result follows. If not, let $a \in F - S_0$. Then $S_1 = \{ma + nu \mid 0 < m, n \leq p\}$ is a subset of $F$ with $|S_1| \leq p^2$. If $|S_1| < p^2$, then $m_1 a + n_1 u = m_2 a + n_2 u$, with $0 < m_1, m_2, n_1, n_2 \leq p$ and at least one of $m_1 - m_2$, $n_2 - n_1 \neq 0$. Should $m_1 - m_2 = 0$, then $(m_1 - m_2)a = z = (n_2 - n_1)u$, with $0 < |n_2 - n_1| < p$. Consequently, for all $x \in F$, $|n_2 - n_1|x = |n_2 - n_1|(ux) = (|n_2 - n_1|u)x = zx = z$ with $0 < |n_2 - n_1| < p = \text{char}(F)$, another contradiction. If $n_1 - n_2 = 0$, then $(m_1 - m_2)a = z$ with $0 < |m_1 - m_2| < p$. Since $F$ is a field and $a \neq z$ we know that $a^{-1} \in F$, so $|m_1 - m_2|u = |m_1 - m_2|aa^{-1} = za^{-1} = z$ with $0 < |m_1 - m_2| < p$, yet another contradiction. Hence neither $m_1 - m_2$ nor $n_1 - n_2$ is 0. Therefore, $(m_1 - m_2)a = (n_2 - n_1)u \neq z$. Choose $k \in \mathbf{Z}^+$ such that $0 < k < p$ and $k(m_1 - m_2) \equiv 1 \pmod{p}$. Then $a = k(m_1 - m_2)a = k(n_2 - n_1)u$, and $a \in S_0$, one more contradiction. Hence $|S_1| = p^2$, and if $F = S_1$ the theorem is proved. If not, continue this process with an element $b \in F - S_1$. Then $S_2 = \{\ell b + ma + nu \mid 0 < \ell, m, n \leq p\}$ will have order $p^3$. (Prove this.) Since $F$ is finite, we reach a point where $F = S_{t-1}$ for some $t \in \mathbf{Z}^+$, and $|F| = |S_{t-1}| = p^t$.
As a result of this theorem there can be no finite fields with orders such as 6, 10, 12, 14, 15, .... In addition, for each prime $p$ and each $t \in \mathbf{Z}^+$, there is really only one field of order $p^t$. Any two finite fields of the same order are isomorphic. These fields were discovered by the French mathematician Évariste Galois (1811-1832) in his work on the nonexistence of formulas for solving general polynomial equations of degree $\geq 5$ over $\mathbf{Q}$. As a result, a finite field of order $p^t$ is denoted by $GF(p^t)$, where the letters GF stand for Galois field.
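Theorem 17.13 reduces the question "can a field of order $n$ exist?" to "is $n$ a prime power?". A short Python sketch of that test follows; the function name `prime_power` is hypothetical, and trial division is simply a convenient method for small $n$.

```python
def prime_power(n):
    """Return (p, t) with n == p**t, p prime and t >= 1, or None otherwise."""
    if n < 2:
        return None
    p = 2
    while p * p <= n and n % p != 0:   # find the smallest prime factor
        p += 1
    if n % p != 0:                     # no factor up to sqrt(n): n is prime
        p = n
    t = 0
    while n % p == 0:
        n //= p
        t += 1
    return (p, t) if n == 1 else None


# Orders up to 16 for which a finite field exists
possible_orders = [n for n in range(2, 17) if prime_power(n) is not None]
```

The orders 6, 10, 12, 14, and 15 named above are absent from `possible_orders`, while 4 and 9 (the orders built in Examples 17.9 and 17.11) are present.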
7. An outline for a proof of Theorem 17.8 follows.
EXERCISES 17.2
a) Let S = {s(x) f(x) + t(x)g(x)|s(x), (x) € F[x]}. Se-
1. Determine whether or not each of the following polynomi- lect an element m(x) of minimum degree in S. (Recall that
als is irreducible over the given fields. If it is reducible, provide the zero polynomial has no degree, so it is not selected.)
a factorization into irreducible factors.
Can we guarantee that m(x) is monic?
a) x7 +3x —loverQ,R,C b) Show that if A(x) € F[x] and h(x) divides both f(x)
and g(x), then A(x) divides m(x).
b) x4 —2 over Q,R,C
c) Show that m(x) divides f(x). If not, use the divi-
ce) x? +x+1 over Z3, Zs, Z7 sion algorithm and write f(x) = g(x)m(x) + r(x), where
d) x*+x«3+1 over Z> r(x) # 0 and degree r(x) < degree m(x). Then show that
e) x9 4+ 3x? —x 4+ 1 over Zs r(x) € S and obtain a contradiction.
2. Give an example of a polynomial f(x) € R[x] where f (x) d) Repeat the argument in part (c) to show that m(x) di-
has degree 6, is reducible, but has no real roots. vides g(x).
3. Determine all polynomials f(x) € Zs[x] such that 1 < 8. Prove Theorems 17.9 and 17.10.
degree f(x) <3 and f(x) is irreducible (over Zz).
9. Use the Euclidean algorithm for polynomials to find the gcd
4, Let f(x) = (2x? + 1)(5x3 — 5x + 3)(4x — 3) € Z [x]. of each pair of polynomials, over the designated field F. Then
Write f(x) as the product of a unit and three monic poly- write the gcd as s(x) f(x) + #(x)g(x), where s(x), (x) € Fx].
nomials.
a) f(x) =x? 4+x-2, 9(x) =x —xt 4x3 4x? -
5. How many monic polynomials in Z;[x] have degree 5? x — Lin Q[x]
6. Prove Theorem 17.7. b) f(x) =x44x34-1,
g(x) =x? 4+x4 1inZ[x]
c) f(x) = x4 42x? +.2x 4.2, (x) = 2x34 2x7 4+ 17. For p a prime, let s(x) be irreducible of degree n in Z,,[x].
x+1lin Z3[x] a) How many elements are there in the field Z,,[x]/(s(x))?
10. If F is any field, let f(x), g(x) € F[x]. If f(x), g(x) are b) How many elements in Z,[x]/(s(x)) generate the
relatively prime, prove that there is no element a € F with multiplicative group of nonzero elements of this field?
f(a) = 0 and g(a) = 0.
18. Give the characteristic for each of the following rings:
11. Let f(x), g(x) € R[x] with f(x) = x3 4+2x?+ ax —b,
a) Z), b) Zi, [x] ¢) QLx]
g(x) = x3 +x? — bx +a. Determine values for a, b so that
the gcd of f(x), g(x) is a polynomial of degree 2. d) Z[/5] = {a + bV/5|a, b € Z}, under the binary oper-
ations of ordinary addition and multiplication of real num-
12. For Example 17.9, determine which equivalence class bers.
contains each of the following:
19. In each of the following rings, the operations are compo-
a) xtt+x3+x41 nentwise addition and multiplication, as in Exercise 18 of
b) x? +x7+1 Section 14.2. Determine the characteristic in each case.
ce) xttxi tx? +1 a) Z> xX Z3 b) Z3 x Z4 c) Z4 x Ze
13. An outline for the proof of Theorem 17.11 follows. d) Z,,, X Z,,form,néZt,m,n>2
a) Prove that the operations defined in part (a) of The- e) Z; XZ
orem 17.11 are well-defined by showing that if f(x) = 20. For Theorem 17.13, prove that |S2| = p>.
fi(x) (mod s(x)) and g(x) = 2)(x) (mods(x)), then
21. Find the orders n for all fields GF(n), where 100<
f(x) + g(x) = fi) + gi(%) (mod s(x)) and f(x) g(x) = n < 150.
fi (x)gi (x) (mod s(x)).
22. Construct a finite field of 25 elements.
b) Verify the ring properties for the equivalence classes in
F[x]/(s(x)). 23. Construct a finite field of 27 elements.
c) Let f(x) € F[x], with f(x) # O and degree f(x) < de- 24, a) Prove that the function / in Example 17.10 is one-to-
gree s(x). If s(x) is irreducible in F [x], why does it follow one and onto and preserves the operation of addition.
that 1 is the gcd of f(x) and s(x)? b) Let (F, +, -) and(K, ®, ©) be two fields. Ifg: F > K
d) Use part (c) to prove that if s(x) is irreducible in F[x], is a ring isomorphism and a is a nonzero element of F (that
then F[x]/(s(x)) is a field. is, 2 is a unit in F), prove that g(a~') = [g(a)]~!. (Con-
e) If |F| = g and degree s(x) = n, determine the order of sequently, this function g establishes an isomorphism of
fields. In particular, the function 2 of Example 17.10 is
F[x]/(s(x)).
such a function.)
14. a) Show that s(x) = x? + 1 is reducible in Z>[x].
25. a) Let Q[/2] = {a + bvV/2\a, b € Q}. Prove that
b) Find the equivalence classes for the ring Z>[x]/(s(x)).
(Q[/2], +, +) is a subring of the field (R, +, +). (Here the
c) Is Z2[x]/(s(x)) an integral domain? binary operations in R and Q[V2] are those of ordinary
15. For the field in Example 17.11, find each of the following: addition and multiplication of real numbers.)
a) [x + 2][2x +2] + [x + 1] b) Prove that Q[V/2] is a field and that Q[x]/(x? — 2) is
isomorphic to Ql V2].
b) [2x + 1x +2]
26. Let p be a prime. (a) How many monic quadratic (degree
ce) (22)7! = [2x + 2]7'
2) polynomials x* + bx + c in Z,[x] can we factor into linear
16. Let s(x) = x4 42° 4 1 € Zl x]. factors in Z,[x]? (For example, if p = 5, then the polynomial
a) Prove that s(x) is irreducible. x? + 2x + 2in Zs[x] would be one of the quadratic polynomials
b) What is the order of the field Z2[x]/(s(x))? for which we should account, under these conditions.) (b) How
many quadratic polynomials ax* + bx + c in Z,[x] can we fac-
ce) Find [x2 + x + 1]7! in Zo[x]/(s(x)). (Hint: Find a, b,
tor into linear factors in Z,,[x]? (c) How many monic quadratic
c, d € Z, so that [x* +.x + 1][ax? + bx? +ex+d]
polynomials x? + bx +c in Z,[x] are irreducible over Z,?
= [1].) (d) How many quadratic polynomials ax? +bx +c in Z,[x]
d) Determine [x? + x + 1][x? + 1] in Z[x]/(s()). are irreducible over Z,?
17.3
Latin Squares
Our first application for this chapter deals with the structure called a Latin square. Such
configurations arise in the study of combinatorial designs and play a role in statistics — in
the design of experiments. We introduce the structure in the following example.
EXAMPLE 17.13 A petroleum corporation is interested in testing four types of gasoline additives to determine their effects on mileage. To do so, a research team designs an experiment wherein four
different automobiles, denoted A, B, C, and D, are run on a fixed track in a laboratory. Each
run uses the same prescribed amount of fuel with one of the additives present. To see how
each additive affects each type of auto, the team follows the schedule in Table 17.3, where
the additives are numbered 1, 2, 3, and 4. This schedule provides a way to test each additive
thoroughly in each type of auto. If one additive produces the best results in all four types,
the experiment will reveal its superior capability.
The same corporation is also interested in testing four other additives developed for
cleaning engines. A similar schedule for these tests is shown in Table 17.4, where these
engine-cleaning additives are also denoted as 1, 2, 3, and 4.
Table 17.3
               Day
 Auto | Mon   Tues   Wed   Thurs
 -----+--------------------------
  A   |  1     2      3     4
  B   |  2     1      4     3
  C   |  3     4      1     2
  D   |  4     3      2     1

Table 17.4
               Day
 Auto | Mon   Tues   Wed   Thurs
 -----+--------------------------
  A   |  1     2      3     4
  B   |  3     4      1     2
  C   |  4     3      2     1
  D   |  2     1      4     3
Furthermore, the research team is interested in the combined effect of both types of
additives. It requires 16 days to test the 16 possible pairs of additives (one for improved
mileage, the other for cleaning engines) in every automobile. If the results are needed in
four days, the research team must design the schedules so that every pair is tested once by
some auto. There are 16 ordered pairs in {1, 2, 3, 4} x {1, 2, 3, 4}, so this can be done in
the allotted time if the schedules in Tables 17.3 and 17.4 are superimposed to obtain the
schedule in Table 17.5. Here, for example, the entry (4, 3) indicates that on Tuesday, auto
C is used to test the combined effect of the fourth additive for improved mileage and the
third additive for maintaining a clean engine.
Table 17.5
                   Day
 Auto | Mon     Tues    Wed     Thurs
 -----+-------------------------------
  A   | (1,1)   (2,2)   (3,3)   (4,4)
  B   | (2,3)   (1,4)   (4,1)   (3,2)
  C   | (3,4)   (4,3)   (1,2)   (2,1)
  D   | (4,2)   (3,1)   (2,4)   (1,3)
What has happened here leads us to the following concepts.
Definition 17.9 An $n \times n$ Latin square is a square array of symbols, usually $1, 2, 3, \ldots, n$, where each
symbol appears exactly once in each row and each column of the array.
EXAMPLE 17.14
a) Tables 17.3 and 17.4 are examples of $4 \times 4$ Latin squares.
b) For all $n > 2$, we can obtain an $n \times n$ Latin square from the table of the group $(\mathbf{Z}_n, +)$
if we replace the occurrences of 0 by the value of $n$.
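The construction in part (b) can be sketched in a few lines of Python. This is only an illustration: the function name `latin_square_from_zn` is hypothetical, and rows and columns are indexed by the residues 0, 1, ..., n - 1 in that order, which the text does not prescribe.

```python
def latin_square_from_zn(n):
    """Addition table of (Z_n, +), with every 0 replaced by n."""
    # (i + j) % n is the group sum; "or n" swaps a resulting 0 for n.
    return [[((i + j) % n) or n for j in range(n)] for i in range(n)]


for row in latin_square_from_zn(4):
    print(row)
```

Each row is a cyclic shift of the one above it, which is why every symbol appears once in each row and once in each column.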
From the two Latin squares in Example 17.13 we were able to produce all of the ordered
pairs in S X S, for S = {1, 2, 3, 4}. We now question whether or not we can do this for
n X n Latin squares in general.
Definition 17.10 Let $L_1 = (a_{ij})$, $L_2 = (b_{ij})$ be two $n \times n$ Latin squares, where $1 \le i, j \le n$ and each
$a_{ij}, b_{ij} \in \{1, 2, 3, \ldots, n\}$. If the $n^2$ ordered pairs $(a_{ij}, b_{ij})$, $1 \le i, j \le n$, are distinct, then $L_1$,
$L_2$ are called a pair of orthogonal Latin squares.
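As a minimal sketch, Definitions 17.9 and 17.10 can be checked mechanically. The helper names `is_latin_square` and `are_orthogonal` are chosen here for illustration, and the two arrays are Tables 17.3 and 17.4 as read from the text (rows for autos A to D, columns for Monday to Thursday).

```python
from itertools import product


def is_latin_square(square):
    """True if every row and every column is a permutation of 1..n."""
    n = len(square)
    symbols = set(range(1, n + 1))
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(n)} == symbols for j in range(n))
    return rows_ok and cols_ok


def are_orthogonal(first, second):
    """True if superimposing the squares gives n^2 distinct ordered pairs."""
    n = len(first)
    pairs = {(first[i][j], second[i][j]) for i, j in product(range(n), repeat=2)}
    return len(pairs) == n * n


TABLE_17_3 = [[1, 2, 3, 4], [2, 1, 4, 3], [3, 4, 1, 2], [4, 3, 2, 1]]
TABLE_17_4 = [[1, 2, 3, 4], [3, 4, 1, 2], [4, 3, 2, 1], [2, 1, 4, 3]]

print(is_latin_square(TABLE_17_3), is_latin_square(TABLE_17_4))
print(are_orthogonal(TABLE_17_3, TABLE_17_4))
```

Superimposing a square on itself produces only the n pairs (s, s), so no Latin square with n > 1 is orthogonal to itself in this sense.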
EXAMPLE 17.15
a) There is no pair of $2 \times 2$ orthogonal Latin squares because the only possibilities are

   $L_1$:  1 2      and   $L_2$:  2 1
           2 1                    1 2

b) In the $3 \times 3$ case we find the orthogonal pair

   $L_1$:  1 2 3    and   $L_2$:  1 2 3
           2 3 1                  3 1 2
           3 1 2                  2 3 1

c) The two $4 \times 4$ Latin squares in Example 17.13 form an orthogonal pair. The $4 \times 4$
Latin square shown in Table 17.6 is orthogonal to each of the Latin squares in that
example.

Table 17.6
1 2 3 4
4 3 2 1
2 1 4 3
3 4 1 2
We could continue listing some larger Latin squares, but we've seen enough of them at
this point to ask the following questions:
1) Is there any $n > 2$ for which there is no pair of orthogonal $n \times n$ Latin squares? If
so, what is the smallest such $n$?
2) For $n > 1$, what can we say about the number of $n \times n$ Latin squares that can be
constructed so that each pair of them is orthogonal?
3) Is there a method to assist us in constructing a pair of orthogonal $n \times n$ Latin squares
for certain values of $n > 2$?
Before we can examine these questions, we need to standardize some of our results.
Definition 17.11 If $L$ is an $n \times n$ Latin square, then $L$ is said to be in standard form if its first row is
$1 \; 2 \; 3 \; \cdots \; n$.
173 Latin Squares 817
Except for the Latin square $L_2$ in Example 17.15(a), all the Latin squares we've seen in
this section are in standard form. If a Latin square is not in standard form, it can be put in
that form by interchanging some of the symbols.
EXAMPLE 17.16
The $5 \times 5$ Latin square shown in (a) is not in standard form. If, however, we replace each
occurrence of 4 with 1, each occurrence of 5 with 4, and each occurrence of 1 with 5, then
the result is the (standard) $5 \times 5$ Latin square shown in (b).

4 2 3 5 1        1 2 3 4 5
1 3 5 4 2        5 3 4 1 2
3 4 2 1 5        3 1 2 5 4
2 5 1 3 4        2 4 5 3 1
5 1 4 2 3        4 5 1 2 3
   (a)              (b)
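The relabeling in Example 17.16 amounts to reading a substitution off the first row. The sketch below assumes a hypothetical helper `standardize`; the second row of square (a) is taken as 1 3 5 4 2, which is what square (b) implies under the stated substitution.

```python
def standardize(square):
    """Relabel the symbols so that the first row reads 1, 2, ..., n."""
    # The symbol found in column c of the first row is renamed c (1-based).
    relabel = {old: new for new, old in enumerate(square[0], start=1)}
    return [[relabel[entry] for entry in row] for row in square]


SQUARE_A = [
    [4, 2, 3, 5, 1],
    [1, 3, 5, 4, 2],
    [3, 4, 2, 1, 5],
    [2, 5, 1, 3, 4],
    [5, 1, 4, 2, 3],
]

for row in standardize(SQUARE_A):
    print(row)
```

Because the same bijection on symbols is applied to every cell, rows and columns remain permutations, so the result is again a Latin square.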
It is often convenient to deal with Latin squares in standard form. But will this affect our
results on orthogonal pairs in any way?
THEOREM 17.14 Let $L_1$, $L_2$ be an orthogonal pair of $n \times n$ Latin squares. If $L_1$, $L_2$ are standardized as
$L_1^*$, $L_2^*$, then $L_1^*$, $L_2^*$ are orthogonal.
Proof: The proof of this result is left for the reader.
These ideas are needed for the main results of this section.
THEOREM 17.15 If $n \in \mathbf{Z}^+$, $n > 2$, then the largest possible number of $n \times n$ Latin squares that are
orthogonal in pairs is $n - 1$.
Proof: Let $L_1, L_2, \ldots, L_k$ be $k$ distinct $n \times n$ Latin squares that are in standard form and
orthogonal in pairs. We write $a_{ij}^{(m)}$ to denote the entry in the $i$th row and $j$th column of
$L_m$, where $1 \le i, j \le n$, $1 \le m \le k$. Since these Latin squares are in standard form, we
have $a_{11}^{(m)} = 1$, $a_{12}^{(m)} = 2$, ..., and $a_{1n}^{(m)} = n$ for all $1 \le m \le k$. Now consider $a_{21}^{(m)}$, for all
$1 \le m \le k$. These entries in the second row and first column are below $a_{11}^{(m)} = 1$. Thus
$a_{21}^{(m)} \ne 1$, for all $1 \le m \le k$, or the configuration is not a Latin square. Further, if there
exists $1 \le \ell < m \le k$ with $a_{21}^{(\ell)} = a_{21}^{(m)}$, then the pair $L_\ell$, $L_m$ cannot be an orthogonal pair.
(Why not?) Consequently, there are at best $n - 1$ choices for the $a_{21}$ entries in any of our
$n \times n$ Latin squares, and the result follows from this observation.
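One possible answer to the parenthetical question, offered as a sketch rather than as the author's intended argument: write $c$ for the common entry in row 2, column 1 of two of the squares, and use the fact that both squares are in standard form.

```latex
\text{Suppose } a_{21}^{(\ell)} = a_{21}^{(m)} = c \text{ for some } \ell \ne m .
\\[4pt]
\text{Superimposing } L_\ell \text{ and } L_m \text{ places the pair } (c, c)
\text{ in row 2, column 1.}
\\[4pt]
\text{Standard form gives } a_{1c}^{(\ell)} = a_{1c}^{(m)} = c ,
\text{ so } (c, c) \text{ also appears in row 1, column } c .
\\[4pt]
\text{The pair } (c, c) \text{ therefore occurs twice, so the } n^2
\text{ ordered pairs are not distinct and } L_\ell, L_m \text{ are not orthogonal.}
```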
This theorem places an upper bound on the number of n X n Latin squares that are
orthogonal in pairs. We shall find that for certain values of n, this upper bound can be
attained. In addition, our next theorem provides a method for constructing these Latin
squares, though initially not in standard form. The construction uses the structure of a finite
field. Before proving this theorem for the general situation, however, we shall examine one
special case.
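EXAMPLE 17.17 Let $F = \{f_i \mid 1 \le i \le 5\} = \mathbf{Z}_5$ with $f_1 = 1$, $f_2 = 2$, $f_3 = 3$, $f_4 = 4$, and $f_5 = 5$,
the zero of $\mathbf{Z}_5$.
For $1 \le k \le 4$, let $L_k$ be the $5 \times 5$ array $(a_{ij}^{(k)})$, where $1 \le i, j \le 5$ and

$$a_{ij}^{(k)} = f_k f_i + f_j.$$

When $k = 1$, we construct $L_1 = (a_{ij}^{(1)})$ as follows. Here $a_{ij}^{(1)} = f_1 f_i + f_j = f_i + f_j$, for
$1 \le i, j \le 5$. With $i = 1$, the first row of $L_1$ is calculated as follows:

$$a_{11}^{(1)} = f_1 + f_1 = 2 \qquad a_{12}^{(1)} = f_1 + f_2 = 3 \qquad a_{13}^{(1)} = f_1 + f_3 = 4$$
$$a_{14}^{(1)} = f_1 + f_4 = 5 \qquad a_{15}^{(1)} = f_1 + f_5 = 1$$

The entries in the second row of $L_1$ are computed when $i = 2$. Here we find

$$a_{21}^{(1)} = f_2 + f_1 = 3 \qquad a_{22}^{(1)} = f_2 + f_2 = 4 \qquad a_{23}^{(1)} = f_2 + f_3 = 5$$
$$a_{24}^{(1)} = f_2 + f_4 = 1 \qquad a_{25}^{(1)} = f_2 + f_5 = 2$$

Continuing these calculations, we obtain the Latin square $L_1$ as

2 3 4 5 1
3 4 5 1 2
4 5 1 2 3
5 1 2 3 4
1 2 3 4 5

For $k = 2$, the entries of $L_2$ are given by the formula $a_{ij}^{(2)} = f_2 f_i + f_j = 2 f_i + f_j$. To
obtain the first row of $L_2$, we set $i$ equal to 1 and compute

$$a_{11}^{(2)} = 2f_1 + f_1 = 3 \qquad a_{12}^{(2)} = 2f_1 + f_2 = 4 \qquad a_{13}^{(2)} = 2f_1 + f_3 = 5$$
$$a_{14}^{(2)} = 2f_1 + f_4 = 1 \qquad a_{15}^{(2)} = 2f_1 + f_5 = 2$$

When $i$ is set equal to 2, the entries in the second row of $L_2$ are calculated as follows:

$$a_{21}^{(2)} = 2f_2 + f_1 = 5 \qquad a_{22}^{(2)} = 2f_2 + f_2 = 1 \qquad a_{23}^{(2)} = 2f_2 + f_3 = 2$$
$$a_{24}^{(2)} = 2f_2 + f_4 = 3 \qquad a_{25}^{(2)} = 2f_2 + f_5 = 4$$

Similar calculations for $i = 3$, 4, and 5 result in the Latin square $L_2$ given by

3 4 5 1 2
5 1 2 3 4
2 3 4 5 1
4 5 1 2 3
1 2 3 4 5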
It is straightforward to check that the two Latin squares $L_1$ and $L_2$ are orthogonal. In
Exercise 5 (at the end of this section) the reader will be asked to calculate $L_3$ and $L_4$. Our
next result will verify that the four arrays $L_1$, $L_2$, $L_3$, and $L_4$ are Latin squares and that they
are orthogonal in pairs.
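The calculations of Example 17.17 can be reproduced with a short sketch. It assumes $f_i = i$ with arithmetic modulo a prime p, writes the zero element as p (as the example does with 5), and uses the hypothetical names `field_latin_squares` and `are_orthogonal`. It covers only prime orders; a prime power such as 4 would need genuine finite-field arithmetic rather than arithmetic modulo n.

```python
from itertools import combinations, product


def field_latin_squares(p):
    """Return [L_1, ..., L_{p-1}] over Z_p, where L_k has entry f_k*f_i + f_j."""
    squares = []
    for k in range(1, p):
        # Indices i, j run over 1..p; a result of 0 is written as p.
        square = [[((k * i + j) % p) or p for j in range(1, p + 1)]
                  for i in range(1, p + 1)]
        squares.append(square)
    return squares


def are_orthogonal(first, second):
    """True if superimposing the squares gives n^2 distinct ordered pairs."""
    n = len(first)
    pairs = {(first[i][j], second[i][j]) for i, j in product(range(n), repeat=2)}
    return len(pairs) == n * n


squares = field_latin_squares(5)
L1, L2 = squares[0], squares[1]
print(L1[0], L2[1])
print(all(are_orthogonal(a, b) for a, b in combinations(squares, 2)))
```

Changing the argument to 7 gives six squares of order 7 by the same formula.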
THEOREM 17.16 Let $n \in \mathbf{Z}^+$, $n > 2$. If $p$ is a prime and $n = p^t$, for $t \in \mathbf{Z}^+$, then there are $n - 1$ Latin squares
that are $n \times n$ and orthogonal in pairs.
Proof: Let $F = GF(p^t)$, the Galois field of order $p^t = n$. Consider $F = \{f_1, f_2, \ldots, f_n\}$,
where $f_1$ is the unity and $f_n$ is the zero element.
We construct $n - 1$ Latin squares as follows.
For each $1 \le k \le n - 1$, let $L_k$ be the $n \times n$ array $(a_{ij}^{(k)})$, $1 \le i, j \le n$, where
$a_{ij}^{(k)} = f_k f_i + f_j$.
First we show that each $L_k$ is a Latin square. If not, there are two identical elements of
$F$ in the same row or column of $L_k$. Suppose that a repetition occurs in a column, that is,
$a_{rj}^{(k)} = a_{sj}^{(k)}$ for $1 \le r, s \le n$. Then $a_{rj}^{(k)} = f_k f_r + f_j = f_k f_s + f_j = a_{sj}^{(k)}$. This implies that
$f_k f_r = f_k f_s$, by the cancellation for addition in $F$. Since $k \ne n$, it follows that $f_k \ne f_n$, the
zero of $F$. Consequently, $f_k$ is invertible, so $f_r = f_s$ and $r = s$. A similar argument shows
that there are no repetitions in any row of $L_k$.
At this point we have $n - 1$ Latin squares, $L_1, L_2, \ldots, L_{n-1}$. Now we shall prove that
they are orthogonal in pairs. If not, let $1 \le k < m \le n - 1$ with

$$a_{ij}^{(k)} = a_{rs}^{(k)}, \qquad a_{ij}^{(m)} = a_{rs}^{(m)}, \qquad 1 \le i, j, r, s \le n, \quad \text{and} \quad (i, j) \ne (r, s).$$

(Then the same ordered pair occurs twice when we superimpose $L_k$ and $L_m$.) But

$$a_{ij}^{(k)} = a_{rs}^{(k)} \iff f_k f_i + f_j = f_k f_r + f_s, \quad \text{and}$$
$$a_{ij}^{(m)} = a_{rs}^{(m)} \iff f_m f_i + f_j = f_m f_r + f_s.$$

Subtracting these equations, we find that $(f_k - f_m) f_i = (f_k - f_m) f_r$. With $k \ne m$,
$(f_k - f_m)$ is not the zero of $F$, so it is invertible and we have $f_i = f_r$. Putting this back into
either of the prior equations, we find that $f_j = f_s$. Consequently, $i = r$ and $j = s$. Therefore
for $k \ne m$, the Latin squares $L_k$ and $L_m$ form an orthogonal pair.
The first value of n that is not a power of a prime is 6. The existence of a pair of 6 X 6
orthogonal Latin squares was first investigated by Leonhard Euler (1707-1783) when he
sought a solution to the “problem of the 36 officers.” This problem deals with six different
regiments wherein six officers, each with a different rank, are selected from each regiment.
(There are only six possible ranks.) The objective is to arrange the 36 officers in a 6 X 6
array so that in each row or column of the array, every rank and every regiment is represented
exactly once. Hence each officer in the square array corresponds to an ordered pair $(i, j)$
where $1 \le i, j \le 6$, with $i$ for his regiment and $j$ for his rank. In 1782 Euler conjectured that
the problem could not be solved, that is, that there is no pair of $6 \times 6$ orthogonal Latin squares.
He went further and conjectured that for all $n \in \mathbf{Z}^+$, if $n \equiv 2 \pmod 4$, then there is no pair
of $n \times n$ orthogonal Latin squares. In 1900 G. Tarry verified Euler's conjecture for $n = 6$
by a systematic enumeration of all possible $6 \times 6$ Latin squares. However, it was not until
1960, through the combined efforts of R. C. Bose, S. S. Shrikhande, and E. T. Parker, that
the remainder of Euler's conjecture was proved false. They showed that if $n \in \mathbf{Z}^+$ with
$n \equiv 2 \pmod 4$ and $n > 6$, then there exists a pair of $n \times n$ orthogonal Latin squares.
For more on this result and Latin squares in general, the reader should consult the chapter
references.
EXERCISES 17.3
1. a) Rewrite the following $4 \times 4$ Latin square in standard form.

   1 3 4 2
   3 1 2 4
   2 4 3 1
   4 2 1 3

   b) Find a $4 \times 4$ Latin square in standard form that is orthogonal to the result in part (a).
   c) Apply the reverse of the process in part (a) to the result in part (b). Show that your
   answer is orthogonal to the given $4 \times 4$ Latin square.
2. Prove Theorem 17.14.
3. Complete the proof of the first part of Theorem 17.16.
4. The three $4 \times 4$ Latin squares in Tables 17.3, 17.4, and 17.6 are orthogonal in pairs. Can
you find another $4 \times 4$ Latin square that is orthogonal to each of these three?
5. Complete the calculations in Example 17.17 in order to obtain the two $5 \times 5$ Latin squares
$L_3$ and $L_4$. Rewrite each Latin square $L_i$, for $1 \le i \le 4$, in standard form.
6. Find three $7 \times 7$ Latin squares that are orthogonal in pairs. Rewrite these results in
standard form.
7. Extend the experiment in Example 17.13 so that the research team needs three $4 \times 4$ Latin
squares that are orthogonal in pairs.
8. A Latin square $L$ is called self-orthogonal if $L$ and its transpose $L^{T}$ form an orthogonal pair.
   a) Show that there is no $3 \times 3$ self-orthogonal Latin square.
   b) Give an example of a $4 \times 4$ Latin square that is self-orthogonal.
   c) If $L = (a_{ij})$ is an $n \times n$ self-orthogonal Latin square, prove that the elements $a_{ii}$,
   for $1 \le i \le n$, must all be distinct.
17.4
Finite Geometries and Affine Planes
In the Euclidean geometry of the real plane, we find that (a) two distinct points determine a
unique line and (b) if $\ell$ is a line in the plane, and $P$ a point not on $\ell$, then there is a unique line
$\ell'$ that contains $P$ and is parallel to $\ell$. During the eighteenth and nineteenth centuries,
non-Euclidean geometries were developed when alternatives to condition (b) were investigated.
Yet all of these geometries contained infinitely many points and lines. The notion of a finite
geometry did not appear until the end of the nineteenth century in the work of Gino Fano
(Giornale di Matematiche, 1892).
How can we construct such a geometry? To do so, we return to the more familiar
Euclidean geometry. In order to describe points and lines in this plane algebraically, we
introduced a set of coordinate axes and identified each point $P$ by an ordered pair $(c, d)$ of real
numbers. This description set up a one-to-one correspondence between the points in the
plane and the set $\mathbf{R} \times \mathbf{R}$. By using the idea of slope, we could uniquely represent each line
in this plane by either (1) $x = a$, where the slope is infinite, or (2) $y = mx + b$, where $m$ is
the slope; $a$, $m$, and $b$ are real numbers. We also found that two distinct lines are parallel if
and only if they have the same slope. When their slopes are distinct, the lines intersect in a
unique point.
Instead of using real numbers a, b, c, d, m for the point (c, d) and the lines x =a,
y = mx + b, we now turn to a comparable finite structure, the finite field. Our objective is
to construct what is called a (finite) affine plane.
Definition 17.12 Let $\mathcal{P}$ be a finite set of points, and let $\mathcal{L}$ be a set of subsets of $\mathcal{P}$, called lines. A (finite)
affine plane on the sets $\mathcal{P}$ and $\mathcal{L}$ is a finite structure satisfying the following conditions.
A1) Two distinct points of $\mathcal{P}$ are (simultaneously) in only one element of $\mathcal{L}$; that is, they
are on only one line.
A2) For each $\ell \in \mathcal{L}$, and each $P \in \mathcal{P}$ with $P \notin \ell$, there exists a unique element $\ell' \in \mathcal{L}$
where $P \in \ell'$ and $\ell$, $\ell'$ have no point in common.
A3) There are four points in $\mathcal{P}$, no three of which are collinear (that is, no three of these
four points are in any one of the subsets $\ell \in \mathcal{L}$).
The reason for condition (A3) is to avoid uninteresting situations like the one shown in
Fig. 17.1. If only conditions (A1) and (A2) were considered, then this system would be an
affine plane.
Figure 17.1
We return now to our construction. Let $F = GF(n)$, where $n = p^t$ for some prime $p$ and
$t \in \mathbf{Z}^+$. In constructing our affine plane, denoted by $AP(F)$, we let $\mathcal{P} = \{(c, d) \mid c, d \in F\}$.
Thus we have $n^2$ points.
How many lines should we have for the set $\mathcal{L}$?
The lines fall into two categories. For a line of infinite slope the equation is $x = a$, where
$a \in F$. Thus we have $n$ such “vertical lines.” The other lines are given algebraically by
$y = mx + b$, where $m, b \in F$. With $n$ choices for each of $m$ and $b$, it follows that there are
$n^2$ lines that are not “vertical.” Hence $|\mathcal{L}| = n^2 + n$.
Before we verify that $AP(F)$, with $\mathcal{P}$ and $\mathcal{L}$ as constructed, is an affine plane, we make
two other observations.
First, for each line $\ell \in \mathcal{L}$, if $\ell$ is given by $x = a$, then there are $n$ choices for $y$ on
$\ell = \{(a, y) \mid y \in F\}$. Thus $\ell$ contains exactly $n$ points. If $\ell$ is given by $y = mx + b$, for
$m, b \in F$, then for each choice of $x$ we have $y$ uniquely determined, and again $\ell$ consists
of $n$ points.
Now consider any point $(c, d) \in \mathcal{P}$. This point is on the line $x = c$. Furthermore, on each
line $y = mx + b$ of finite slope $m$, $d - mc$ uniquely determines $b$. With $n$ choices for $m$, we
see that the point $(c, d)$ is on the $n$ lines of the form $y = mx + (d - mc)$. Overall, $(c, d)$ is
on $n + 1$ lines.
Thus far in our construction of $AP(F)$ we have a set $\mathcal{P}$ of points and a set $\mathcal{L}$ of lines
where (a) $|\mathcal{P}| = n^2$; (b) $|\mathcal{L}| = n^2 + n$; (c) each $\ell \in \mathcal{L}$ contains $n$ points; and (d) each point
in $\mathcal{P}$ is on exactly $n + 1$ lines. We shall now prove that $AP(F)$ satisfies the three conditions
to be an affine plane.
A1) Let $(c, d), (e, f) \in \mathcal{P}$. Using the two-point formula for the equation of a line, we
have

$$(e - c)(y - d) = (f - d)(x - c) \qquad (1)$$

as a line on which we find both $(c, d)$ and $(e, f)$. Each of these points is on $n + 1$
lines. Could there be a second line containing both of them?
The point $(c, d)$ is on the line $x = c$. If $(e, f)$ is also on this line, then $e = c$,
but $f \ne d$ because the points are distinct. With $e = c$, Eq. (1) reduces to
$0 = (f - d)(x - c)$, or $x = c$ because $f - d \ne 0$, and so we do not have a second line.
With $c \ne e$, if $(c, d)$, $(e, f)$ are on a second line of the form $y = mx + b$,
then $d = mc + b$, $f = me + b$, and $(f - d) = m(e - c)$. Our coefficients are taken
from a field and $e \ne c$, so $m = (f - d)(e - c)^{-1}$ and $b = d - mc = d - (f - d)(e - c)^{-1}c$.
Consequently, this second line containing $(c, d)$ and $(e, f)$ is

$$y = (f - d)(e - c)^{-1}x + [d - (f - d)(e - c)^{-1}c]$$

or, because multiplication in $F$ is commutative, $(e - c)(y - d) = (f - d)(x - c)$,
which is Eq. (1). Thus two points from $\mathcal{P}$ are on only one line, and condition (A1)
is satisfied.
A2) To verify this condition, consider the point $P$ and the line $\ell$ as shown in Fig. 17.2.
Since there are $n$ points on any line, let $P_1, P_2, \ldots, P_n$ be the points of $\ell$. (These
are the only points on $\ell$, although the figure might suggest others.) The point $P$ is
not on $\ell$, so $P$ and $P_i$ determine a unique line $\ell_i$, for each $1 \le i \le n$. We showed
earlier that each point is on $n + 1$ lines, so now there is one additional line $\ell'$ with
$P$ on $\ell'$ and with $\ell'$ not intersecting $\ell$.
Figure 17.2
A3) The last condition uses the field $F$. Since $|F| \ge 2$, there is the unity 1 and the
zero element 0 in $F$. Considering the points $(0, 0)$, $(1, 0)$, $(0, 1)$, $(1, 1)$, if line $\ell$
contains any three of these points, then two of the points have the form $(c, c)$, $(c, d)$.
Consequently the equation for $\ell$ is given by $x = c$, which is not satisfied by either
$(d, c)$ or $(d, d)$. Hence no three of these points are collinear.
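The algebra used for condition (A1) translates directly into a small routine. The sketch below is restricted to the field $\mathbf{Z}_p$ for a prime p; the function name `line_through` and its return format are assumptions made for illustration. It relies on Python 3.8 or later, where `pow(a, -1, p)` returns the inverse of a modulo p.

```python
def line_through(point1, point2, p):
    """Line of AP(Z_p) through two distinct points, p prime.

    Returns ("vertical", c) for x = c, or ("sloped", m, b) for y = mx + b.
    """
    (c, d), (e, f) = point1, point2
    if (e - c) % p == 0:
        return ("vertical", c % p)
    # m = (f - d)(e - c)^{-1}, with the inverse taken in Z_p.
    m = (f - d) * pow((e - c) % p, -1, p) % p
    b = (d - m * c) % p  # b = d - mc
    return ("sloped", m, b)


print(line_through((1, 2), (3, 6), 7))
print(line_through((2, 1), (2, 5), 7))
```

The first call gives slope 2 and intercept 0, and indeed 6 = 2 * 3 in $\mathbf{Z}_7$; the second pair shares an x-coordinate, so the line is vertical.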
We have now shown the following.
THEOREM 17.17 If $F$ is a finite field, then the system based on the set $\mathcal{P}$ of points and the set $\mathcal{L}$ of lines, as
described above, is an affine plane denoted by $AP(F)$.
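For a prime p, the counts established above and condition (A1) can be confirmed by brute force. This is a sketch under the assumption $F = \mathbf{Z}_p$ (prime powers would need a full finite-field implementation), and the function name `affine_plane` is hypothetical.

```python
from itertools import combinations, product


def affine_plane(p):
    """Points and lines of AP(Z_p), p prime; each line is a frozenset of points."""
    points = list(product(range(p), repeat=2))
    # Vertical lines x = a.
    vertical = [frozenset((a, y) for y in range(p)) for a in range(p)]
    # Lines y = mx + b of finite slope m.
    sloped = [frozenset((x, (m * x + b) % p) for x in range(p))
              for m in range(p) for b in range(p)]
    return points, vertical + sloped


points, lines = affine_plane(5)
lines_through = {pt: sum(pt in line for line in lines) for pt in points}
# (A1): every pair of distinct points lies on exactly one common line.
a1_holds = all(sum(p1 in line and p2 in line for line in lines) == 1
               for p1, p2 in combinations(points, 2))
print(len(points), len(lines), set(lines_through.values()), a1_holds)
```

With p = 5 this reports 25 points, 30 lines, and 6 lines through every point, matching $n^2$, $n^2 + n$, and $n + 1$.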
Some particular examples will indicate a connection between these finite geometries, or
affine planes, and the Latin squares of the previous section.
EXAMPLE 17.18
For $F = (\mathbf{Z}_2, +, \cdot)$, we have $n = |F| = 2$. The affine plane in Fig. 17.3 has $n^2 = 4$ points
and $n^2 + n = 6$ lines. For example, the line $\ell_4 = \{(1, 0), (1, 1)\}$, and $\ell_4$ contains no other
points that the figure might suggest. Furthermore, $\ell_5$ and $\ell_6$ are parallel lines in this finite
geometry because they do not intersect.
Figure 17.3
EXAMPLE 17.19
Let $F = GF(2^2)$, the field of Example 17.9. Recall the notation of Example 17.11(d) and
write $F = \{00, 01, 10, 11\}$, with addition and multiplication given by Table 17.7. We use
this field to construct a finite geometry with $n^2 = 16$ points and $n^2 + n = 20$ lines. The 20
lines can be partitioned into five parallel classes of four lines each.

Table 17.7
 +  | 00  01  10  11          ·  | 00  01  10  11
 00 | 00  01  10  11          00 | 00  00  00  00
 01 | 01  00  11  10          01 | 00  01  10  11
 10 | 10  11  00  01          10 | 00  10  11  01
 11 | 11  10  01  00          11 | 00  11  01  10

Class 1: Here we have the lines of infinite slope. These four “vertical” lines are given
by the equations $x = 00$, $x = 01$, $x = 10$, and $x = 11$.
Class 2: For the “horizontal” class, or class of slope 0, we have the four lines $y = 00$,
$y = 01$, $y = 10$, and $y = 11$.
Class 3: The lines with slope 01 are those whose equations are $y = 01x + 00$,
$y = 01x + 01$, $y = 01x + 10$, and $y = 01x + 11$.
Class 4: This class consists of the lines with equations $y = 10x + 00$, $y = 10x + 01$,
$y = 10x + 10$, and $y = 10x + 11$.
Class 5: The last class contains the four lines given by $y = 11x + 00$, $y = 11x + 01$,
$y = 11x + 10$, and $y = 11x + 11$.
Since each line in $AP(F)$ contains four points and each parallel class contains four lines,
we shall see now how three of these parallel classes partition the 16 points of $AP(F)$.
Figure 17.4
For the class with $m = 01$, there are four lines: (1) $y = 01x + 00$; (2) $y = 01x + 01$;
(3) $y = 01x + 10$; and (4) $y = 01x + 11$. Above each point in $AP(F)$ we write the number
corresponding to the line it is on. (See Fig. 17.4.) This configuration can be given by the
following Latin square:

4 3 2 1
3 4 1 2
2 1 4 3
1 2 3 4

If we repeat this process for classes 4 and 5, we get the partitions shown in Figs. 17.5
and 17.6, respectively. In each class the lines are listed, for the given slope, in the same
order as for Fig. 17.4. Within each figure is the corresponding Latin square.
These figures give us three $4 \times 4$ Latin squares that are orthogonal in pairs.
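The labeling procedure of this example can be sketched in code. The encoding is an assumption made for illustration: the field elements 00, 01, 10, 11 are stored as the integers 0 to 3, addition is bitwise XOR (which agrees with the addition table in Table 17.7), and `MUL` copies the multiplication table. Rows of each square run from y = 11 at the top down to y = 00, the orientation used in the figures.

```python
from itertools import combinations, product

# Multiplication in GF(2^2) with 00, 01, 10, 11 encoded as 0, 1, 2, 3.
MUL = [
    [0, 0, 0, 0],
    [0, 1, 2, 3],
    [0, 2, 3, 1],
    [0, 3, 1, 2],
]


def class_square(m):
    """Latin square for slope m: point (x, y) gets the label of its line y = mx + b."""
    # In characteristic 2, b = y - mx = y + mx; intercepts 00, 01, 10, 11 get labels 1..4.
    return [[(y ^ MUL[m][x]) + 1 for x in range(4)] for y in (3, 2, 1, 0)]


def are_orthogonal(first, second):
    """True if superimposing the squares gives n^2 distinct ordered pairs."""
    n = len(first)
    pairs = {(first[i][j], second[i][j]) for i, j in product(range(n), repeat=2)}
    return len(pairs) == n * n


squares = {m: class_square(m) for m in (1, 2, 3)}
for m, square in squares.items():
    print(m, square)
print(all(are_orthogonal(squares[a], squares[b]) for a, b in combinations(squares, 2)))
```

The square for slope 01 (key 1) reproduces the Latin square displayed above for Fig. 17.4.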
Figure 17.5 (class 4, slope 10)        Figure 17.6 (class 5, slope 11)
4 2 1 3                                4 1 3 2
3 1 2 4                                3 2 4 1
2 4 3 1                                2 3 1 4
1 3 4 2                                1 4 2 3
The results of this example are no accident, as demonstrated by the following theorem.
THEOREM 17.18 Let $F = GF(n)$, where $n > 3$ and $n = p^t$, $p$ a prime, $t \in \mathbf{Z}^+$. The Latin squares that arise
from $AP(F)$ for the $n - 1$ parallel classes, where the slope is neither 0 nor infinite, are
orthogonal in pairs.
Proof: A proof of this result is outlined in the Section Exercises.
EXERCISES 17.4
1. Complete the following table dealing with affine planes.
Field   Number of Points   Number of Lines   Number of Points on a Line   Number of Lines on a Point
25
GF(3^2)
56
17
31
2. How many parallel classes do each of the affine planes in Exercise 1 determine? How many
lines are in each class?
3. Construct the affine plane $AP(\mathbf{Z}_3)$. Determine its parallel classes and the corresponding
Latin squares for the classes of finite nonzero slope.
4. Repeat Exercise 3 with $\mathbf{Z}_5$ taking the place of $\mathbf{Z}_3$.
5. Determine each of the following lines.
   a) The line in $AP(\mathbf{Z}_7)$ that is parallel to $y = 4x + 2$ and contains $(3, 6)$.
   b) The line in $AP(\mathbf{Z}_{11})$ that is parallel to $2x + 3y + 4 = 0$ and contains $(10, 7)$.
   c) The line in $AP(F)$, where $F = GF(2^2)$, that is parallel to $10y = 11x + 01$ and
   contains $(11, 01)$. (See Table 17.7.)
6. Suppose we try to construct an affine plane $AP(\mathbf{Z}_6)$ as we did in this section.
   a) Determine which of the conditions (A1), (A2), and (A3) fail in this situation.
   b) Find how many lines contain a given point $P$ and how many points are on a given
   line $\ell$, for this “geometry.”
7. The following provides an outline for a proof of Theorem 17.18.
   a) Consider a parallel class of lines given by $y = mx + b$, where $m \in F$, $m \ne 0$. Show
   that each line in this class intersects each “vertical” line and each “horizontal” line in
   exactly one point of $AP(F)$. Thus the configuration obtained by labeling the points of
   $AP(F)$, as in Figs. 17.4, 17.5, and 17.6, is a Latin square.
   b) To show that the Latin squares corresponding to two different classes, other than the
   classes of slope 0 or infinite slope, are orthogonal, assume that an ordered pair $(i, j)$
   appears more than once when one square is superimposed upon the other. How does
   this lead to a contradiction?
17.5
Block Designs and Projective Planes
In this final section, we examine a type of combinatorial design and see how it is related to
the structure of a finite geometry. The following example will illustrate this design.
EXAMPLE 17.20 Dick (d) and his wife Mary (m) go to New York City with their five children — Richard (r),
Peter (p), Christopher (c), Brian (b), and Julie (j). While staying in the city they receive
three passes each day, for a week, to visit the Empire State Building. Can we make up a
schedule for this family so that everyone gets to visit this attraction the same number of
times?
The following schedule is one possibility.
1) b,c,d 2) b, j,r 3) b, m, p 4) c,j,m
5) c, p,r 6) d,j, p 7) d,m,r
Here the result was obtained by trial and error. For a problem of this size such a technique
is feasible. However, in general, a more effective strategy is needed. Furthermore, in asking
for a certain schedule, we may be asking for something that doesn’t exist. In this problem,
for example, each pair of family members is together on only one visit. If the family had
received four passes each day, we would not be able to construct a schedule that maintained
this property.
The situation in this example generalizes as follows.
Definition 17.13 Let $V$ be a set with $v$ elements. A collection $\{B_1, B_2, \ldots, B_b\}$ of subsets of $V$ is called a
balanced incomplete block design, or $(v, b, r, k, \lambda)$-design, if the following conditions are
satisfied:
a) For each $1 \le i \le b$, the subset $B_i$ contains $k$ elements, where $k$ is a fixed constant and
$k < v$.
b) Each element $x \in V$ is in $r$ ($\le b$) of the subsets $B_i$, $1 \le i \le b$.
c) Every pair $x$, $y$ of elements of $V$ appears together in $\lambda$ ($\le b$) of the subsets $B_i$,
$1 \le i \le b$.
The elements of $V$ are often called varieties because of the early applications in the design
of experiments that dealt with tests on fertilizers and plants. The $b$ subsets $B_1, B_2, \ldots, B_b$
of $V$ are called blocks, where each block contains $k$ varieties. The number $r$ is referred to as
the replication number of the design. Finally, $\lambda$ is termed the covalency for the design. This
parameter makes the design balanced in the following sense. For general block designs we
have a number $\lambda_{xy}$ for each pair $x, y \in V$; if $\lambda_{xy}$ is the same for all pairs of elements from
$V$, then $\lambda$ represents this common measure and the design is called balanced. In this text
we only deal with balanced designs.
EXAMPLE 17.21 a) The schedule in Example 17.20 is an example of a $(7, 7, 3, 3, 1)$-design.
b) For $V = \{1, 2, 3, 4, 5, 6\}$, the ten blocks

   1 2 4    1 3 4    1 5 6    2 3 6    3 4 6
   1 2 6    1 3 5    2 3 5    2 4 5    4 5 6

constitute a $(6, 10, 5, 3, 2)$-design.
c) If $F$ is a finite field, with $|F| = n$, then the affine plane $AP(F)$ yields an
$(n^2, n^2 + n, n + 1, n, 1)$-design. Here the varieties are the $n^2$ points in $AP(F)$; the
$n^2 + n$ lines are the blocks of the design.
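The parameters claimed in parts (a) and (b) can be recomputed from the blocks themselves. The sketch below uses a hypothetical helper `design_parameters`; blocks are written as strings of one-character variety names, and the tenth block of part (b) is taken to be 4 5 6, the only choice consistent with r = 5 and $\lambda = 2$. The helper checks the three counting conditions of Definition 17.13 but not the requirement k < v.

```python
from itertools import combinations


def design_parameters(varieties, blocks):
    """Return (v, b, r, k, lam) if the blocks form a balanced design, else None."""
    sizes = {len(block) for block in blocks}
    replications = {sum(x in block for block in blocks) for x in varieties}
    covalencies = {sum(x in block and y in block for block in blocks)
                   for x, y in combinations(varieties, 2)}
    # Balanced means one block size, one replication number, one covalency.
    if len(sizes) != 1 or len(replications) != 1 or len(covalencies) != 1:
        return None
    return (len(varieties), len(blocks),
            replications.pop(), sizes.pop(), covalencies.pop())


schedule = ["bcd", "bjr", "bmp", "cjm", "cpr", "djp", "dmr"]
ten_blocks = ["124", "134", "156", "236", "346",
              "126", "135", "235", "245", "456"]

print(design_parameters("bcdjmpr", schedule))
print(design_parameters("123456", ten_blocks))
```

Dropping any block from either list makes the function return `None`, since the replication numbers are then no longer all equal.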
At this point there are five parameters determining our design. We now examine how
these parameters are related.
THEOREM 17.19 For a $(v, b, r, k, \lambda)$-design, (1) $vr = bk$ and (2) $\lambda(v - 1) = r(k - 1)$.
Proof:
1) With $b$ blocks in the design and $k$ elements per block, listing all the elements of the
blocks, we get $bk$ symbols. This collection of symbols consists of the elements of $V$
with each element appearing $r$ times, for a total of $vr$ symbols. Hence $vr = bk$.
2) For this property we introduce the pairwise incidence matrix $A$ for the design. With
$|V| = v$, let $t = \binom{v}{2}$, the number of pairs of elements in $V$. We construct the $t \times b$
matrix $A = (a_{ij})$ by defining $a_{ij} = 1$ if the $i$th pair of elements from $V$ is in the $j$th
block of the design; if not, $a_{ij} = 0$.

$$
\begin{array}{c|cccc}
 & B_1 & B_2 & \cdots & B_b \\ \hline
x_1 x_2 & a_{11} & a_{12} & \cdots & a_{1b} \\
x_1 x_3 & a_{21} & a_{22} & \cdots & a_{2b} \\
\vdots & \vdots & \vdots & & \vdots \\
x_1 x_v & a_{v-1,1} & a_{v-1,2} & \cdots & a_{v-1,b} \\
x_2 x_3 & a_{v1} & a_{v2} & \cdots & a_{vb} \\
\vdots & \vdots & \vdots & & \vdots \\
x_{v-1} x_v & a_{t1} & a_{t2} & \cdots & a_{tb}
\end{array}
$$

We now count the number of 1's in matrix $A$ in two ways.
a) Consider the rows. Since each pair $x_i$, $x_j$, for $1 \le i < j \le v$, appears in $\lambda$ blocks, it
follows that each row contains $\lambda$ 1's. With $t$ rows in the matrix, the number of 1's is
then $\lambda t = \lambda v(v - 1)/2$.
b) Now consider the columns. As each block contains $k$ elements, this determines $\binom{k}{2} =
k(k - 1)/2$ pairs, and this is the number of 1's in each column of matrix $A$. With $b$
columns, the total number of 1's is $bk(k - 1)/2$.
Then, $\lambda v(v - 1)/2 = bk(k - 1)/2 = vr(k - 1)/2$, so $\lambda(v - 1) = r(k - 1)$.
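As a worked check of both identities, take the parameters of the design that arises from the affine plane $AP(F)$ with $|F| = n$, namely $v = n^2$, $b = n^2 + n$, $r = n + 1$, $k = n$, and $\lambda = 1$. The computation below is routine algebra added for illustration.

```latex
vr = n^2 (n + 1) = (n^2 + n)\, n = bk
\\[6pt]
\lambda (v - 1) = n^2 - 1 = (n + 1)(n - 1) = r (k - 1)
```

For $n = 2$ this reads $4 \cdot 3 = 6 \cdot 2$ and $1 \cdot 3 = 3 \cdot 1$.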
As we mentioned earlier, when $n$ is a power of a prime, an $(n^2, n^2 + n, n + 1, n, 1)$-design
can be obtained from the affine plane $AP(F)$, where $F = GF(n)$. Here the points
are the varieties and the lines are the blocks. We shall now introduce a construction that
enlarges $AP(F)$ to what is called a finite projective plane. From this projective plane we can
construct an $(n^2 + n + 1, n^2 + n + 1, n + 1, n + 1, 1)$-design. First let us see how these
two kinds of planes compare.
Definition 17.14 If $\mathcal{P}'$ is a finite set of points and $\mathcal{L}'$ a set of lines, each of which is a nonempty subset of
$\mathcal{P}'$, then the (finite) plane based on $\mathcal{P}'$ and $\mathcal{L}'$ is called a projective plane if the following
conditions are satisfied.
P1) Two distinct points of $\mathcal{P}'$ are on only one line.
P2) Any two lines from $\mathcal{L}'$ intersect in a unique point.
P3) There are four points in $\mathcal{P}'$, no three of which are collinear.
The difference between the affine and projective planes lies in the condition dealing with
the existence of parallel lines. Here the parallel lines of the affine plane based on $\mathcal{P}$ and $\mathcal{L}$
will intersect when the given system is enlarged to the projective plane based on $\mathcal{P}'$ and $\mathcal{L}'$.
The construction proceeds as follows.
EXAMPLE 17.22
Start with an affine plane $AP(F)$ where $F = GF(n)$. For each point $(x, y) \in \mathcal{P}$, rewrite
the point as $(x, y, 1)$. We then think of the points as ordered triples $(x, y, z)$ where $z = 1$.
Rewrite the equations of the lines $x = c$ and $y = mx + b$ in $AP(F)$ as $x = cz$ and
$y = mx + bz$, where $z = 1$. We still have our original affine plane $AP(F)$, but with a change of
notation.
Add the set of points $\{(1, 0, 0)\} \cup \{(x, 1, 0) \mid x \in F\}$ to $\mathcal{P}$ to get the set $\mathcal{P}'$. Then
$|\mathcal{P}'| = n^2 + n + 1$. Let $\ell_\infty$ be the subset of $\mathcal{P}'$ consisting of these new points. This new line can be
given by the equation $z = 0$, with the stipulation that we never have $x = y = z = 0$. Hence
$(0, 0, 0) \notin \mathcal{P}'$.
Now let us examine these ideas for the affine plane $AP(\mathbf{Z}_2)$. Here $\mathcal{P} = \{(0, 0), (1, 0),
(0, 1), (1, 1)\}$, so

$$\mathcal{P}' = \{(0, 0, 1), (1, 0, 1), (0, 1, 1), (1, 1, 1)\} \cup \{(1, 0, 0), (0, 1, 0), (1, 1, 0)\}.$$

The six lines in $\mathcal{L}$ were originally

x = 0: {(0, 0), (0, 1)}     y = 0: {(0, 0), (1, 0)}     y = x: {(0, 0), (1, 1)}
x = 1: {(1, 0), (1, 1)}     y = 1: {(0, 1), (1, 1)}     y = x + 1: {(0, 1), (1, 0)}

We rewrite these as

x = 0     y = 0     y = x     x = z     y = z     y = x + z

and add a new line $\ell_\infty$ defined by $z = 0$. These constitute the set $\mathcal{L}'$ of lines for our projective
plane. And now at this point we consider $z$ as a variable. Consequently, the line $x = z$
consists of the points $(0, 1, 0)$, $(1, 0, 1)$, and $(1, 1, 1)$. In fact, each line of $\mathcal{L}$ that contained
two points will now contain three points when considered in $\mathcal{L}'$. The set $\mathcal{L}'$ consists of the
following seven lines.

x = 0: {(0, 0, 1), (0, 1, 0), (0, 1, 1)}     y = z: {(1, 0, 0), (0, 1, 1), (1, 1, 1)}
y = 0: {(0, 0, 1), (1, 0, 0), (1, 0, 1)}     y = x: {(0, 0, 1), (1, 1, 0), (1, 1, 1)}
x = z: {(0, 1, 0), (1, 0, 1), (1, 1, 1)}     y = x + z: {(0, 1, 1), (1, 1, 0), (1, 0, 1)}
z = 0 ($\ell_\infty$): {(1, 0, 0), (0, 1, 0), (1, 1, 0)}

In the original affine plane the lines $x = 0$ and $x = 1$ were parallel because no point
in this plane satisfied both equations simultaneously. Here in this new system $x = 0$ and
$x = z$ intersect in the point $(0, 1, 0)$, so they are no longer parallel in the sense of $AP(\mathbf{Z}_2)$.
Likewise, $y = x$ and $y = x + 1$ were parallel in $AP(\mathbf{Z}_2)$, whereas here the lines $y = x$
and $y = x + z$ intersect at $(1, 1, 0)$. We depict this projective plane based on $\mathcal{P}'$ and $\mathcal{L}'$ as
shown in Fig. 17.7. Here the “circle” through $(1, 0, 1)$, $(1, 1, 0)$, and $(0, 1, 1)$ is the line
$y = x + z$. Note that every line intersects $\ell_\infty$, which is often called the line at infinity. This
line consists of the three points at infinity. We define two lines to be parallel in the projective
plane when they intersect in a point at infinity (or on $\ell_\infty$).
Figure 17.7
This projective plane provides us with a $(7, 7, 3, 3, 1)$-design like the one we developed
by trial and error in Example 17.20.
We generalize the results of Example 17.22 as follows: Let $n$ be a power of a prime. The
affine plane $AP(F)$, for $F = GF(n)$, provides an example of an $(n^2, n^2 + n, n + 1, n, 1)$-design.
In $AP(F)$ the $n^2 + n$ lines fall into $n + 1$ parallel classes. For each parallel class
we add a point at infinity to $AP(F)$. The point $(0, 1, 0)$ is added for the class of lines
$x = cz$, $c \in F$; the point $(1, 0, 0)$ for the class of lines $y = bz$, $b \in F$. When $m \in F$ and
$m \ne 0$, then we add the point $(m^{-1}, 1, 0)$ for the class of lines $y = mx + bz$, $b \in F$. The
line at infinity, $\ell_\infty$, is then defined as the set of $n + 1$ points at infinity. In this way we
obtain the projective plane over $GF(n)$, which has $n^2 + n + 1$ points and $n^2 + n + 1$ lines.
Here each point is on $n + 1$ lines, and each line contains $n + 1$ points. Furthermore, any
two points in this plane are on only one line. Consequently, we have an example of an
$(n^2 + n + 1, n^2 + n + 1, n + 1, n + 1, 1)$-design.
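The enlargement described above can be carried out mechanically when $F = \mathbf{Z}_p$ for a prime p. The following sketch makes that assumption (prime powers would need full finite-field arithmetic), and the function name `projective_plane` is hypothetical. The inverse $m^{-1}$ is computed as $m^{p-2}$ modulo p, which is valid by Fermat's little theorem.

```python
from itertools import combinations


def projective_plane(p):
    """Points and lines of the projective plane over Z_p (p prime), built from AP(Z_p)."""
    def inverse(m):
        return pow(m, p - 2, p)  # m^(p-2) = m^(-1) in Z_p for m != 0

    lines = []
    # Class x = cz: affine points (c, y, 1) plus the point at infinity (0, 1, 0).
    for c in range(p):
        lines.append(frozenset({(c, y, 1) for y in range(p)} | {(0, 1, 0)}))
    # Classes y = mx + bz: slope 0 gets (1, 0, 0); slope m != 0 gets (m^-1, 1, 0).
    for m in range(p):
        at_infinity = (1, 0, 0) if m == 0 else (inverse(m), 1, 0)
        for b in range(p):
            affine_part = {(x, (m * x + b) % p, 1) for x in range(p)}
            lines.append(frozenset(affine_part | {at_infinity}))
    # The line at infinity, z = 0.
    lines.append(frozenset({(1, 0, 0)} | {(x, 1, 0) for x in range(p)}))
    points = sorted(set().union(*lines))
    return points, lines


points, lines = projective_plane(3)
one_line_per_pair = all(sum(a in line and b in line for line in lines) == 1
                        for a, b in combinations(points, 2))
one_point_per_pair = all(len(l1 & l2) == 1 for l1, l2 in combinations(lines, 2))
print(len(points), len(lines), one_line_per_pair, one_point_per_pair)
```

For p = 3 this yields 13 points and 13 lines with 4 points on each line, the parameters of a (13, 13, 4, 4, 1)-design; for p = 2 it reproduces the seven lines listed in Example 17.22.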
EXERCISES 17.5
1. Let V = {1, 2, ..., 9}. Determine the values of v, b, r, k, and $\lambda$ for the design given by
the following blocks.
126 147 234 279 378 468
135 189 258 369 459 567
2. Find an example of a $(4, 4, 3, 3, \lambda)$-design.
3. Find an example of a $(7, 7, 4, 4, \lambda)$-design.
4. Complete the following table so that the parameters v, b, r, k, $\lambda$ in any row may be
possible for a balanced incomplete block design.
v b r k $\lambda$
4 3 2
9 12 3
10 9 2
13 4 4
30 10 3
5. Is it possible to have a $(v, b, r, k, \lambda)$-design where
(a) b = 28, r = 4, k = 3? (b) v = 17, r = 8, k = 5?
6. Given a $(v, b, r, k, \lambda)$-design with b = v, prove that if v is even, then $\lambda$ is even.
7. A $(v, b, r, k, \lambda)$-design is called a triple system if k = 3. When k = 3 and $\lambda = 1$, we
call the design a Steiner triple system.
a) Prove that in every triple system, $\lambda(v - 1)$ is even and $\lambda v(v - 1)$ is divisible by 6.
b) Prove that in every Steiner triple system, v is congruent to 1 or 3 modulo 6.
8. Verify that the following blocks constitute a Steiner triple system on nine varieties.
128 147 234 279 389 468
135 169 256 367 459 578
9. In a Steiner triple system with b = 12, find the values of v and r.
10. In each of the following, $\mathcal{P}'$ is a set of points and $\mathcal{L}'$ a set of lines, each of which
is a nonempty subset of $\mathcal{P}'$. Which of the conditions (P1), (P2), and (P3) of Definition 17.14
hold for the given $\mathcal{P}'$ and $\mathcal{L}'$?
a) $\mathcal{P}' = \{a, b, c\}$
$\mathcal{L}' = \{\{a, b\}, \{a, c\}, \{b, c\}\}$
b) $\mathcal{P}' = \{(x, y, z) \mid x, y, z \in \mathbf{R}\} = \mathbf{R}^3$
$\mathcal{L}'$ is the set of all lines in $\mathbf{R}^3$.
c) $\mathcal{P}'$ is the set of all lines in $\mathbf{R}^3$ that pass through (0, 0, 0).
$\mathcal{L}'$ is the set of all planes in $\mathbf{R}^3$ that pass through (0, 0, 0).
11. Bowling teams of five students each are formed from a class of 15 college freshmen. Each of
the students bowls on the same number of teams; each pair of students bowls together on two
teams. (a) How many teams are there in all? (b) On how many different teams does each student
bowl?
12. Mrs. Mackey gave her computer science class a list of 28 problems and directed each student
to write algorithms for the solutions of exactly seven of these problems. If each student did
as instructed and if for each pair of problems there was exactly one pair of students who wrote
algorithms to solve them, how many students did Mrs. Mackey have in her class?
13. Consider a $(v, b, r, k, \lambda)$-design on the set V of varieties, where $|V| = v \ge 2$. If
$x, y \in V$, how many blocks in the design contain either x or y?
14. In a programming class Professor Madge has a total of n students, and she wants to assign
teams of m students to each of p computer projects. If each student must be assigned to the
same number of projects, (a) in how many projects will each individual student be involved?
(b) in how many projects will each pair of students be involved?
15. a) If a projective plane has six lines through every point, how many points does this
projective plane have in all?
b) If there are 57 points in a projective plane, how many points lie on each line of the plane?
16. In constructing the projective plane from $AP(\mathbf{Z}_2)$ in Example 17.22, why didn't we want
to include the point (0, 0, 0) in the set $\mathcal{P}'$?
17. Determine the values of v, b, r, k, and $\lambda$ for the balanced incomplete block design
associated with the projective plane that arises from AP(F) for the following choices of F:
(a) $\mathbf{Z}_5$ (b) $\mathbf{Z}_7$ (c) GF(8).
18. a) List the points and lines in $AP(\mathbf{Z}_3)$. How many parallel classes are there for this
finite geometry? What are the parameters for the associated balanced incomplete block design?
b) List the points and lines for the projective plane that arises from $AP(\mathbf{Z}_3)$. Determine
the points on $\ell_\infty$, and use them to determine the "parallel" classes for this geometry.
What are the parameters for the associated balanced incomplete block design?
17.6
Summary and Historical Review
The structure of a field was first developed in Chapter 14. In this chapter we examined
polynomial rings and their role in the structure of finite fields, directing our attention to
applications in finite geometries and combinatorial designs.
In Chapter 15 we saw that the order of a finite Boolean algebra could only be a power
of 2. Now we find that for a finite field the order can only be a power of a prime and that
for each prime p and each $n \in \mathbf{Z}^+$, there is only one field, up to isomorphism, of order $p^n$.
This field is denoted by $GF(p^n)$, in honor of the French mathematician Évariste Galois
(1811-1832).
Evariste Galois (1811-1832)
The finite fields $(\mathbf{Z}_p, +, \cdot)$, for p a prime, were obtained in Chapter 14 by means of
the equivalence relation, congruence modulo p, defined on $\mathbf{Z}$. Using these finite fields, we
developed here the integral domains $\mathbf{Z}_p[x]$. Then, with s(x) an irreducible polynomial of
degree n in $\mathbf{Z}_p[x]$, a similar equivalence relation, namely congruence modulo s(x),
gave us a set of $p^n$ equivalence classes, denoted $\mathbf{Z}_p[x]/(s(x))$. These $p^n$ equivalence classes
became the elements of the field $GF(p^n)$. (Although we did not prove every possible result
in general, it can be shown that over the finite field $\mathbf{Z}_p$, there is an irreducible polynomial
of degree n for each $n \in \mathbf{Z}^+$.)
The theory of finite fields was developed by Galois in his work addressing the problem of
the solutions of polynomial equations. As we mentioned in the summary of Chapter 16, the
study of polynomial equations was an area of research that challenged many mathematicians
from the sixteenth to the nineteenth centuries. In the nineteenth century, Niels Henrik Abel
(1802-1829) first showed that the solution of the general quintic could not be given by
radicals. Galois showed that for any polynomial of degree n over a field F, there is a
corresponding group G that is isomorphic to a subgroup of $S_n$, the group of permutations
of {1, 2, 3, ..., n}. The essence of Galois's work is that such a polynomial equation can be
solved by (addition, subtraction, multiplication, division, and) radicals if its corresponding
group is solvable. Now what makes a finite group solvable? We say that a finite group G is
solvable if it has a chain of subgroups $G = K_1 \supset K_2 \supset K_3 \supset \cdots \supset K_t = \{e\}$, where for all
$2 \le i \le t$, $K_i$ is a normal subgroup of $K_{i-1}$ (that is, $xyx^{-1} \in K_i$ for all $y \in K_i$ and for all
$x \in K_{i-1}$), and the quotient group $K_{i-1}/K_i$ is abelian. One finds that all subgroups of $S_i$,
for $1 \le i \le 4$, are solvable, but for $n \ge 5$ there are subgroups of $S_n$ that are not solvable.
Though it seems that Galois theory is concerned predominantly with groups, there is
a great deal more on the theory of fields that we have not mentioned. As a consequence
of Galois’s work, the areas of field theory and finite group theory became topics of great
mathematical interest.
For more on Galois theory, the reader will find Chapter 6 of the text by V. H. Larney
[8] and Chapter 12 in the book by N. H. McCoy and T. R. Berger [10] good places to start.
Chapter 5 of I. N. Herstein [6] has more on the topic, while a detailed presentation can be
found in the text by S. Roman [11] and the classic work by O. Zariski and P. Samuel [17].
Appendix E in the text by V. H. Larney [8] includes an interesting short account of the life of
Galois; more on his life can be found in the somewhat fictional account by L. Infeld [7]. The
article by T. Rothman [12] provides a more contemporary discussion of the inaccuracies
and myths surrounding the life, and especially the death, of Galois. The biographical notes
on pages 287-291 of the text by J. Stillwell [14] relate more on the life and work of this
great genius.
The Latin squares, combinatorial designs, and finite geometries of the later sections of the
chapter showed us how the finite field structure entered into problems of design. Dating back
to the time of Leonhard Euler (1707-1783) and the problem of the “36 officers,” the study
of orthogonal Latin squares has been developed considerably since 1900, and especially
since 1960 with the work of R. C. Bose, S. S. Shrikhande, and E. T. Parker. Chapter 7 of
the monograph by H. J. Ryser [13] provides the details of their accomplishments. The text
by C. L. Liu [9] includes ideas from coding theory in its discussion of Latin squares.
The study of finite geometries can be traced back to the work of Gino Fano, who, in
1892, considered a finite three-dimensional geometry consisting of 15 points, 35 lines, and
15 planes. However, it was not until 1906 that these geometries gained any notice, when
O. Veblen and W. Bussey began their study of finite projective geometries. For more on this
topic, the reader should find the texts by A. A. Albert and R. Sandler [1] and H. L. Dorwart
[4] very interesting. The text by P. Dembowski [3] provides an extensive coverage for those
seeking something more advanced.
Finally, the notion of designs was first studied by statisticians in the area called the design
of experiments. Through the research of R. A. Fisher and his followers, this area has come to
play an important role in the modern theory of statistical analysis. In our development, we
examined conditions under which a $(v, b, r, k, \lambda)$-design could exist and how such designs
were related to affine planes and finite projective planes. The text by M. Hall, Jr. [5] provides
more on this topic, as does the work by A. P. Street and W. D. Wallis [15]. Chapter XIII of
reference [15] includes material relating to designs and coding theory. A rather thorough
coverage of the topic of designs is given in the work by W. D. Wallis [16], and the text
edited by J. H. Dinitz and D. R. Stinson [2] provides the reader with a collection of more
work in this area.
REFERENCES
1. Albert, A. Adrian, and Sandler, R. An Introduction to Finite Projective Planes. New York:
Holt, 1968.
2. Dinitz, Jeffrey H., and Stinson, Douglas R., eds. Contemporary Design Theory. New York:
Wiley, 1992.
3. Dembowski, Peter. Finite Geometries. New York: Springer-Verlag, 1968.
4. Dorwart, Harold L. The Geometry of Incidence. Englewood Cliffs, N.J.: Prentice-Hall, 1966.
5. Hall, Marshall, Jr. Combinatorial Theory. Waltham, Mass.: Blaisdell, 1967.
6. Herstein, Israel Nathan. Topics in Algebra, 2nd ed. Lexington, Mass.: Xerox College
Publishing, 1975.
7. Infeld, Leopold. Whom the Gods Love. New York: McGraw-Hill, 1948.
8. Larney, Violet H. Abstract Algebra: A First Course. Boston: Prindle, Weber & Schmidt, 1975.
9. Liu, C. L. Topics in Combinatorial Mathematics. Mathematical Association of America, 1972.
10. McCoy, Neal H., and Berger, Thomas R. Algebra: Groups, Rings, and Other Topics. Boston:
Allyn and Bacon, 1977.
11. Roman, Steven. Field Theory. New York: Springer-Verlag, 1995.
12. Rothman, Tony. "Genius and Biographers: The Fictionalization of Evariste Galois." The
American Mathematical Monthly 89, no. 2 (1982): pp. 84-106.
13. Ryser, Herbert J. Combinatorial Mathematics. Carus Mathematical Monographs, Number 14,
Mathematical Association of America, 1963.
14. Stillwell, John. Mathematics and Its History. New York: Springer-Verlag, 1989.
15. Street, Anne Penfold, and Wallis, W. D. Combinatorial Theory: An Introduction. Winnipeg,
Canada: The Charles Babbage Research Center, 1977.
16. Wallis, W. D. Combinatorial Designs. New York: Marcel Dekker, Inc., 1988.
17. Zariski, Oscar, and Samuel, Pierre. Commutative Algebra, Vol. 1. New York: Van Nostrand,
1958.
SUPPLEMENTARY EXERCISES
1. Determine n if over GF(n) there are 6561 monic polynomials of degree 5 with no constant term.
2. a) Let $f(x) = a_n x^n + \cdots + a_1 x + a_0 \in \mathbf{Z}[x]$. If $r/s \in \mathbf{Q}$, with gcd(r, s) = 1 and
f(r/s) = 0, prove that $s \mid a_n$ and $r \mid a_0$.
b) Find the rational roots, if any exist, of the following polynomials over $\mathbf{Q}$. Factor f(x)
in $\mathbf{Q}[x]$.
i) $f(x) = 2x^3 + 3x^2 - 2x - 3$
ii) $f(x) = x^4 + x^3 - x^2 - 2x - 2$
c) Show that the polynomial $f(x) = x^{100} - x^{50} + x^{20} + x^3 + 1$ has no rational root.
3. a) For how many integers n, where $1 \le n \le 1000$, can we factor $f(x) = x^2 + x - n$ into the
product of two first degree factors in $\mathbf{Z}[x]$?
b) Answer part (a) for $f(x) = x^2 + 2x - n$.
c) Answer part (a) for $f(x) = x^2 + 5x - n$.
d) Let $g(x) = x^2 + kx - n \in \mathbf{Z}[x]$, for $1 \le n \le 1000$. Find the smallest positive integer k so
that g(x) cannot be factored into two first degree factors in $\mathbf{Z}[x]$ for all $1 \le n \le 1000$.
4. Verify that the polynomial $f(x) = x^4 + x^3 + x + 1$ is reducible over every field F (finite
or infinite).
5. If p is a prime, prove that in $\mathbf{Z}_p[x]$,
$x^p - x = \prod_{a \in \mathbf{Z}_p} (x - a).$
6. For any field F, let $f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 \in F[x]$. If $r_1, r_2, \ldots, r_n$ are
the roots of f(x), and $r_i \in F$ for all $1 \le i \le n$, prove that
a) $-a_{n-1} = r_1 + r_2 + \cdots + r_n$.
b) $(-1)^n a_0 = r_1 r_2 \cdots r_n$.
7. Four of the seven blocks in a (7, 7, 3, 3, 1)-design are {1, 3, 7}, {1, 5, 6}, {2, 6, 7}, and
{3, 4, 6}. Determine the other three blocks.
8. Find the values of b and r for a Steiner triple system where v = 63.
9. a) If a projective plane has 73 points, how many points lie on each line?
b) If each line in a projective plane passes through 10 points, how many lines are there in the
projective plane?
10. A projective plane is coordinatized with the elements of a field F. If this plane contains
91 lines, what are |F| and char(F)?
11. Let $V = \{x_1, x_2, \ldots, x_v\}$ be the set of varieties and $\{B_1, B_2, \ldots, B_b\}$ the collection of
blocks for a $(v, b, r, k, \lambda)$-design. We define the incidence matrix A for the design by
$A = (a_{ij})_{v \times b}$, where $a_{ij} = \begin{cases} 1, & \text{if } x_i \in B_j \\ 0, & \text{otherwise.} \end{cases}$
a) How many 1's are there in each row and column of A?
b) Let $J_{m \times n}$ be the $m \times n$ matrix where every entry is 1. For $J_{n \times n}$ we write $J_n$. Prove that
for the incidence matrix A, $A \cdot J_b = r J_{v \times b}$ and $J_v \cdot A = k J_{v \times b}$.
c) Show that
$A \cdot A^{\mathrm{tr}} = \begin{pmatrix} r & \lambda & \cdots & \lambda \\ \lambda & r & \cdots & \lambda \\ \vdots & \vdots & \ddots & \vdots \\ \lambda & \lambda & \cdots & r \end{pmatrix} = (r - \lambda) I_v + \lambda J_v,$
where $I_v$ is the $v \times v$ (multiplicative) identity.
d) Prove that
$\det(A \cdot A^{\mathrm{tr}}) = (r - \lambda)^{v-1}[r + (v - 1)\lambda] = (r - \lambda)^{v-1} rk.$
12. Given a $(v, b, r, k, \lambda)$-design based on the v varieties of V, replace each of the blocks
$B_i$, for $1 \le i \le b$, by its complement $\overline{B}_i = V - B_i$. Then the collection
$\{\overline{B}_1, \overline{B}_2, \ldots, \overline{B}_b\}$ provides the blocks for a $(v, b, r', k', \lambda')$-design, also based on
the set V.
a) Find this corresponding complementary $(v, b, r', k', \lambda')$-design for the design given in
Exercise 1 of Section 17.5.
b) In general, how are the parameters $r'$, $k'$, $\lambda'$ of the complementary design related to the
parameters v, b, r, k, $\lambda$ of the original design?
Appendix 1
Exponential and
Logarithmic
Functions
Throughout the study of mathematics and computer science, one confronts exponential and
logarithmic functions. The function concept is introduced in Section 5.2, and in part (d) of Exercise
15 for that section we find the function $f: \mathbf{R} \to \mathbf{R}$, where $f(x) = e^x$ for $x \in \mathbf{R}$. This is an example
of an exponential function. Then in Example 5.61 we come across the function $f: \mathbf{R} \to \mathbf{R}^+$, where
$f(x) = e^x$, this time in conjunction with a logarithmic function, denoted $\ln x$, where $x \in \mathbf{R}^+$. Later,
in Example 5.73 of this same chapter, another logarithmic function, namely $\log_2 n$ for $n \in \mathbf{Z}^+$,
appears in the analysis of an algorithm. And since these types of functions occur in later chapters as
well, we now provide this appendix as a review of some of the fundamental properties of these two
kinds of functions.
Let us start with the idea of positive integer exponents. For instance, we know that the expression
$3^7$ indicates the multiplication of seven 3's, that is,
$3^7 = 3 \cdot 3 \cdot 3 \cdot 3 \cdot 3 \cdot 3 \cdot 3 = 2187.$
In this example, the number 3 is called the base of $3^7$; the number 7 is the exponent, or power.
Generally, when the exponent is a positive integer, the base (call it b) can be any real number
(including 0). In dealing with an exponent that is a negative integer, we use the following definition.
Definition A1.1 For every nonzero real number b and every $n \in \mathbf{Z}^+$, we have $b^{-n} = 1/b^n$.
EXAMPLE A1.1 From Definition A1.1 we see that
a) $3^{-7} = 1/3^7 = 1/2187$
b) $(1/2)^{-6} = 1/(1/2)^6 = 1/(1/64) = 64$
c) $(-3/5)^{-5} = 1/(-3/5)^5 = 1/(-243/3125) = -3125/243$
Finally, when our exponent is the integer 0 we define $b^0 = 1$, for any nonzero* real number b.
The preceding ideas can be summarized in the following, where we use the idea of a recursive
definition (introduced in Section 2 of Chapter 4) in the first part:
For all $b \in \mathbf{R}$,
1) $b^1 = b$, and $b^n = b \cdot b^{n-1}$, for $n \in \mathbf{Z}^+$ where n > 1;
2) if $b \ne 0$ and $n \in \mathbf{Z}^+$, then $b^{-n} = 1/b^n$; and
3) if $b \ne 0$, then $b^0 = 1$.
*The expression $0^0$ is called an indeterminate form since its value may be different in different
situations. This idea is studied in calculus and is covered in conjunction with L'Hospital's Rule.
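The three-part summary translates almost line for line into a recursive function. This is only an illustrative sketch: the name `int_power` is made up here, and exact rational bases are represented with Python's `fractions.Fraction` so that the values of Example A1.1 come out as exact fractions.

```python
from fractions import Fraction


def int_power(b, n):
    """Compute b**n for an integer n by following the three-part summary."""
    if n == 0:
        if b == 0:
            raise ValueError("0**0 is left undefined here")
        return 1                       # part (3): b^0 = 1 for b != 0
    if n < 0:
        if b == 0:
            raise ValueError("a negative exponent requires a nonzero base")
        return 1 / int_power(b, -n)    # part (2): b^(-n) = 1/b^n
    if n == 1:
        return b                       # part (1), base case: b^1 = b
    return b * int_power(b, n - 1)     # part (1), recursive step: b^n = b * b^(n-1)


print(int_power(3, 7))
print(int_power(Fraction(1, 2), -6))
print(int_power(Fraction(-3, 5), -5))
```

With an integer base and a negative exponent the division produces a floating-point value; passing a `Fraction` keeps the arithmetic exact.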
In order to proceed from integer exponents to those that are rational numbers, we recall from
earlier work in algebra that if $q \in \mathbf{Z}^+$, where q > 1, and b is any nonnegative real number, then the
expression $b^{1/q}$ denotes the qth root of b. Hence $b^{1/q}$ is the real number a where $a^q = b$. For example,
$32^{1/5} = 2$ because $2^5 = 32$, and $(1/8)^{1/3} = 1/2$ because $(1/2)^3 = 1/8$.
But when we are confronted with the equations $2^2 = 4$ and $(-2)^2 = 4$, we must ask ourselves what we
shall mean here by $4^{1/2}$. The convention that is followed names the positive root as the one represented
by $4^{1/2}$, so $4^{1/2} = 2$, not $-2$ or $\pm 2$. Likewise, $9^{1/2} = 3$, $16^{1/2} = 4$, and for all $r \in \mathbf{R}$, $(r^2)^{1/2} = |r|$, the
absolute value of r, not just plain r. Also, though $2^4 = (-2)^4 = (2i)^4 = (-2i)^4 = 16$, when the
expression $16^{1/4}$ is encountered it denotes the positive fourth root, namely, 2.
When b is a negative real number and q is an odd positive integer, our earlier definition of $b^{1/q}$
continues to make sense. We find, for example, that $(-8)^{1/3} = -2$ since $(-2)^3 = -8$ and no other
cube of a real number results in $-8$. However, for the case where q = 2, the expression $(-4)^{1/2}$
denotes a complex number that is not real, and so we shall avoid such situations here.
Finally, without getting into a detailed discussion on the development of irrational numbers, we
shall agree that real, but irrational, numbers such as $2^{1/2} = \sqrt{2}$ and $(-5)^{1/3} = \sqrt[3]{-5}$ do exist and, in
general, for $q \in \mathbf{Z}^+$ and $r \in \mathbf{R}$, the following real numbers also exist:
$r^{1/q} = \sqrt[q]{r}$, for $r \ge 0$; $r^{1/q} = \sqrt[q]{r}$, for r < 0 and q odd.
And now that we have settled this issue of exponents (or powers) of the form 1/q, where q is a
positive integer greater than 1, we pass to the following definition.
Definition A1.2 Let $b \in \mathbf{R}$ and let $p, q \in \mathbf{Z}^+$. Then
1) $b^{p/q} = (b^{1/q})^p$, for $b \ge 0$;
2) $b^{-p/q} = (b^{1/q})^{-p} = 1/[(b^{1/q})^p]$, for b > 0;
3) $b^{p/q} = (b^{1/q})^p$, for b < 0 and q odd; and
4) $b^{-p/q} = (b^{1/q})^{-p} = 1/[(b^{1/q})^p]$, for b < 0 and q odd.
EXAMPLE A1.2 This definition is illustrated in the following example.
a) $(8)^{2/3} = 8^{2/3} = (8^{1/3})^2 = 2^2 = 4$ $(= 64^{1/3} = (8^2)^{1/3})$
b) $(81)^{-3/4} = (81^{1/4})^{-3} = 3^{-3} = 1/3^3 = 1/27$ $(= 1/81^{3/4} = [1/81^{1/4}]^3 = (81^{-3})^{1/4})$
c) $(-1/32)^{3/5} = [(-1/32)^{1/5}]^3 = (-1/2)^3 = -1/8$
d) $(-1024)^{-2/5} = [(-1024)^{1/5}]^{-2} = (-4)^{-2} = 1/(-4)^2 = 1/16$ $(= 1/(-1024)^{2/5})$.
The last result observed in part (a) of the preceding example suggests the following, which is true
in general:
$b^{p/q} = (b^p)^{1/q}$, $b \ge 0$, $p, q \in \mathbf{Z}^+$.
The other parts of Definition A1.2 can also be extended as
$b^{-p/q} = (b^{-p})^{1/q} = (1/b^p)^{1/q} = 1/b^{p/q}$, b > 0, $p, q \in \mathbf{Z}^+$.
$b^{p/q} = (b^p)^{1/q}$, b < 0, $p, q \in \mathbf{Z}^+$, q odd.
$b^{-p/q} = (b^{-p})^{1/q} = (1/b^p)^{1/q} = 1/b^{p/q}$, b < 0, $p, q \in \mathbf{Z}^+$, q odd.
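The sign conventions for rational exponents are easy to get wrong in software. In Python 3, for example, `(-8) ** (1/3)` evaluates to a complex number rather than to $-2$. The sketch below is an illustration only (the helper names `real_root` and `rational_power` are assumptions, not notation from the text); it follows Definition A1.2 by taking the real qth root first and then raising it to the integer power, in floating-point arithmetic.

```python
def real_root(b, q):
    """Real q-th root of b: nonnegative for b >= 0, negative for b < 0 with q odd."""
    if q < 1:
        raise ValueError("q must be a positive integer")
    if b >= 0:
        return b ** (1.0 / q)
    if q % 2 == 0:
        raise ValueError("an even root of a negative number is not real")
    return -((-b) ** (1.0 / q))


def rational_power(b, p, q):
    """b**(p/q) computed as (b**(1/q))**p for an integer p and a positive integer q."""
    root = real_root(b, q)
    if root == 0 and p <= 0:
        raise ValueError("0 cannot be raised to a zero or negative power here")
    return root ** p


# The four parts of Example A1.2, up to floating-point rounding.
for b, p, q in [(8, 2, 3), (81, -3, 4), (-1 / 32, 3, 5), (-1024, -2, 5)]:
    print(f"({b})^({p}/{q}) is approximately {rational_power(b, p, q):.6f}")
```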
EXAMPLE A1.3 Using 2 as our base, we know from Definitions A1.1 and A1.2 that
$2^{-3} = 1/8$, $2^{-2} = 1/4$, $2^{-1} = 1/2$, $2^0 = 1$, $2^1 = 2$, $2^2 = 4$, $2^3 = 8$
and that
$2^{-3/2} = (2^{1/2})^{-3} = (\sqrt{2})^{-3} = (1/\sqrt{2})^3 = 1/(2\sqrt{2}) \approx 0.3535534$
$2^{3/2} = (\sqrt{2})^3 = 2\sqrt{2} \approx 2.8284271$ $(= (2^3)^{1/2} = \sqrt{8})$.
However, how do we deal with something like $2^{\sqrt{3}}$, where now an irrational power confronts us?
Using the fact that $\sqrt{3} = 1.7320508\ldots$, we can evaluate the successive rational powers:
$2^1 = 2$
$2^{1.7} = 2^{17/10} = (2^{17})^{1/10} = (131072)^{1/10} \approx 3.2490096$
$2^{1.73} \approx 3.3172782$
$2^{1.732} \approx 3.3218801$
$2^{1.7320} \approx 3.3218801$
$2^{1.73205} \approx 3.3219952$
With the assistance of a hand-held calculator or a computer one finds that to seven decimal places
$2^{\sqrt{3}}$ is given as 3.3219971. If we want to be more precise, we can say that the real number $2^{\sqrt{3}}$ is the
limit of the sequence $2^1, 2^{1.7}, 2^{1.73}, 2^{1.732}, 2^{1.7320}, 2^{1.73205}, \ldots$ (One studies such ideas in calculus and
introductory analysis.)
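The sequence of rational powers can be reproduced numerically. The following sketch is an illustration under stated assumptions: the function name is made up, each truncated decimal is turned into an exact fraction with `fractions.Fraction`, and the powers themselves are ordinary floating-point values, so the printed digits are approximations.

```python
import math
from fractions import Fraction


def rational_power_steps(base, decimal_strings):
    """Return (exponent text, base**exponent) for each truncated decimal exponent."""
    steps = []
    for text in decimal_strings:
        exponent = Fraction(text)      # exact rational value, e.g. 17/10 for "1.7"
        steps.append((text, base ** float(exponent)))
    return steps


truncations = ["1", "1.7", "1.73", "1.732", "1.7320", "1.73205"]
limit = 2 ** math.sqrt(3)
for text, value in rational_power_steps(2, truncations):
    print(f"2^{text:<8} = {value:.7f}   gap to 2^sqrt(3) = {limit - value:.7f}")
print(f"2^sqrt(3)  = {limit:.7f}")
```

The gap in the last column shrinks as more digits of $\sqrt{3}$ are used, which is the limiting behavior described above.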
In a similar way one deals with the expression $b^r$, where $b \in \mathbf{R}^+$ and $r \in \mathbf{R}$.
Using the results we have now learned about exponents, we state the following properties, but
we do not prove any of them.
THEOREM A1.1 The Properties of Exponents. For all $a, b \in \mathbf{R}^+$ and all $x, y \in \mathbf{R}$,
1) $(b^x)(b^y) = b^x \cdot b^y = b^{x+y}$,
2) $(b^x)/(b^y) = b^x/b^y = b^{x-y}$,
3) $(b^x)^y = b^{xy} = b^{yx} = (b^y)^x$, and
4) $(ab)^x = (a^x)(b^x) = a^x \cdot b^x$.
EXAMPLE A1.4 The properties in Theorem A1.1 are illustrated in the following.
1) $3^{5/2} \cdot 3^{3/2} = 3^{[(5/2)+(3/2)]} = 3^{8/2} = 3^4 = 81$
2) $(7^{3/5})/(7^{13/5}) = 7^{[(3/5)-(13/5)]} = 7^{-10/5} = 7^{-2} = 1/7^2 = 1/49$
3) $[(\sqrt{2})^3]^2 = (\sqrt{2})^6 = (2^{1/2})^6 = 2^{(1/2)6} = 2^3 = 8$
4) $(3\sqrt{5})^4 = 3^4(\sqrt{5})^4 = (81)(25) = 2025$
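A numerical spot check of these properties is straightforward, although it is no substitute for a proof. In this sketch the function name is an assumption, and equality is tested with `math.isclose` because floating-point powers are only approximate.

```python
import math


def exponent_laws_hold(a, b, x, y):
    """Numerically check the four properties for positive a, b and real x, y."""
    return (
        math.isclose(b**x * b**y, b ** (x + y))
        and math.isclose(b**x / b**y, b ** (x - y))
        and math.isclose((b**x) ** y, b ** (x * y))
        and math.isclose((a * b) ** x, a**x * b**x)
    )


print(exponent_laws_hold(2.0, 3.0, 2.5, 1.5))
print(3**2.5 * 3**1.5)                   # compare with 3^4 = 81
print(7 ** (3 / 5) / 7 ** (13 / 5))      # compare with 1/49
print((3 * math.sqrt(5)) ** 4)           # compare with 2025
```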
We have now finished with the preliminaries needed to define an exponential function.
Definition A1.3 For a fixed positive real number b, the function $f: \mathbf{R} \to \mathbf{R}^+$ defined by $f(x) = b^x$ is called the
exponential function for base b. [Sometimes we denote $b^x$ by $\exp_b(x)$.]
EXAMPLE A1.5
a) In Fig. A1.1 we find the graphs of four functions:
$f_1: \mathbf{R} \to \mathbf{R}$, $f_1(x) = x^2$; $f_2: \mathbf{R} \to \mathbf{R}^+$, $f_2(x) = 2^x$;
$f_3: \mathbf{R} \to \mathbf{R}$, $f_3(x) = x^3$; $f_4: \mathbf{R} \to \mathbf{R}^+$, $f_4(x) = 3^x$.
The functions $f_1$ and $f_3$ are polynomial functions, not exponential functions. Hence, when
we examine the exponential functions $f_2$ and $f_4$ we realize that there is a distinct difference
between the expressions $x^2$ (for $f_1$) and $2^x$ (for $f_2$), and between the expressions $x^3$ (for $f_3$) and
$3^x$ (for $f_4$). The exponential functions $f_2$ and $f_4$ are such that
1) $f_2(x) > 0$ and $f_4(x) > 0$, for all $x \in \mathbf{R}$; in particular, $f_2(x) > 1$ and $f_4(x) > 1$, for all x > 0,
while $0 < f_2(x) < 1$ and $0 < f_4(x) < 1$, for all x < 0.
2) for all $x, y \in \mathbf{R}$, $x < y \Rightarrow f_2(x) < f_2(y)$ [and $f_4(x) < f_4(y)$]. (This is true for every
exponential function where the base b > 1. That is, when b > 1 and x < y, then $b^x < b^y$.)
3) if $v, w \in \mathbf{R}$ and $f_2(v) = f_2(w)$, then v = w. [This property is also true whenever we are
dealing with an exponential function $f(x) = b^x$, for b > 1. So for $v, w \in \mathbf{R}$ and b > 1,
$b^v = b^w \Rightarrow v = w$.]
[Figure A1.1: graphs of $f_1$, $f_2$, $f_3$, and $f_4$.]
[Figure A1.2: graph of $f_5$.]
b) The graph of the function $f_5: \mathbf{R} \to \mathbf{R}^+$, defined by $f_5(x) = (1/2)^x = 2^{-x}$, is given in Fig. A1.2.
This graph demonstrates the following properties, which are true for all exponential functions
$f: \mathbf{R} \to \mathbf{R}^+$, where $f(x) = b^x$ for 0 < b < 1.
1) Here $f_5(x) > 0$ for all $x \in \mathbf{R}$, but now we find $f_5(x) > 1$ for x < 0 and $f_5(x) < 1$ when
x > 0.
2) If $x, y \in \mathbf{R}$ with x < y, then $f_5(x) > f_5(y)$.
3) For $x, y \in \mathbf{R}$, if $f_5(x) = f_5(y)$, then x = y.
c) When one speaks of the exponential function the reference is to the function $f: \mathbf{R} \to \mathbf{R}^+$, where
$f(x) = e^x$ for the irrational number $e \approx 2.71828$. This function is shown as $f_6$ in Fig. A1.3,
where we have used the approximations $e^2 \approx 7.38906$ and $e^3 \approx 20.08554$. The function $f_7$ (also
in Fig. A1.3) is the exponential function where $f_7(x) = e^{-x}$.
[Figure A1.3: graphs of $f_6$ and $f_7$.]
From property (3) in parts (a) and (b) of Example A1.5 we learned that for all $b \in \mathbf{R}^+$ and all
$x, y \in \mathbf{R}$, if $b \ne 1$ and $b^x = b^y$ then x = y. This observation helps us to solve the following
exponential equation.
EXAMPLE A1.6 For which real number(s) n is it true that $(1/2)^{-6n^2} = (1/8)^{-(10n+4)/3}$?
This equation can be written as $2^{6n^2} = 8^{(10n+4)/3}$ because $(1/2)^{-6n^2} = [(1/2)^{-1}]^{6n^2} = 2^{6n^2}$ and
$(1/8)^{-(10n+4)/3} = [(1/8)^{-1}]^{(10n+4)/3} = 8^{(10n+4)/3}$. Then
$2^{6n^2} = 8^{(10n+4)/3} \Rightarrow 2^{6n^2} = (2^3)^{(10n+4)/3} \Rightarrow 2^{6n^2} = 2^{10n+4}$
$\Rightarrow 6n^2 = 10n + 4 \Rightarrow 3n^2 = 5n + 2$
$\Rightarrow 3n^2 - 5n - 2 = (3n + 1)(n - 2) = 0$
$\Rightarrow n = -1/3$ or $n = 2$.
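Both solutions can be substituted back into the original equation as a check. The sketch below is illustrative only: the helper names are assumptions, the roots of $3n^2 - 5n - 2 = 0$ are found with the quadratic formula rather than by factoring, and the two sides are compared with `math.isclose` because the powers are floating-point values.

```python
import math


def left_side(n):
    return (1 / 2) ** (-6 * n**2)


def right_side(n):
    return (1 / 8) ** (-(10 * n + 4) / 3)


def quadratic_roots(a, b, c):
    """Real roots of a*n^2 + b*n + c = 0, assuming a nonnegative discriminant."""
    root = math.sqrt(b * b - 4 * a * c)
    return sorted(((-b - root) / (2 * a), (-b + root) / (2 * a)))


for n in quadratic_roots(3, -5, -2):
    print(n, left_side(n), right_side(n), math.isclose(left_side(n), right_side(n)))
```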
Now that we have examined the exponential function, we shall turn our attention to a second type
of function that goes hand-in-hand with the exponential function. This is the logarithm or logarithmic
function. However, before we introduce this function, we shall review some of the fundamental
properties of logarithms. First we consider the precise relationship between exponents and logarithms,
as described in the following definition.
Definition A1.4 Let b denote a fixed positive real number other than 1. If $x \in \mathbf{R}^+$, we write $\log_b x$ to designate the
logarithm of x to the base b (or the logarithm to the base b of x), which is the (unique) real number
y that satisfies $b^y = x$.
This idea can be restated as follows: $\log_b x$ is the exponent (or power) to which we raise the base
b in order to obtain x. Hence,
$y = \log_b x$ if and only if $x = b^y$.
EXAMPLE A1.7 The following results are obtained from the preceding definition:
a) Since $2^3 = 8$, we have $\log_2 8 = 3$.
b) One finds that $\log_3(1/81) = -4$ because $3^{-4} = 1/(3^4) = 1/81$.
c) For all $b \in \mathbf{R}^+$, where $b \ne 1$, it follows that
i) $\log_b b = 1$ because $b^1 = b$,
ii) $\log_b b^2 = 2$ because $b^2 = b^2$, and
iii) $\log_b(1/b) = -1$ because $b^{-1} = 1/b$.
d) Since $\sqrt{7} = 7^{1/2}$, it follows that $\log_7 \sqrt{7} = 1/2$.
EXAMPLE A1.8 Suppose that $b, x \in \mathbf{R}^+$ where b is fixed and different from 1. If $\log_b x = 6$, what is $\log_{b^2} x$?
We know that $\log_b x = 6 \Leftrightarrow b^6 = x$, so $x = (b^2)^3$. And $x = (b^2)^3 \Leftrightarrow \log_{b^2} x = 3$. (In a similar
manner one also finds that $\log_{b^3} x = 2$ and $\log_{b^6} x = 1$.)
In conjunction with properties (1), (2), and (3) for exponents, as found in Theorem A1.1, the
following properties correspond for logarithms.
THEOREM A1.2 Let $b, r, s \in \mathbf{R}^+$ where b is fixed and other than 1. Then
1) $\log_b(rs) = \log_b r + \log_b s$,
2) $\log_b(r/s) = \log_b r - \log_b s$, and
3) $\log_b(r^s) = s \log_b r$.
Proof: We shall prove part (1) and request a proof for part (2) in the exercises at the end of this
appendix. For part (3) we shall only request (in the exercises) the proof for the case where s is a
nonzero integer, but we shall accept (without proof) and use the general statement given here.
Suppose that $x = \log_b r$ and $y = \log_b s$. Then, because $x = \log_b r \Leftrightarrow b^x = r$ and $y = \log_b s \Leftrightarrow
b^y = s$, it follows from part (1) of Theorem A1.1 that $rs = (b^x)(b^y) = b^{x+y}$. Since $rs = b^{x+y} \Leftrightarrow
\log_b(rs) = x + y$, we have shown that
$\log_b(rs) = x + y = \log_b r + \log_b s$.
In our next example we find how the three results in Theorem A1.2 can be used to calculate
logarithms.
EXAMPLE A1.9 Before the advent of computers and hand-held calculators, logarithms were used to assist in calculating
products, quotients, and powers and in extracting roots. Very often the base for these logarithms was 10
and tables of these numbers were available for working with logarithms. [Logarithms were invented
by the Scottish mathematician John Napier (1550-1617). Navigators and astronomers used them in
the seventeenth century to reduce the time it took to perform multiplication and division.]
For example, since $\log_{10} 10 = 1$ and $\log_{10} 100 = 2$, one finds that $1 < \log_{10} 31 < 2$. In fact,
$\log_{10} 31 \approx 1.4914$. Likewise, we have $2 < \log_{10} 137 \approx 2.1367 < 3$. From Theorem A1.2 it then
follows that
1) $\log_{10} 4247 = \log_{10}(31 \cdot 137) = \log_{10} 31 + \log_{10} 137 \approx 1.4914 + 2.1367 = 3.6281$,
2) $\log_{10}(137/31) = \log_{10} 137 - \log_{10} 31 \approx 2.1367 - 1.4914 = 0.6453$, and
3) $\log_{10} \sqrt[3]{137} = \log_{10} 137^{1/3} = (1/3) \log_{10} 137 \approx (1/3)(2.1367) \approx 0.7122$.
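The table-lookup procedure of this example can be imitated with a few lines of code. This is a sketch under stated assumptions: `math.log10` plays the role of the logarithm table, raising 10 to a power plays the role of the reverse lookup, and the variable names are illustrative. Because full-precision logarithms are used, a result may differ in the fourth decimal place from one computed from four-place table entries.

```python
import math

log_31 = math.log10(31)
log_137 = math.log10(137)

product = 10 ** (log_31 + log_137)     # part (1): recovers 31 * 137
quotient = 10 ** (log_137 - log_31)    # part (2): recovers 137 / 31
cube_root = 10 ** (log_137 / 3)        # part (3): recovers the cube root of 137

print(round(log_31, 4), round(log_137, 4))
print(product, 31 * 137)
print(quotient, 137 / 31)
print(cube_root, cube_root**3)
```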
In calculus we find use for logarithms to the base e = 2.71828, and these so-called natural log-
arithms are usually denoted by In x, for x € R*. When dealing with the analysis of algorithms in
computer science, logarithms to the base 2 often prove to be useful. But this does not mean we need
to be overly concerned about dealing with logarithms in several different bases. Many hand-held
calculators provide logarithms to the base 10 and the base e. And we’ll find in our next result that if
we can obtain logarithms in one base, we can use these to obtain logarithms in any other base.
THEOREM A1.3 The Base-Changing Formula. Let $a, b \in \mathbf{R}^+$ where neither a nor b is 1. For all $x \in \mathbf{R}^+$,
$\log_a x = \dfrac{\log_b x}{\log_b a}.$
Proof: Let $c = \log_b x$ and $d = \log_a x$. Then $b^c = x = a^d$ and $\log_b x = \log_b a^d = d \log_b a =
(\log_a x)(\log_b a)$. Consequently, $\log_a x = \log_b x / \log_b a$.
EXAMPLE A1.10 From a table or hand-held calculator one finds that $\log_e 2 = \ln 2 \approx 0.6931$ and $\log_e 10 = \ln 10 \approx
2.3026$. Therefore, by virtue of Theorem A1.3, $\log_2 10 = \ln 10 / \ln 2 \approx 2.3026/0.6931 \approx 3.3222$.
EXAMPLE A1.11 A special formula results from Theorem A1.3 when x = b. In this case we find that
$\log_a b = \dfrac{\log_b b}{\log_b a} = \dfrac{1}{\log_b a}.$
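The base-changing formula is how one computes a logarithm in an arbitrary base when only natural logarithms are available. A minimal sketch, with the function name `log_base` chosen here for illustration and $b = e$ as the intermediate base:

```python
import math


def log_base(a, x):
    """log_a(x) via the base-changing formula, using natural logarithms (b = e)."""
    if a <= 0 or a == 1 or x <= 0:
        raise ValueError("need a > 0, a != 1, and x > 0")
    return math.log(x) / math.log(a)


print(round(log_base(2, 10), 4))
print(log_base(3, 7) * log_base(7, 3))   # special case x = b: the product should be 1
```

Working with full-precision values of $\ln 10$ and $\ln 2$ gives a quotient that differs slightly in the fourth decimal place from the one obtained above with four-place values.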
Having reviewed the necessary preliminaries, it is time to define the logarithmic function.
Definition A1.5 Let $b \ne 1$ be a fixed positive real number. The function $g: \mathbf{R}^+ \to \mathbf{R}$ defined by $g(x) = \log_b x$ is called
the logarithmic function to the base b.
EXAMPLE A1.12
a) Consider the logarithmic functions
$g_1: \mathbf{R}^+ \to \mathbf{R}$, $g_1(x) = \log_2 x$; $g_2: \mathbf{R}^+ \to \mathbf{R}$, $g_2(x) = \log_3 x$.
[Figure A1.4: graphs of $g_1$ and $g_2$.]
The graphs of these functions are shown in Fig. A1.4. These functions are such that
1) $g_1(x) > 0$ and $g_2(x) > 0$ for all x > 1, and $g_1(x) < 0$ and $g_2(x) < 0$ for all x < 1. (This is
true for every logarithmic function $\log_b x$ where b > 1.)
2) for all $x, y \in \mathbf{R}^+$, $x < y \Rightarrow g_1(x) < g_1(y)$ [and $g_2(x) < g_2(y)$]. (Again this is true for all
logarithmic functions $\log_b x$ where b > 1.)
3) if $u, v \in \mathbf{R}^+$ and $g_1(u) = g_1(v)$, then u = v. (In fact, for b > 1, we have $\log_b u = \log_b v \Rightarrow
u = v$ because $w = \log_b u \Leftrightarrow u = b^w$, and $w = \log_b v \Leftrightarrow v = b^w$.)
b) The graph in Fig. A1.5 is for the function $g_3: \mathbf{R}^+ \to \mathbf{R}$ defined by $g_3(x) = \log_{1/2} x$. This graph
illustrates the following properties, which are true for those logarithmic functions $\log_b x$ where
0 < b < 1.
[Figure A1.5: graph of $g_3$.]
1) Here $g_3(x) > 0$ for all x < 1, while $g_3(x) < 0$ for all x > 1.
2) For all $x, y \in \mathbf{R}^+$, if x < y then $g_3(x) > g_3(y)$.
3) If $u, v \in \mathbf{R}^+$ and $g_3(u) = g_3(v)$, then u = v. [The proof here is the same as that given in
section (3) of part (a).]
c) In part (a) of Fig. A1.6 we have the graphs of the functions $f: \mathbf{R} \to \mathbf{R}^+$, where $f(x) = 2^x$,
and $g: \mathbf{R}^+ \to \mathbf{R}$, where $g(x) = \log_2 x$. These graphs are symmetric (to each other) in the line
y = x; that is, if one were to fold the figure along the line y = x, then the graphs of f and g
would coincide. Here we also observe how the points on one graph correspond with the points
on the other. For instance, the point (2, 4) on the graph of f corresponds with the point (4, 2)
on the graph of g. In general, each point $(x, 2^x)$ on the graph of f corresponds with the point
$(2^x, x\ (= \log_2 2^x))$ on the graph of g, and $(x, \log_2 x)$ on the graph of g corresponds with
$(\log_2 x, x\ (= 2^{\log_2 x}))$ on the graph of f.
Figure A1.6
d) The graphs of the functions
$h: \mathbf{R} \to \mathbf{R}^+$, $h(x) = (1/2)^x$; $k: \mathbf{R}^+ \to \mathbf{R}$, $k(x) = \log_{1/2} x$
are shown in part (b) of Fig. A1.6. As in part (c) of this example these functions are also
symmetric in the line y = x. Here each point $(x, (1/2)^x)$ on the graph of h corresponds with
the point $((1/2)^x, x\ (= \log_{1/2}(1/2)^x))$ on the graph of k, and $(x, \log_{1/2} x)$ on the graph of k
corresponds with $(\log_{1/2} x, x\ (= (1/2)^{\log_{1/2} x}))$ on the graph of h. (These two graphs intersect
on the line y = x where $x \approx 0.6412$.)
e) The reader may now want to examine, or reexamine, the graphs of the functions $y = e^x$ and
$y = \ln x$ shown in Fig. 5.10 of Section 5.6. In that section, the relationship of symmetry of
functions in the line y = x [mentioned above in parts (c) and (d)] is studied in conjunction with
the ideas of function composition and the inverse of a function.
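The intersection point quoted in part (d) can be approximated numerically. The sketch below is an illustration, not the method of the text: it applies bisection to $d(x) = (1/2)^x - x$ on the interval [0, 1], where d changes sign, and then confirms that the same x also satisfies $\log_{1/2} x = x$. The function names and the tolerance are assumptions.

```python
import math


def log_half(x):
    """Logarithm to the base 1/2, defined for x > 0."""
    return math.log(x) / math.log(0.5)


def crossing_point(tolerance=1e-12):
    """Bisection on d(x) = (1/2)**x - x, which is positive at 0 and negative at 1."""
    low, high = 0.0, 1.0
    while high - low > tolerance:
        middle = (low + high) / 2
        if 0.5**middle - middle > 0:
            low = middle
        else:
            high = middle
    return (low + high) / 2


x = crossing_point()
print(x, 0.5**x, log_half(x))
```

The three printed values agree to many decimal places, reflecting the fact that h and k are inverses of each other and so meet on the line y = x.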
8. Prove part (2) of Theorem A1.2.
EXERCISES A.1
9, Let b, r © R® where d is fixed and different from 1.
1. Write each of the following in exponential form, for a) For all n € Z*, prove that log, r” =n log, r.
x, yéER*. b) Prove that log, r~™” = (—n) log, r for alln € Z*.
a) /xy3 b) Bix ye) S/8x9y-S 10. Approximate each of the following on the basis that (to four
2. Evaluate each of the following. decimal places) log, 5 = 2.3219 and log, 7 = 2.8074.
a) 125~4/3 b) 0.0277/° c) (4/3)(1/8)
7? a) log, 10 b) log, 100
3. Determine each of the following. ¢) log, (7/5) d) log, 175
3/5 11. Given that (to four decimal places) In 2 = 0.6931, In3 =
a) (5°)(5'°"4) ib) 78/5
! ec) (5!/?)(20'/7) 1.0986, and In 5 = 1.6094, approximate each of the following.
4, In each of the following find the real number(s) x for which a) log, 3 b) log, 2 c) log; 5
the equation is valid.
12. Determine the value of x in each of the following.
a) 530° _— gixt2 b) 4r-1 = (1/2)!
a) logy) 2 + logy) 5 = logy x
ce) (1/25)! = (1/125)" b) log, 3+ log, x = log, 7 — log, 5
§. Write each of the following exponential equations as a log-
13. Solve for x in each of the following.
arithmic equation.
a) logi) x + log, 6 = 1
a) 2’ = 128 b) 125'° =5
b) Inx —In(x — 1) =1n3
c) 10-+ = 1/10,000 d) 2° =b
c) log, (x? + 4x + 4) — log,(2x — 5) =2
6. Find each of the following logarithms.
14, Determine the value of x if
a) log,, 100 b) log), (1/1000)
log, x = (1/3)[log, 3 — log, 5] + (2/3) log, 6 + log, 17.
c) log, 2048 d) log,(1/64)
15. Let b be a fixed positive real number other than 1. If
e) log, 8 f) log, 2
a,c €R*, prove that q!% ° = cle 4,
g) logi, 1 h) log,, 9
7. Solve for x in each of the following.
a) log, 243 =5 b) log, x = —3
c) log,, 1000 = x d) log, 32 = 5/2
Appendix 2
Matrices, Matrix Operations, and Determinants
Starting in Chapter 7, and then in several subsequent chapters, certain kinds of matrices have
been introduced. Historically, these mathematical structures were developed and studied in the
nineteenth century by the English mathematician Arthur Cayley (1821-1895) and his (English-born)
American coworker James Joseph Sylvester (1814-1897). Introduced in 1858, Cayley’s work in
matrix algebra provides another instance where research in abstract mathematics later proved to be
of importance in many applied areas — for example, in quantum theory in physics and data analysis
in psychology and sociology.
For those readers who may not have studied anything about matrices in earlier coursework or who
simply wish to review the matrix algebra we use in this text, the material in this appendix should
prove to be helpful. (We shall not prove all of the results in general here but state many of them in
conjunction with a given example. For a more rigorous development the reader should consult one
of the references at the end of this appendix.)
First and foremost, we start with the following.
Definition A2.1 For $m, n \in \mathbf{Z}^+$, an $m \times n$ matrix is a rectangular array of $mn$ numbers arranged in $m$ (horizontal) rows
and $n$ (vertical) columns.
An $m \times n$ matrix $A$ is denoted by $A = (a_{ij})_{m \times n}$, where $1 \le i \le m$ and $1 \le j \le n$, and the number
$a_{ij}$ is called the $(i, j)$-entry (that is, the entry that appears in the $i$th row and $j$th column of $A$). An
$m \times 1$ matrix is often called a column matrix (or column vector); a $1 \times n$ matrix is referred to as a
row matrix (or row vector). When $m = n$ the matrix is called square.
1 2
12 0 3 x 0
Let A = (a,;)3x2 = 2° ; B= thydoxa=| | 2 -1 7 fama =| 7,
EXAMPLE A2.1 vat
Here $A$ is a $3 \times 2$ matrix where $a_{11} = 1$, $a_{12} = 2$, $a_{21} = 0$, $a_{22} = 3$, $a_{31} = -5$, and $a_{32} = 4$. The
matrix $B$ has two rows and four columns, where, for instance, one finds the entries $b_{13} = 0$ and
$b_{24} = 7$. In the $2 \times 2$ square matrix $C$ we see that the entries in a matrix may be rational numbers and
even irrational numbers.
(Note: Although the entries in a matrix may even be complex numbers, in this appendix we shall
deal only with matrices where each entry is a real number.)
As with other mathematical structures, once the structure is defined one needs to decide when two
such structures are the same. The method for that decision is now addressed.
Definition A2.2 Let $A = (a_{ij})_{m \times n}$ and $B = (b_{ij})_{m \times n}$ be two $m \times n$ matrices. We say that $A$ and $B$ are equal, and we
write $A = B$, when $a_{ij} = b_{ij}$ for all $1 \le i \le m$ and all $1 \le j \le n$.
EXAMPLE A2.2 In Definition A2.2 we learned that two matrices are equal when they have the same number of rows
and the same number of columns and when their corresponding entries are equal. As a result, if
_j}w 2 0 _|—7 0
Set
a=[% 3 | and a=( 0 |:
nN
then for A and B to be equal we must have w = —7,x = 4, y =2,2z = 3,
Thinking back to our first encounters with arithmetic, after we learned how to count, we then
started to combine integers by using addition, and then multiplication. Along the same lines we now
consider how we may combine matrices.
Definition A2.3 If $A = (a_{ij})_{m \times n}$ and $B = (b_{ij})_{m \times n}$ are two $m \times n$ matrices, their sum, denoted $A + B$, is the $m \times n$
matrix $C = (c_{ij})_{m \times n}$, where $c_{ij} = a_{ij} + b_{ij}$, for all $1 \le i \le m$, $1 \le j \le n$.
From Definition A2.3 we see that we can only add two matrices of the same size (where they
have the same number of rows and the same number of columns). Furthermore, the addition of two
matrices is carried out by adding their corresponding entries.
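EXAMPLE A2.3 Consider the matrices
\[
A = \begin{bmatrix} 1 & 3 & 4 \\ 2 & 0 & 6 \\ 1 & 1 & 3 \end{bmatrix}, \quad
B = \begin{bmatrix} 2 & -1 & 6 \\ 3 & 1 & 7 \\ 4 & 2 & 2 \end{bmatrix}, \quad \text{and} \quad
C = \begin{bmatrix} 1 & -1 \\ 3 & -4 \\ -7 & 6 \end{bmatrix}.
\]
Here we find that
\[
A + B = \begin{bmatrix} 1+2 & 3+(-1) & 4+6 \\ 2+3 & 0+1 & 6+7 \\ 1+4 & 1+2 & 3+2 \end{bmatrix}
= \begin{bmatrix} 3 & 2 & 10 \\ 5 & 1 & 13 \\ 5 & 3 & 5 \end{bmatrix}.
\]
In fact, we also have
\[
B + A = \begin{bmatrix} 3 & 2 & 10 \\ 5 & 1 & 13 \\ 5 & 3 & 5 \end{bmatrix},
\]
which illustrates the following general result.
For any two $m \times n$ matrices $E$ and $F$, $E + F = F + E$. Hence the addition of matrices is an
example of a commutative (binary) operation.
We cannot determine either of the sums $A + C$ or $B + C$ because each of $A$, $B$ has three columns
while $C$ has only two. However, we can find the sum
\[
C + C = \begin{bmatrix} 1 & -1 \\ 3 & -4 \\ -7 & 6 \end{bmatrix} + \begin{bmatrix} 1 & -1 \\ 3 & -4 \\ -7 & 6 \end{bmatrix}
= \begin{bmatrix} 2 & -2 \\ 6 & -8 \\ -14 & 12 \end{bmatrix}.
\]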
In the last part of Example A2.3, we see that we could have obtained the result C + C by simply
multiplying each entry of C by the number 2. This leads us to the general idea we now state as follows.
Definition A2.4 If $A = (a_{ij})_{m \times n}$ and $r \in \mathbf{R}$, the scalar product $rA$ is the $m \times n$ matrix where the $(i, j)$-entry is $ra_{ij}$,
for all $1 \le i \le m$, $1 \le j \le n$.
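EXAMPLE A2.4 a) If $A = \begin{bmatrix} 1 & 6 & 4 \\ 0 & -1 & -3 \end{bmatrix}$, then
\[
3A = 3\begin{bmatrix} 1 & 6 & 4 \\ 0 & -1 & -3 \end{bmatrix}
= \begin{bmatrix} 3 \cdot 1 & 3 \cdot 6 & 3 \cdot 4 \\ 3 \cdot 0 & 3 \cdot (-1) & 3 \cdot (-3) \end{bmatrix}
= \begin{bmatrix} 3 & 18 & 12 \\ 0 & -3 & -9 \end{bmatrix}.
\]
b) For $A = \begin{bmatrix} 1 & 6 & 4 \\ 0 & -1 & -3 \end{bmatrix}$ and $B = \begin{bmatrix} -2 & 0 & 2 \\ -5 & 1 & 7 \end{bmatrix}$, we find that
$3B = \begin{bmatrix} -6 & 0 & 6 \\ -15 & 3 & 21 \end{bmatrix}$, and
\[
3(A + B) = 3\left( \begin{bmatrix} 1 & 6 & 4 \\ 0 & -1 & -3 \end{bmatrix} + \begin{bmatrix} -2 & 0 & 2 \\ -5 & 1 & 7 \end{bmatrix} \right)
= 3\begin{bmatrix} -1 & 6 & 6 \\ -5 & 0 & 4 \end{bmatrix}
= \begin{bmatrix} -3 & 18 & 18 \\ -15 & 0 & 12 \end{bmatrix}
\]
\[
= \begin{bmatrix} 3 & 18 & 12 \\ 0 & -3 & -9 \end{bmatrix} + \begin{bmatrix} -6 & 0 & 6 \\ -15 & 3 & 21 \end{bmatrix}
= 3A + 3B.
\]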
c) The result in part (b) may be generalized as follows: For any two $m \times n$ matrices $E$, $F$ and any
$r \in \mathbf{R}$, $r(E + F) = rE + rF$. This principle is called the Distributive Law of Scalar Multiplication
over Matrix Addition.
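The entrywise rules of Definitions A2.3 and A2.4 can be tried out directly. The following is only a minimal sketch in Python, with matrices stored as lists of rows; the function names `mat_add` and `scalar_mul` are illustrative choices and do not come from the text. It uses the matrices A and B of Example A2.3 to check commutativity of addition and the distributive law stated in part (c).

```python
def mat_add(A, B):
    """Entrywise sum of two matrices of the same size (Definition A2.3)."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrices must have the same number of rows and columns")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def scalar_mul(r, A):
    """Scalar product rA: every entry of A is multiplied by r (Definition A2.4)."""
    return [[r * a for a in row] for row in A]


# The matrices A and B of Example A2.3.
A = [[1, 3, 4], [2, 0, 6], [1, 1, 3]]
B = [[2, -1, 6], [3, 1, 7], [4, 2, 2]]

print(mat_add(A, B))                      # [[3, 2, 10], [5, 1, 13], [5, 3, 5]]
print(mat_add(A, B) == mat_add(B, A))     # True: A + B = B + A

# Distributive law r(E + F) = rE + rF, checked here for r = 3.
left = scalar_mul(3, mat_add(A, B))
right = mat_add(scalar_mul(3, A), scalar_mul(3, B))
print(left == right)                      # True
```

A check of this kind verifies the laws only for the particular matrices tried; it does not replace the general argument.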
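EXAMPLE A2.5 a) Let $A = (a_{ij})_{3 \times 2}$ represent an arbitrary $3 \times 2$ matrix, and let
$Z = \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}$. Then
\[
A + Z = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}
= \begin{bmatrix} a_{11}+0 & a_{12}+0 \\ a_{21}+0 & a_{22}+0 \\ a_{31}+0 & a_{32}+0 \end{bmatrix}
= \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} = A.
\]
We say that $Z$ is the additive identity (or zero element) for all $3 \times 2$ matrices.
b) When $A = \begin{bmatrix} 1 & 1 \\ 2 & -3 \\ -4 & 5 \end{bmatrix}$ and $B = \begin{bmatrix} -1 & -1 \\ -2 & 3 \\ 4 & -5 \end{bmatrix}$, it follows that
\[
A + B = \begin{bmatrix} 1+(-1) & 1+(-1) \\ 2+(-2) & (-3)+3 \\ (-4)+4 & 5+(-5) \end{bmatrix}
= \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}.
\]
Consequently, we call $B = (-1)A$ the additive inverse of $A$, and also write $B = -A$.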
Hopefully, what we have done so far has proved to be somewhat interesting. But what makes the
study of matrices truly interesting and useful is the operation of matrix multiplication. If one tries
to define this operation like the componentwise operation of matrix addition, the result is of little
interest. Instead, matrix multiplication rests upon a row-and-column multiplication and summation
where, for example,
\[
\begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}
= a_1 b_1 + a_2 b_2 + a_3 b_3 = \sum_{i=1}^{3} a_i b_i.
\]
Hence, in one particular case, we have
\[
\begin{bmatrix} -1 & 4 & 3 \end{bmatrix} \begin{bmatrix} 2 \\ 1 \\ 7 \end{bmatrix}
= (-1) \cdot 2 + 4 \cdot 1 + 3 \cdot 7 = -2 + 4 + 21 = 23.
\]
In general, if $a = (a_i)_{1 \le i \le n}$ is a $1 \times n$ row vector and $b = (b_i)_{1 \le i \le n}$ is an $n \times 1$ column vector,
then $ab = \sum_{i=1}^{n} a_i b_i$. This result, which is a real number, is called the scalar product of the vectors
(or matrices) $a$ and $b$. This idea is the key we need for the following definition.
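Definition A2.5 Given the matrices $A = (a_{ij})_{m \times n}$ and $B = (b_{jk})_{n \times p}$, the (matrix) product $AB$ is the matrix
$C = (c_{ik})_{m \times p}$, where
\[
c_{ik} = a_{i1} b_{1k} + a_{i2} b_{2k} + \cdots + a_{in} b_{nk} = \sum_{j=1}^{n} a_{ij} b_{jk}, \quad \text{for all } 1 \le i \le m, \ 1 \le k \le p.
\]
Hence the entry $c_{ik}$ in the $i$th row and $k$th column of the $m \times p$ matrix $C$ is obtained from the scalar
product of the $i$th row (vector) of $A$ and the $k$th column (vector) of $B$.
The following demonstrates the result given by Definition A2.5.
\[
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{i1} & a_{i2} & \cdots & a_{in} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\begin{bmatrix}
b_{11} & b_{12} & \cdots & b_{1k} & \cdots & b_{1p} \\
b_{21} & b_{22} & \cdots & b_{2k} & \cdots & b_{2p} \\
\vdots & \vdots & & \vdots & & \vdots \\
b_{n1} & b_{n2} & \cdots & b_{nk} & \cdots & b_{np}
\end{bmatrix}
= C =
\begin{bmatrix}
c_{11} & c_{12} & \cdots & c_{1k} & \cdots & c_{1p} \\
c_{21} & c_{22} & \cdots & c_{2k} & \cdots & c_{2p} \\
\vdots & \vdots & & \vdots & & \vdots \\
c_{i1} & c_{i2} & \cdots & c_{ik} & \cdots & c_{ip} \\
\vdots & \vdots & & \vdots & & \vdots \\
c_{m1} & c_{m2} & \cdots & c_{mk} & \cdots & c_{mp}
\end{bmatrix}
\]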
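EXAMPLE A2.6 a) Consider the matrices
\[
A = (a_{ij})_{2 \times 3} = \begin{bmatrix} 1 & 2 & 1 \\ 3 & 0 & 4 \end{bmatrix} \quad \text{and} \quad
B = (b_{jk})_{3 \times 3} = \begin{bmatrix} 1 & 2 & 7 \\ 1 & 3 & 3 \\ 0 & 1 & 1 \end{bmatrix}.
\]
Then $AB = C = (c_{ik})_{2 \times 3} = \begin{bmatrix} c_{11} & c_{12} & c_{13} \\ c_{21} & c_{22} & c_{23} \end{bmatrix}$, where each entry is the scalar product of a row of $A$ with a column of $B$:
\[
\begin{aligned}
c_{11} &= 1 \cdot 1 + 2 \cdot 1 + 1 \cdot 0 = 3 && \text{(row 1 of } A \text{, column 1 of } B\text{)} \\
c_{12} &= 1 \cdot 2 + 2 \cdot 3 + 1 \cdot 1 = 9 && \text{(row 1 of } A \text{, column 2 of } B\text{)} \\
c_{13} &= 1 \cdot 7 + 2 \cdot 3 + 1 \cdot 1 = 14 && \text{(row 1 of } A \text{, column 3 of } B\text{)} \\
c_{21} &= 3 \cdot 1 + 0 \cdot 1 + 4 \cdot 0 = 3 && \text{(row 2 of } A \text{, column 1 of } B\text{)} \\
c_{22} &= 3 \cdot 2 + 0 \cdot 3 + 4 \cdot 1 = 10 && \text{(row 2 of } A \text{, column 2 of } B\text{)} \\
c_{23} &= 3 \cdot 7 + 0 \cdot 3 + 4 \cdot 1 = 25 && \text{(row 2 of } A \text{, column 3 of } B\text{)}
\end{aligned}
\]
Consequently,
\[
AB = C = \begin{bmatrix} 3 & 9 & 14 \\ 3 & 10 & 25 \end{bmatrix}.
\]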
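b) With $A$ and $B$ as in part (a) let us try to form the matrix product
\[
BA = \begin{bmatrix} 1 & 2 & 7 \\ 1 & 3 & 3 \\ 0 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 & 1 \\ 3 & 0 & 4 \end{bmatrix}.
\]
To find the entry in the first row and first column of $BA$ we want to form the scalar product
\[
\begin{bmatrix} 1 & 2 & 7 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \end{bmatrix} = 1 \cdot 1 + 2 \cdot 3 + 7 \cdot \, ?
\]
Unfortunately, we do not have enough entries in the first column of $A$, and so we cannot form
either this scalar product or the matrix product $BA$.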
Now we may find ourselves wondering why we could form the product AB but couldn’t
form the product BA. Considering the product BA once again, we see that the difficulty hinges
on the fact that the first column of A did not have the same number of entries as the first row
of B. The number of entries in the first row of B is 3, which is the number of columns in B.
The number of entries in the first column of A is 2, which is the number of rows in A. These
observations lead us to the following general result.
If $C$ is an $m \times n$ matrix and $D$ is a $p \times q$ matrix, then the product $CD$ can be formed when
$n = p$, that is, when the number of columns in $C$ (the first matrix) equals the number of rows
in $D$ (the second matrix). And when $n = p$ the resulting product $CD$ has $m$ rows and $q$ columns.
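The row-by-column rule of Definition A2.5, together with the size condition just stated, can be sketched in a few lines of Python. This is only an illustration under simple assumptions: matrices are lists of rows, and the name `mat_mul` is a hypothetical helper, not something defined in the text. The sketch recomputes the product AB of Example A2.6 and shows that the attempt to form BA is rejected.

```python
def mat_mul(C, D):
    """Product CD of an m x n matrix C and a p x q matrix D (requires n == p)."""
    n = len(C[0])          # number of columns of C
    if n != len(D):        # must equal the number of rows of D
        raise ValueError("columns of the first matrix must equal rows of the second")
    q = len(D[0])
    # Entry (i, k) is the scalar product of row i of C with column k of D.
    return [[sum(C[i][j] * D[j][k] for j in range(n)) for k in range(q)]
            for i in range(len(C))]


# The matrices of Example A2.6.
A = [[1, 2, 1], [3, 0, 4]]                 # 2 x 3
B = [[1, 2, 7], [1, 3, 3], [0, 1, 1]]      # 3 x 3

print(mat_mul(A, B))                       # [[3, 9, 14], [3, 10, 25]]

try:
    mat_mul(B, A)                          # B has 3 columns, A has only 2 rows
except ValueError as err:
    print("BA cannot be formed:", err)
```

The result of `mat_mul(A, B)` has 2 rows and 3 columns, in agreement with the general statement that an m × n matrix times an n × q matrix is m × q.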
Let us examine matrix multiplication a little further.
12 1 1 2 5 1 6 4 7
EXAMPLE A2.7 a)IfA=|1 to 3 and B=] 2 0 2
|. ten 48 = 731 1 8
4
white BA = | 4 6
|
Consequently, even though it is possible to form both matrix products AB and BA, we do not
have AB = BA. In fact, these products are not even of the same size.
—1 I 1 2 . _10 0 _|1 =!
b) For =| \ -j [ana e =| 5 frome finds that 4B = | j p jand Ba = | “1 |
So here $AB$ and $BA$ are of the same size, but $AB \neq BA$.
c) Finally, consider the matrices
2 ]
1 1 3 1 2
a=[4 4 5 |: B= (0 1 |, and c=| |:
Here we find that
1
ae-[)
fi
1 13 3]f 0 1 _ff-[7 a]
-1
om
_f—7 -1 1 2]_[-10 -10
caByc =|“) IE {=| a3 war
while
2 ] ) 5 0
BC = 0 ] 3 i = 3 —4 and
—3 —6 —2
5 0
fi 13 _f—-10 —10
aBe) =| | -1 ; | - =| 45 "|:
Hence, (AB)C = A(BC).
In general, if $m, n, p, q \in \mathbf{Z}^+$ and $A = (a_{ij})_{m \times n}$, $B = (b_{jk})_{n \times p}$, and $C = (c_{kl})_{p \times q}$, then
\[
(AB)C = A(BC),
\]
so matrix multiplication is associative (when it can be performed).
From the results in parts (a) and (b) of Example A2.7 we learn two important facts:
1) The operation of matrix multiplication is not commutative in general.
2) It is possible to find two nonzero matrices $C = (c_{ij})_{m \times n}$ ($c_{ij} \neq 0$ for some $1 \le i \le m$, $1 \le j \le n$)
and $D = (d_{jk})_{n \times p}$ ($d_{jk} \neq 0$ for some $1 \le j \le n$, $1 \le k \le p$), where $CD = Z = (0)_{m \times p}$.
In short, matrix multiplication does not necessarily behave like the multiplication of real numbers.
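Both facts, and the associative law, are easy to observe numerically. In the sketch below the matrices P, Q, and R are hypothetical examples chosen only for illustration (they are not the matrices of Example A2.7), and `mat_mul` is an assumed helper implementing Definition A2.5.

```python
def mat_mul(C, D):
    """Row-by-column product of two conformable matrices (lists of rows)."""
    if len(C[0]) != len(D):
        raise ValueError("columns of the first matrix must equal rows of the second")
    return [[sum(C[i][j] * D[j][k] for j in range(len(D))) for k in range(len(D[0]))]
            for i in range(len(C))]


# Hypothetical 2 x 2 matrices, chosen for illustration only.
P = [[1, 1], [2, 2]]
Q = [[1, 1], [-1, -1]]
R = [[0, 1], [2, 3]]

print(mat_mul(P, Q))   # [[0, 0], [0, 0]]: a zero product of two nonzero matrices
print(mat_mul(Q, P))   # [[3, 3], [-3, -3]]: so PQ and QP differ

# Associativity: (QP)R and Q(PR) agree.
print(mat_mul(mat_mul(Q, P), R) == mat_mul(Q, mat_mul(P, R)))   # True
```

Here neither P nor Q is the zero matrix, yet PQ is, which cannot happen for products of nonzero real numbers.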
Now that we've made some comparisons between matrix multiplication and the multiplication of
real numbers, let us pursue a few more.
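EXAMPLE A2.8 a) When we consider square matrices (in particular, $2 \times 2$ matrices) we learn that
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix}
= \begin{bmatrix} a & b \\ c & d \end{bmatrix}.
\]
Consequently, the matrix $I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ is called the multiplicative identity for all $2 \times 2$ matrices.
In general, for a fixed positive integer $n > 1$, the matrix
\[
I_n = (\delta_{ij})_{n \times n}, \quad \text{where} \quad \delta_{ij} = \begin{cases} 1, & \text{if } i = j \\ 0, & \text{if } i \neq j \end{cases}
\]
is the multiplicative identity for all $n \times n$ matrices.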
b) Returning to the real numbers, let us recall that for each $x \in \mathbf{R}$, if $x \neq 0$, then there exists $y \in \mathbf{R}$
where $xy = yx = 1$. This real number $y$ is termed the multiplicative inverse of $x$ and is often
designated by $x^{-1}$.
We would like to know if there is a similar situation for square matrices, and we shall
concentrate on $2 \times 2$ matrices.
If $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, where $a$, $b$, $c$, $d$ are fixed real numbers, can we find a matrix $B = \begin{bmatrix} w & x \\ y & z \end{bmatrix}$
so that $AB = BA = I_2$? (Here $w$, $x$, $y$, $z$ are unknown real numbers and our objective is to
determine the values of these four numbers in terms of the given real numbers $a$, $b$, $c$, $d$.)
Forming the product $AB$ we find that
\[
AB = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} w & x \\ y & z \end{bmatrix}
= \begin{bmatrix} aw + by & ax + bz \\ cw + dy & cx + dz \end{bmatrix}.
\]
For $AB$ to equal $I_2$, that is, for
\[
\begin{bmatrix} aw + by & ax + bz \\ cw + dy & cx + dz \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},
\]
it follows from the definition of equality of matrices that
(1) $aw + by = 1$ (3) $ax + bz = 0$
(2) $cw + dy = 0$ (4) $cx + dz = 1$.
Focusing on Eqs. (1) and (2), if we multiply Eq. (1) by $d$ and Eq. (2) by $b$, we find that
(1)' $adw + bdy = d$ (2)' $bcw + bdy = 0$.
Subtracting Eq. (2)' from Eq. (1)', we learn that $adw - bcw = (ad - bc)w = d$, so
$w = d/(ad - bc)$, if $ad - bc \neq 0$. Similar calculations yield $x = -b/(ad - bc)$,
$y = -c/(ad - bc)$, $z = a/(ad - bc)$, and these formulas are also valid as long as $ad - bc \neq 0$.
[Note: (1) The real number $ad - bc$ is called the determinant of the matrix $A$. (2) Although
we determined the values for $w$, $x$, $y$, and $z$ from the equation $AB = I_2$, it can be shown that
the same solutions result when we deal with the equation $BA = I_2$.]
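c) Using the results in part (b), let $A = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}$.
Then with $ad - bc = 1 \cdot 1 - 2 \cdot 0 = 1$ ($\neq 0$), it follows that $w = 1/1 = 1$, $x = -2/1 = -2$,
$y = -0/1 = 0$, $z = 1/1 = 1$, and
\[
\begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & -2 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & -2 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}.
\]
Under these circumstances we write $A^{-1} = \begin{bmatrix} 1 & -2 \\ 0 & 1 \end{bmatrix}$.
d) Consider the matrix $A_1 = \begin{bmatrix} 3 & 1 \\ 1 & 2 \end{bmatrix}$, where the determinant of $A_1 = 3 \cdot 2 - 1 \cdot 1 = 5$ ($\neq 0$).
Here we find that
\[
A_1^{-1} = \begin{bmatrix} 3 & 1 \\ 1 & 2 \end{bmatrix}^{-1}
= \begin{bmatrix} 2/5 & -1/5 \\ -1/5 & 3/5 \end{bmatrix}
= \frac{1}{5} \begin{bmatrix} 2 & -1 \\ -1 & 3 \end{bmatrix}.
\]
e) From parts (b), (c), and (d), we can say that if $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, then
\[
A^{-1} = \frac{1}{\det(A)} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}
\]
when $\det(A)$ = determinant of $A$ = $ad - bc \neq 0$.
f) For the matrix $A_2 = \begin{bmatrix} 1 & 2 \\ 3 & 6 \end{bmatrix}$ one finds that the determinant of $A_2 = 1 \cdot 6 - 2 \cdot 3 = 0$, so in
this case there is no multiplicative inverse, that is, $A_2^{-1}$ does not exist.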
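The formula of part (e) translates almost word for word into code. The sketch below is one possible rendering in Python, using exact fractions so that entries such as 2/5 are not rounded; the function names `det_2x2` and `inverse_2x2` are illustrative assumptions, and returning `None` for a zero determinant is merely one convenient convention.

```python
from fractions import Fraction


def det_2x2(A):
    """Determinant ad - bc of the 2 x 2 matrix [[a, b], [c, d]]."""
    (a, b), (c, d) = A
    return a * d - b * c


def inverse_2x2(A):
    """Inverse (1/det(A)) [[d, -b], [-c, a]], or None when det(A) = 0."""
    (a, b), (c, d) = A
    det = det_2x2(A)
    if det == 0:
        return None                      # no multiplicative inverse exists
    f = Fraction(1) / det                # exact value of 1/det(A)
    return [[d * f, -b * f], [-c * f, a * f]]


A1 = [[3, 1], [1, 2]]                    # part (d): determinant 5
A2 = [[1, 2], [3, 6]]                    # part (f): determinant 0

print(inverse_2x2(A1))                   # entries 2/5, -1/5, -1/5, 3/5
print(inverse_2x2(A2))                   # None
```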
At this point we have developed some fundamental ideas about matrices, and the reader may be
wondering how one might use these mathematical structures. Therefore we return one more time to
the real numbers and some of the ideas we encountered in elementary algebra.
When the equation $2x = 3$ is solved the following list of equations may be written:
\[
\begin{aligned}
2x &= 3 && (1) \\
\left(\tfrac{1}{2}\right)(2x) &= \left(\tfrac{1}{2}\right)(3) && (2) \\
\left[\left(\tfrac{1}{2}\right) \cdot 2\right] x &= 3/2 && (3) \\
1 \cdot x &= 3/2 && (4) \\
x &= 3/2 && (5)
\end{aligned}
\]
And in solving this equation the real number $1/2$ ($= 2^{-1}$), introduced in Eq. (2), is what we need to
"get the unknown, x, by itself" as we progress through steps (3) and (4) and get to step (5). So, in
general, if we start with the fixed real numbers $a$, $b$, where $a \neq 0$, then the equation $ax = b$ has the
solution $x = a^{-1}b$.
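Now let us consider the system of linear equations:
\[
\begin{aligned}
3x + y &= 3 \\
x + 2y &= 7,
\end{aligned}
\qquad (*)
\]
which can be represented in matrix form as
\[
\begin{bmatrix} 3 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 3 \\ 7 \end{bmatrix}.
\]
[This way of representing a system of linear equations is helpful in understanding the reason behind
the definition of matrix multiplication. For the left-hand side of each equation at (*) is the scalar
product of a row from the matrix $\begin{bmatrix} 3 & 1 \\ 1 & 2 \end{bmatrix}$ with the column matrix $\begin{bmatrix} x \\ y \end{bmatrix}$.] If we let
\[
A = \begin{bmatrix} 3 & 1 \\ 1 & 2 \end{bmatrix}, \quad X = \begin{bmatrix} x \\ y \end{bmatrix}, \quad \text{and} \quad B = \begin{bmatrix} 3 \\ 7 \end{bmatrix},
\]
then we are seeking a solution for the (matrix) equation $AX = B$. Could the solution here be
$X = A^{-1}B$, considering that it was $x = a^{-1}b$ for the earlier equation $ax = b$?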
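Since the determinant of $A = 3 \cdot 2 - 1 \cdot 1 = 5 \neq 0$, from part (e) of Example A2.8 we know that
\[
A^{-1} = \frac{1}{5} \begin{bmatrix} 2 & -1 \\ -1 & 3 \end{bmatrix} = \begin{bmatrix} 2/5 & -1/5 \\ -1/5 & 3/5 \end{bmatrix}.
\]
Then we find that
\[
\begin{aligned}
\begin{bmatrix} 3 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} &= \begin{bmatrix} 3 \\ 7 \end{bmatrix} && (1) \\
\begin{bmatrix} 2/5 & -1/5 \\ -1/5 & 3/5 \end{bmatrix} \left( \begin{bmatrix} 3 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \right) &= \begin{bmatrix} 2/5 & -1/5 \\ -1/5 & 3/5 \end{bmatrix} \begin{bmatrix} 3 \\ 7 \end{bmatrix} && (2) \\
\left( \begin{bmatrix} 2/5 & -1/5 \\ -1/5 & 3/5 \end{bmatrix} \begin{bmatrix} 3 & 1 \\ 1 & 2 \end{bmatrix} \right) \begin{bmatrix} x \\ y \end{bmatrix} &= \begin{bmatrix} -1/5 \\ 18/5 \end{bmatrix} && (3) \\
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} &= \begin{bmatrix} -1/5 \\ 18/5 \end{bmatrix} && (4) \\
\begin{bmatrix} x \\ y \end{bmatrix} &= \begin{bmatrix} -1/5 \\ 18/5 \end{bmatrix} && (5)
\end{aligned}
\]
From Definition A2.2 it then follows from the solution $\begin{bmatrix} x \\ y \end{bmatrix} = X = A^{-1}B = \begin{bmatrix} -1/5 \\ 18/5 \end{bmatrix}$ that
$x = -1/5$ and $y = 18/5$.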
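In general, if $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$ and $B = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$ with $a_{11}, a_{12}, a_{21}, a_{22}, b_1, b_2 \in \mathbf{R}$ and
$\det(A) = a_{11}a_{22} - a_{12}a_{21} \neq 0$, then the solution of the system of linear equations,
\[
\begin{aligned}
a_{11} x + a_{12} y &= b_1 \\
a_{21} x + a_{22} y &= b_2,
\end{aligned}
\]
is given by
\[
\begin{bmatrix} x \\ y \end{bmatrix} = A^{-1}B
= \frac{1}{\det(A)} \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}
= \begin{bmatrix} (1/\det(A))(a_{22}b_1 - a_{12}b_2) \\ (1/\det(A))(-a_{21}b_1 + a_{11}b_2) \end{bmatrix}.
\]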
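The closed-form solution above can be checked against the system (*). The following Python sketch assumes integer coefficients, so that `fractions.Fraction` can produce exact answers; the name `solve_2x2` and its argument order are illustrative choices, not notation from the text.

```python
from fractions import Fraction


def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a11*x + a12*y = b1, a21*x + a22*y = b2 when det(A) != 0.

    Integer coefficients are assumed so that the result is an exact fraction.
    """
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("det(A) = 0: the formula X = A^(-1) B does not apply")
    x = Fraction(a22 * b1 - a12 * b2, det)
    y = Fraction(-a21 * b1 + a11 * b2, det)
    return x, y


# The system 3x + y = 3, x + 2y = 7.
x, y = solve_2x2(3, 1, 1, 2, 3, 7)
print(x, y)                              # -1/5 18/5

# Substituting back confirms both equations.
print(3 * x + y == 3, x + 2 * y == 7)    # True True
```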
Furthermore, although we cannot prove our next result, the following is true for $n \in \mathbf{Z}^+$, $n \ge 2$.
If $A = (a_{ij})_{n \times n}$ is a real matrix (which has a multiplicative inverse), and $B = (b_i)_{1 \le i \le n}$,
$X = (x_i)_{1 \le i \le n}$ are $n \times 1$ column matrices (like those defined earlier for $n = 2$), then the resulting system
of linear equations
\[
AX = B
\]
has the solution
\[
X = A^{-1}B.
\]
Now although we shall not deal with the inverses of any matrices larger than $2 \times 2$, we close this
appendix with some further results on larger determinants.
We already know that for $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ the determinant of $A = \det(A) = ad - bc$. The $\det(A)$
is generally denoted by $\begin{vmatrix} a & b \\ c & d \end{vmatrix}$. In order to deal with the determinants of larger matrices we need
the following idea.
Definition A2.6 Let $A = (a_{ij})_{n \times n}$ with $n \ge 3$. For all $1 \le i \le n$ and $1 \le j \le n$, the minor associated with $a_{ij}$ is the
$(n - 1) \times (n - 1)$ determinant obtained from matrix $A$ after we delete the $i$th row and $j$th column
of $A$.
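EXAMPLE A2.9 a) For $A = \begin{bmatrix} 1 & 0 & 2 \\ 3 & 4 & 6 \\ -1 & 3 & 7 \end{bmatrix}$, we find that
1) the minor associated with 0 is obtained from $A$ by deleting its first row and second column:
\[
\begin{bmatrix} 1 & 0 & 2 \\ 3 & 4 & 6 \\ -1 & 3 & 7 \end{bmatrix} \quad \text{leads us to} \quad \begin{vmatrix} 3 & 6 \\ -1 & 7 \end{vmatrix}; \text{ and}
\]
2) for $a_{23} = 6$ the minor is $\begin{vmatrix} a_{11} & a_{12} \\ a_{31} & a_{32} \end{vmatrix} = \begin{vmatrix} 1 & 0 \\ -1 & 3 \end{vmatrix}$.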
b) Given the 4 X 4 matrix
4 -3 2
6 9 4 0
the minor associated with 3 is the $3 \times 3$ determinant
\[
\begin{vmatrix} 2 & 0 & 6 \\ -3 & -2 & 5 \\ 9 & 4 & 0 \end{vmatrix},
\]
obtained from the matrix $B$ by deleting the second row and first column of $B$ (and replacing
the matrix brackets by the vertical bars for determinants).
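Given a matrix $A = (a_{ij})_{3 \times 3}$, for all $1 \le i \le 3$, $1 \le j \le 3$, we shall let $M_{ij}$ denote the minor
associated with $a_{ij}$. Then
\[
\det(A) = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
= a_{11}(-1)^{1+1} M_{11} + a_{12}(-1)^{1+2} M_{12} + a_{13}(-1)^{1+3} M_{13}
\]
\[
= a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}
- a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}
+ a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix},
\]
and we say that we are evaluating $\det(A)$ by using an expansion by minors.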
In this way we reduce the problem to 2 X 2 determinants that we know how to evaluate. Let us
examine a particular example.
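EXAMPLE A2.10 a)
\[
\begin{vmatrix} 2 & 4 & -7 \\ 3 & 8 & 2 \\ 5 & 6 & 0 \end{vmatrix}
= 2(-1)^{1+1} \begin{vmatrix} 8 & 2 \\ 6 & 0 \end{vmatrix}
+ 4(-1)^{1+2} \begin{vmatrix} 3 & 2 \\ 5 & 0 \end{vmatrix}
+ (-7)(-1)^{1+3} \begin{vmatrix} 3 & 8 \\ 5 & 6 \end{vmatrix}
\]
\[
= 2(8 \cdot 0 - 2 \cdot 6) - 4(3 \cdot 0 - 2 \cdot 5) - 7(3 \cdot 6 - 8 \cdot 5)
= 2(-12) - 4(-10) - 7(-22) = 170.
\]
[Note: In this expansion by minors we find a sum that uses each entry $a_{1j}$, for $1 \le j \le 3$, in the
first row of the determinant, and each such entry is multiplied by two terms:
1) $(-1)^{1+j}$, where the exponent $1 + j$ is the sum of the row number and column number for
$a_{1j}$; and
2) its associated minor $M_{1j}$.]
b) The reader may be wondering what is so special about the first row of a determinant. For suppose
we expand the determinant in part (a) by the third column. The resulting expansion is
\[
\sum_{i=1}^{3} a_{i3}(-1)^{i+3} M_{i3}
= (-7)(-1)^{1+3} \begin{vmatrix} 3 & 8 \\ 5 & 6 \end{vmatrix}
+ 2(-1)^{2+3} \begin{vmatrix} 2 & 4 \\ 5 & 6 \end{vmatrix}
+ 0(-1)^{3+3} \begin{vmatrix} 2 & 4 \\ 3 & 8 \end{vmatrix}
\]
\[
= (-7)(3 \cdot 6 - 8 \cdot 5) - 2(2 \cdot 6 - 4 \cdot 5) + 0(2 \cdot 8 - 4 \cdot 3)
= (-7)(-22) - 2(-8) = 170.
\]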
c) What has happened in parts (a) and (b) is not just a mere coincidence. In general, for any $3 \times 3$
matrix $A$, the determinant of $A$ can be evaluated by expanding along any one (fixed) row or
down any one (fixed) column. And this method extends to larger square matrices: for
$n \in \mathbf{Z}^+$ where $n \ge 4$, an $n \times n$ determinant can be expanded, along any one of its $n$ rows or
down any one of its $n$ columns, into $n$ summands each of which involves an $(n - 1) \times (n - 1)$
determinant.
If $A = (a_{ij})_{n \times n}$, where $n \ge 3$, then
\[
\det(A) = \sum_{j=1}^{n} a_{ij}(-1)^{i+j} M_{ij} \quad \text{[expansion across the (fixed) } i\text{th row]}
\]
\[
= \sum_{i=1}^{n} a_{ij}(-1)^{i+j} M_{ij} \quad \text{[expansion down the (fixed) } j\text{th column]}.
\]
d) From part (c) we now realize that if $A = (a_{ij})_{n \times n}$, for any $n \ge 3$, then if $A$ has a row or column
where every entry is 0, it follows that the determinant of $A$ is 0.
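Expansion by minors is naturally recursive, since each minor is itself a smaller determinant. The sketch below shows one way this might be written in Python; the function names are illustrative, and rows and columns are numbered from 0 rather than 1, which leaves the sign $(-1)^{i+j}$ unchanged because both indices shift by one. It evaluates the determinant of Example A2.10 along every row and down every column.

```python
def minor_matrix(A, i, j):
    """The matrix left after deleting row i and column j of A (0-based indices)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]


def det(A):
    """Determinant of a square matrix, expanding along its first row."""
    if len(A) == 1:
        return A[0][0]
    if len(A) == 2:
        return A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return expand_along_row(A, 0)


def expand_along_row(A, i):
    """Sum over j of a_ij * (-1)^(i+j) * M_ij for the fixed row i."""
    return sum(A[i][j] * (-1) ** (i + j) * det(minor_matrix(A, i, j))
               for j in range(len(A)))


def expand_down_column(A, j):
    """Sum over i of a_ij * (-1)^(i+j) * M_ij for the fixed column j."""
    return sum(A[i][j] * (-1) ** (i + j) * det(minor_matrix(A, i, j))
               for i in range(len(A)))


# The determinant of Example A2.10.
A = [[2, 4, -7], [3, 8, 2], [5, 6, 0]]
print([expand_along_row(A, i) for i in range(3)])     # [170, 170, 170]
print([expand_down_column(A, j) for j in range(3)])   # [170, 170, 170]
```

This recursion performs on the order of n! multiplications, so it is suitable only for small matrices such as those treated in this appendix.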
REFERENCES
The ideas presented in this appendix (and its corresponding exercises) should provide a sufficient
background for what is needed in the way of matrices and determinants in this text. For the reader
who would like to learn more about this area of mathematics, any one of the following should serve
as a good starting point.
1. Anton, Howard, and Rorres, Chris. Elementary Linear Algebra with Applications. New York:
Wiley, 1987.
2. Lay, David C. Linear Algebra and Its Applications, 3rd ed. Boston, Mass.: Addison-Wesley,
2003.
3. Strang, Gilbert. Linear Algebra and Its Applications, 3rd ed. San Diego, Calif.: Harcourt Brace
Jovanovich, 1988.
-1 4
EXERCISES A.2 4, Let A= 1 2], B= 7 4 | and C=
a 13 5
tet a=| 3} e-[i I | and C=
0 3 1 2 4 5 3 a | Show that (a) AB+ AC = A(B +C);
| oo!
5
a) A+B
4
3 | Find each of the following.
b) (A+ B)4+C
and (b) BA+CA=(B+4+C)A.
[In general, ifA isan m X n matrix and B, C aren X pma-
trices, then AB + AC = A(B+C). Forn X p matrices B, C
ec) B+C d) A+(B+C) and a p X q matrix A, it follows that BA +CA=(B+C)A.
e) 2A f) 2A4+3B These two results are called the Distributive Laws for Matrix
Multiplication over Matrix Addition.]
g) 2C +3C h) SC (= (2+. 3)C)
5. Find the multiplicative inverse of each of the following
i) 2B —4C (= 2B + (-4)C) matrices if the multiplicative inverse exists.
of i » LT o|
j) A+2B-3C 1 2 0 1
k) 2(3B) 1) (2-3)B
olsa} ef a
2. Solve for a, b, c, d if —3 1 7 -3
fe ale[s S]-2[s 3] 6. Solve each of the following matrix equations for the 2 x 2
3. Perform the following matrix multiplications. matrix A.
a)[1 3 7]
—2
0 e[2 s}e=[i 3]
b) || ) 3 |
2
1
2 5
4 mE sde-[p t]-Li 7]
3. 6 1 1 _{|-1 2 .
7. it a=| >| and e=| 7) a determine the
) 1 —2 2 6 following.
“lo 3116 8
a) A! b) Bo! c) AB
r 1 1 —-l 3 0 4
d) (AB)"! e) B-'A7!
d)} 2 -2 3 —-1 0 6
| 4 0 —5 7 7 2 8. Evaluate the following 2 < 2 determinants:
r1 0 0 a bee 1 2 5 10
e)} 0 1 O de f Ml3 4 D3 4
| 0 0 3 g h i 5 2 5 10
ri 0 0 ah e¢ Olis 4 | qd) | 15 20
f); 0 0 3 de 9. Solve the following systems of linear equations by using
| Oo 1 O gohii matrices:
a) 3x —2y =5 b) 5x + 3y = 35 b) State a general result suggested by the answers in
4x —3y =6 3x —2y =2 part (a).
b . 15. a) Evaluate each of the following 3 X 3 determinants.
10. Leta, b,c, d€
R with | é | = 7. Determine the value
c ad 1 2 1
of each of the following. i) |O -1 -!I
3a 3b 3a b 2 3 0
DV) b) | 3c d | 5 2 1
a b 3a 3b ii) 0 -1 -!l
V1 30 3a Ml 3. 3a | 10 3 0
11. Let A be a 2X2 matrix with det(A) = 31. What is
5 2 5
det(2A)? What is det(SA)?
iii) 0 -1 —5
12. Expand each of the following determinants across the spec- 10 3 0
ified row as well as down the specified column.
ab e¢
1 0 -2
b) Leta, b,c,d,e, fig, hieR If|d e f |}=17,
a)/}3 1 —1 |; row 2 andcolumn 3 gh i
4 | 2 evaluate
1 1 2 3a b ¢
b) | 2 3 —4 |; row 1 and column 2 ) | 3d e f
Q 5 7 3g ho i
13. Expand each of the following determinants across any row 3a be
or down any column. ii) | 9d 3e 6f
1 0 2 4 7 0 1 2 -4 3g hh 2
a)
|6 —2 1} bb} 4 2 O c)/0 1 0
4 2a 2b 2c
3 2 3.6 2 3 3 2
iii) | 3d 3¢e 3f
14, a) Evaluate each of the following 3 X 3 determinants. Se Sh Si
1 2 ada 16. Let A = (4,))nx, and B = (b,,)nx, be two matrices. When
i) 1 3 li) |b e b the matrix product AB is formed, as defined in Definition A2.5,
1 4 c f ¢ how many multiplications (of entries) are performed? How
many additions (of entry-products) are performed?
2 4 de f
iii) 3. -l iv) |a bie
2 4 ab ec
Appendix 3
Countable and Uncountable Sets
In Example 3.2 of Section 3.1 we informally mention the ideas of what we feel are a finite set and
an infinite set. This final appendix will deal with these issues in a more rigorous manner and will
help us attach some meaning to $|A|$ (the size, or cardinality, of a set $A$) when $A$ is an infinite set. To
develop these notions more precisely let us recall the following concept that was first introduced in
Section 5.6.
Definition A3.1 For any nonempty sets $A$, $B$ the function $f: A \to B$ is called a one-to-one correspondence if $f$ is
both one-to-one and onto.
EXAMPLE A3.1 Let $A = \mathbf{Z}^+$ and $B = 2\mathbf{Z}^+ = \{2k \mid k \in \mathbf{Z}^+\} = \{2, 4, 6, \ldots\}$. The function $f: A \to B$, defined by
$f(x) = 2x$, is a one-to-one correspondence:
1) For $a_1, a_2 \in A$, we have $f(a_1) = f(a_2) \Rightarrow 2a_1 = 2a_2 \Rightarrow a_1 = a_2$, so $f$ is one-to-one.
2) If $b \in B$, then $b = 2a$ for some (unique) $a \in A$, and $f(a) = 2a = b$, making $f$ onto.
The result in Example A3.1 now leads us to consider the following.
Definition A3.2 If $A$, $B$ are two nonempty sets, we say that $A$ has the same size, or cardinality, as $B$ and we write
$A \sim B$, if there exists a one-to-one correspondence $f: A \to B$.
From Example A3.1 we see that $\mathbf{Z}^+$ has the same size as $2\mathbf{Z}^+$, even though it seems that $2\mathbf{Z}^+$ has
fewer elements than $\mathbf{Z}^+$; after all, we do know that $2\mathbf{Z}^+ \subset \mathbf{Z}^+$.
If we define $g: B \to A$ (for $B = 2\mathbf{Z}^+$ and $A = \mathbf{Z}^+$) by $g(2k) = k$, then
1) $g(2k_1) = g(2k_2) \Rightarrow k_1 = k_2 \Rightarrow 2k_1 = 2k_2$, establishing that $g$ is one-to-one; and
2) for each $k \in A$, we have $2k \in B$ with $g(2k) = k$, so $g$ is also an onto function.
Consequently, $g$ is a one-to-one correspondence and $B \sim A$.
So at least in the case of $A = \mathbf{Z}^+$ and $B = 2\mathbf{Z}^+$ we find that $A \sim B$ and $B \sim A$ (even though
$B \subset A$). But, in reality, what has happened in this one situation holds true in general. For the function
$g$ just defined is actually the function $f^{-1}$ for $f$ in Example A3.1. And we learned in Theorem 5.8
that a function is invertible if and only if it is both one-to-one and onto. Consequently, whenever there
are two nonempty sets $A$, $B$ with $A \sim B$ then it follows from Theorem 5.8 that $B \sim A$, so we can say
that $A$ and $B$ have the same cardinality and denote this by $|A| = |B|$. (Note: It does not necessarily
follow that $A = B$.)
Let us consider another example.
EXAMPLE A3.2 For $B = 2\mathbf{Z}^+ = \{2k \mid k \in \mathbf{Z}^+\}$ and $C = 3\mathbf{Z}^+ = \{3k \mid k \in \mathbf{Z}^+\}$, the function $h: B \to C$ defined by
$h(2k) = 3k$ establishes a one-to-one correspondence between $B$ and $C$. Therefore we have $B \sim C$
(and $C \sim B$, and $|B| = |C|$). Furthermore, using the function $f: A \to B$ that was defined in Example
A3.1, where $A = \mathbf{Z}^+$, by virtue of Theorem 5.5 we know that $h \circ f: A \to C$ is also a one-to-one
correspondence. So $A \sim C$ (and $C \sim A$, and $|A| = |C|$).
What we have learned up to this point can be summarized as part of the following result.
THEOREM A3.1 For all nonempty sets $A$, $B$, $C$,
a) $A \sim A$;
b) if $A \sim B$, then $B \sim A$; and
c) if $A \sim B$ and $B \sim C$, then $A \sim C$.
Proof:
a) Given any nonempty set $A$, it follows that $A \sim A$ because the identity function $1_A: A \to A$ is a
one-to-one correspondence.
b) If $A \sim B$, then there exists a one-to-one correspondence $f: A \to B$. But then $f^{-1}: B \to A$ is
also a one-to-one correspondence and we have $B \sim A$.
c) When $A \sim B$ and $B \sim C$ there exist one-to-one correspondences $f: A \to B$ and $g: B \to C$.
Since $g \circ f: A \to C$ is also a one-to-one correspondence, it follows that $A \sim C$.
We shall now use the ideas developed so far in order to define what we shall mean by a finite set
and by an infinite set.
Definition A3.3 Any set $A$ is called a finite set if $A = \emptyset$ or if $A \sim \{1, 2, 3, \ldots, n\}$ for some $n \in \mathbf{Z}^+$. When $A = \emptyset$ we
say that $A$ has no elements and write $|A| = 0$. In the latter case $A$ is said to have $n$ elements and we
write $|A| = n$. When a set $A$ is not finite then it is called infinite.
From this definition we see that if $A$ is a nonempty finite set then there is a one-to-one correspondence
$g: \{1, 2, 3, \ldots, n\} \to A$ for some $n \in \mathbf{Z}^+$. This function $g$ provides a listing of the elements of
$A$ as $g(1), g(2), \ldots, g(n)$, a listing where we can count (or account for) a first element, a second
element, ..., and so on, up to an $n$th (last) element.
Also when $A$ is an infinite set we see that there is no $n \in \mathbf{Z}^+$ for which we can find a one-to-one
correspondence $f: A \to \{1, 2, 3, \ldots, n\}$. But if $A$, $B$ are both infinite sets, can we conclude
automatically that $|A| = |B|$, that is, that there is a one-to-one correspondence between $A$ and $B$?
This is the question we shall answer, in the negative, as we continue our discussion. For now we
introduce the following concept.
Definition A3.4 A set $A$ is called countable (or denumerable) if (1) $A$ is finite or (2) $A \sim \mathbf{Z}^+$.
We have seen that $2\mathbf{Z}^+ \sim \mathbf{Z}^+$ and $3\mathbf{Z}^+ \sim \mathbf{Z}^+$, and since $\mathbf{Z}^+ \sim \mathbf{Z}^+$, it follows that the sets $\mathbf{Z}^+$, $2\mathbf{Z}^+$,
and $3\mathbf{Z}^+$ are all countable sets. In fact, for all $k \in \mathbf{Z}$, $k \neq 0$, the function $f: \mathbf{Z}^+ \to k\mathbf{Z}^+$, defined by
$f(x) = kx$, is a one-to-one correspondence so $k\mathbf{Z}^+$ is countable (and $|k\mathbf{Z}^+| = |\mathbf{Z}^+|$). Consequently,
the set of all negative integers, that is, $(-1)\mathbf{Z}^+$, is a countable set.
Furthermore, whenever $A$ is infinite and $A \sim \mathbf{Z}^+$, we also have $\mathbf{Z}^+ \sim A$, so there is a one-to-one
correspondence $f: \mathbf{Z}^+ \to A$ which provides a listing of the elements of $A$ (namely, $f(1), f(2),
f(3), \ldots$) and in this way we can count (but never finish counting) the elements in $A$.
Finally, as noted above, whenever $A \sim \mathbf{Z}^+$ we have $\mathbf{Z}^+ \sim A$. Consequently, a given set $A$ can be
shown to be countably infinite (that is, both infinite and countable) by finding either a one-to-one
correspondence $f: A \to \mathbf{Z}^+$ or a one-to-one correspondence $g: \mathbf{Z}^+ \to A$.
EXAMPLE A3.3 Since $\mathbf{Z}^+$, $(-1)\mathbf{Z}^+$, and $\{0\}$ are all countable, is $\mathbf{Z} = \mathbf{Z}^+ \cup (-1)\mathbf{Z}^+ \cup \{0\}$ countable?
Consider the function $f: \mathbf{Z}^+ \to \mathbf{Z}$ defined by
\[
f(x) = \begin{cases} x/2, & \text{for } x \text{ even} \\ -(x - 1)/2, & \text{for } x \text{ odd.} \end{cases}
\]
Here we find, for example, that
\[
f(4) = 4/2 = 2 \quad \text{and} \quad f(3) = -(3 - 1)/2 = -2/2 = -1.
\]
We claim that $f$ is a one-to-one correspondence where $f(2\mathbf{Z}^+) = \mathbf{Z}^+$ and $f(\mathbf{Z}^+ - 2\mathbf{Z}^+) =
(-1)\mathbf{Z}^+ \cup \{0\}$. For suppose that $a, b \in \mathbf{Z}^+$ with $f(a) = f(b)$.
1) If $a$, $b$ are both even, then $f(a) = f(b) \Rightarrow a/2 = b/2 \Rightarrow a = b$.
2) If $a$, $b$ are both odd, then $f(a) = f(b) \Rightarrow -(a - 1)/2 = -(b - 1)/2 \Rightarrow a - 1 = b - 1 \Rightarrow a = b$.
3) If $a$ is even and $b$ odd, then $f(a) = f(b) \Rightarrow a/2 = -(b - 1)/2 \Rightarrow a = -b + 1 \Rightarrow a - 1 = -b$,
with $a - 1 \ge 1$ and $-b < 0$. Hence this case cannot occur, nor can the case where $a$ is
odd and $b$ even.
Consequently, the function $f$ is at least one-to-one.
Furthermore, for all $y \in \mathbf{Z}$,
1) if $y = 0$, then $f(1) = 0$;
2) if $y > 0$, then $2y \in \mathbf{Z}^+$ and $f(2y) = 2y/2 = y$; and
3) if $y < 0$, then $-2y + 1 \in \mathbf{Z}^+$ and $f(-2y + 1) = -[(-2y + 1) - 1]/2 = -(-2y)/2 = y$.
So $f$ is also an onto function and $f: \mathbf{Z}^+ \to \mathbf{Z}$ is a one-to-one correspondence. Hence $\mathbf{Z}$ is countable.
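The listing of the integers that this function produces can be watched directly. The Python sketch below implements $f$ together with the inverse read off from the onto argument above; the name `f_inverse` is an illustrative choice. Checking finitely many values is, of course, only a sanity test and not a proof that $f$ is a one-to-one correspondence.

```python
def f(x):
    """The function f: Z+ -> Z of Example A3.3."""
    if x < 1:
        raise ValueError("f is defined only for positive integers")
    return x // 2 if x % 2 == 0 else -((x - 1) // 2)


def f_inverse(y):
    """The positive integer x with f(x) = y, taken from the onto argument."""
    return 2 * y if y > 0 else -2 * y + 1


print([f(x) for x in range(1, 8)])        # [0, 1, -1, 2, -2, 3, -3]

# On an initial segment, f_inverse undoes f and f undoes f_inverse.
print(all(f_inverse(f(x)) == x for x in range(1, 1001)))      # True
print(all(f(f_inverse(y)) == y for y in range(-500, 501)))    # True
```

The printed list shows the order in which the integers are counted: 0 first, then each positive integer followed by its negative.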
Although all of our examples of countably infinite sets have been subsets of Z, other countably
infinite sets are possible.
EXAMPLE A3.4 Let $A = \{1, 1/2, 1/3, 1/4, \ldots\} = \{1/n \mid n \in \mathbf{Z}^+\}$. The function $f: \mathbf{Z}^+ \to A$ defined by $f(n) = 1/n$
establishes a one-to-one correspondence between $\mathbf{Z}^+$ and $A$. Hence $|\mathbf{Z}^+| = |A|$ and $A$ is countable.
In order to take our development on countable sets one step further we now introduce the following
definition.
Definition A3.5 For $n \in \mathbf{Z}^+$, a finite sequence of $n$ terms is a function $f$ whose domain is $\{1, 2, 3, \ldots, n\}$. Such a
sequence is usually written as an ordered set $\{x_1, x_2, x_3, \ldots, x_n\}$, where $x_i = f(i)$ for all $1 \le i \le n$.
An infinite sequence is a function $g$ having $\mathbf{Z}^+$ as its domain. This type of sequence is generally
denoted by the ordered set $\{x_i\}_{i \in \mathbf{Z}^+}$ or $\{x_1, x_2, x_3, \ldots\}$, where $x_i = g(i)$ for all $i \in \mathbf{Z}^+$.
EXAMPLE A3.5 a) The set $\{1, 1/2, 1/4, 1/8, 1/16\}$ can be thought of as a finite sequence given by the function
$f: A \to \mathbf{Q}$ where $A = \{1, 2, 3, 4, 5\}$ and $f(n) = 2^{-n+1}$.
b) The set $A$ in Example A3.4 can also be expressed as $\{1/n\}_{n \in \mathbf{Z}^+}$, an infinite sequence given
by the function $g: \mathbf{Z}^+ \to \mathbf{Q}^+$, where $g(n) = 1/n$ for each $n \in \mathbf{Z}^+$.
c) The terms in a sequence need not be distinct. For instance, let $f: \mathbf{Z}^+ \to \mathbf{Z}$, where $x_n = f(n) =
(-1)^{n+1}$, for each positive integer $n$. Then $\{x_n\}_{n \in \mathbf{Z}^+} = \{x_1, x_2, x_3, x_4, x_5, \ldots\} = \{1, -1, 1, -1,
1, \ldots\}$, but the range of $f$ is only the two-element set $\{1, -1\}$.
Our next result ties together the concepts introduced in Definitions A3.4 and A3.5.
THEOREM A3.2 If $A$ is a nonempty countable set, then $A$ can be written as a sequence of distinct elements.
Proof: There are two cases to consider.
1) If $A$ is finite, then $A \sim \{1, 2, 3, \ldots, n\}$ (and $\{1, 2, 3, \ldots, n\} \sim A$) for some $n \in \mathbf{Z}^+$. Hence
there is a one-to-one correspondence $f: \{1, 2, 3, \ldots, n\} \to A$.
Define $a_i = f(i)$ for each $1 \le i \le n$. Then, since $f$ is one-to-one and onto,
$\{a_1, a_2, a_3, \ldots, a_n\}$ is a sequence of the $n$ distinct elements of $A$.
2) For $A$ infinite there is a one-to-one correspondence $g: \mathbf{Z}^+ \to A$.
Define $a_i = g(i)$ for all $i \in \mathbf{Z}^+$. Since $g$ is one-to-one, the elements of the infinite sequence
$\{a_1, a_2, a_3, \ldots\}$ are distinct; $\{a_1, a_2, a_3, \ldots\} = A$ because $g$ is onto.
Before moving forward let us retrace some of our steps and recall that Z* is countable as are the
subsets 2Z* and 3Z* (of Z*). This suggests that perhaps every subset of a countable set is itself
countable. To deal with this possibility we introduce the next two ideas.
Definition A3.6 1) The infinite sequence {a), a2, a3,...} = {a,}iez+ is a subsequence of Z* = {1,2,3,...}
if for alli ¢ Z*, a; © Z* anda, <a,4).
2) Let {xn}nezt+ and {yn}nez+ be two infinite sequences. We say that {y,}nez+ iS a subsequence of
{Xn}nez+ if there exists a subsequence {a;},<7+ of Z* where for eachk € Z* we have y, = x,,.
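EXAMPLE A3.6
a) $\{1, 3, 5, 7, \ldots\}$ is a subsequence of $\mathbf{Z}^+$, as is $\{1, 2, 4, 7, 11, 16, \ldots\}$. The first subsequence can
be given by the function $f\colon \mathbf{Z}^+ \to \mathbf{Z}^+$ where $a_n = f(n) = 2n - 1$. The second subsequence can
be generated recursively by
1) $c_1 = h(1) = 1$; and
2) $c_{n+1} = h(n + 1) = h(n) + n = c_n + n$, for $n \ge 1$.
b) Let $\{x_n\}_{n \in \mathbf{Z}^+}$ and $\{y_n\}_{n \in \mathbf{Z}^+}$ be two sequences where for each $n \in \mathbf{Z}^+$, $x_n = f(n) = (-1)^n + (1/n)$
and $y_n = g(n) = 1 + (1/(2n))$. So $\{x_n\}_{n \in \mathbf{Z}^+} = \{0, 3/2, -2/3, 5/4, -4/5, 7/6, -6/7, 9/8, \ldots\}$
and $\{y_n\}_{n \in \mathbf{Z}^+} = \{3/2, 5/4, 7/6, 9/8, \ldots\}$, and $y_n = x_{2n}$ for all $n \in \mathbf{Z}^+$. For the
subsequence $\{a_k\}_{k \in \mathbf{Z}^+}$ (of $\mathbf{Z}^+$) where $a_k = 2k$ for each $k \in \mathbf{Z}^+$, we find that $y_n = x_{2n} = x_{a_n}$ for
each $n \in \mathbf{Z}^+$, and this shows us that $\{y_n\}_{n \in \mathbf{Z}^+}$ is a subsequence of $\{x_n\}_{n \in \mathbf{Z}^+}$.
c) For $n \in \mathbf{Z}^+$ let $x_n = 1/n$ and let $y_n = 1/(3n)$. Then $\{x_n\}_{n \in \mathbf{Z}^+} = \{1, 1/2, 1/3, 1/4, 1/5, 1/6, 1/7, \ldots\}$
and $\{y_n\}_{n \in \mathbf{Z}^+} = \{1/3, 1/6, 1/9, \ldots\}$. Now consider the subsequence $\{a_k\}_{k \in \mathbf{Z}^+}$ (of $\mathbf{Z}^+$)
where $a_k = 3k$ for each $k \in \mathbf{Z}^+$. Then for all $n \in \mathbf{Z}^+$, $y_n = 1/(3n) = x_{3n} = x_{a_n}$, so $\{y_n\}_{n \in \mathbf{Z}^+}$ is a
subsequence of $\{x_n\}_{n \in \mathbf{Z}^+}$.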
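The two subsequences of $\mathbf{Z}^+$ in part (a) of Example A3.6 can be generated directly. The following is a minimal Python sketch; the function names `odd_terms`, `recursive_terms`, and `is_increasing_prefix` are illustrative choices, not notation from the text.

```python
from itertools import islice

def odd_terms():
    """Yield a_n = 2n - 1 for n = 1, 2, 3, ..."""
    n = 1
    while True:
        yield 2 * n - 1
        n += 1

def recursive_terms():
    """Yield c_1 = 1 and then c_{n+1} = c_n + n for n >= 1."""
    c, n = 1, 1
    while True:
        yield c
        c, n = c + n, n + 1

def is_increasing_prefix(terms):
    """Check the subsequence condition on a finite prefix:
    every term is a positive integer and a_i < a_{i+1}."""
    positive = all(isinstance(a, int) and a >= 1 for a in terms)
    increasing = all(a < b for a, b in zip(terms, terms[1:]))
    return positive and increasing

first_odd = list(islice(odd_terms(), 6))
first_rec = list(islice(recursive_terms(), 6))
```

Only a finite prefix can be checked this way, so the code illustrates Definition A3.6 rather than proving anything about the full infinite sequence.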
And now we turn to the following result for countable sets and their subsets.
THEOREM A3.3 If S is an infinite countable set and $A \subseteq S$, then A is countable.
Proof: If A is finite, then from Definition A3.4 we know that A is countable. So assume from this
point on that A is infinite. Since S is countable, we can invoke Theorem A3.2 in order to list the
elements of S as an infinite sequence of distinct terms, so we write $S = \{s_1, s_2, s_3, \ldots\}$. Now define
a subsequence $\{a_n\}_{n \in \mathbf{Z}^+}$ of $\mathbf{Z}^+$ as follows:
$a_1 = \min\{n \mid n \in \mathbf{Z}^+ \text{ and } s_n \in A\}$
$a_2 = \min\{n \mid n \in \mathbf{Z}^+,\ n > a_1 \text{ and } s_n \in A\}$
$a_3 = \min\{n \mid n \in \mathbf{Z}^+,\ n > a_2 \text{ and } s_n \in A\}$
In general, once $a_1, a_2, a_3, \ldots, a_k$ have been selected, we define $a_{k+1} = \min\{n \mid n \in \mathbf{Z}^+,\ n > a_k \text{ and } s_n \in A\}$.
Consider the "function" $F\colon \mathbf{Z}^+ \to A$ given by $F(n) = s_{a_n}$. If $m, n \in \mathbf{Z}^+$, we find that
$m = n \Rightarrow a_m = a_n \Rightarrow s_{a_m} = s_{a_n} \Rightarrow F(m) = F(n)$, so there is no doubt that F is a function. To complete
the proof that A is countable we need to show that F is a one-to-one correspondence.
Suppose that $m, n \in \mathbf{Z}^+$ with $F(m) = F(n)$. Then $F(m) = F(n) \Rightarrow s_{a_m} = s_{a_n} \Rightarrow a_m = a_n$ because
the elements of the sequence $S = \{s_1, s_2, s_3, \ldots\}$ are distinct. Furthermore,
$a_m = a_n \Rightarrow m = n$ because
the elements in the subsequence $\{a_n\}_{n \in \mathbf{Z}^+}$ of $\mathbf{Z}^+$ are also distinct. Consequently, this function F is
one-to-one.
Now let $b \in A$. Since $A \subseteq S = \{s_1, s_2, s_3, \ldots\}$ we can write $b = s_m$ for some $m \in \mathbf{Z}^+$. If $m = a_1$,
then $F(1) = s_{a_1} = s_m = b$. If $m \ne a_1$, then since $a_1 < a_2 < a_3 < \cdots$, there is a smallest $r \in \mathbf{Z}^+$ such
that $a_{r-1} < m \le a_r$. From the definition of the subsequence $\{a_n\}_{n \in \mathbf{Z}^+}$ we know that $a_r = \min\{t \mid t \in \mathbf{Z}^+,\ t > a_{r-1} \text{ and } s_t \in A\}$,
and since $m > a_{r-1}$ and $s_m \in A$, we have $a_r \le m$. Now $a_r \le m$ and $m \le a_r \Rightarrow a_r = m$,
and so $F(r) = s_{a_r} = s_m = b$. Consequently, the function F is also onto.
From Theorem A3.3 we deduce that a given infinite set T is countable if and only if T has the
same cardinality as a subset of $\mathbf{Z}^+$. So if there is a one-to-one function $f\colon T \to \mathbf{Z}^+$ (not necessarily
a one-to-one correspondence), then this is enough to tell us that T is countable, for $T \sim f(T)$ (or
$|T| = |f(T)|$) and $f(T)$ is countable.
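The index sequence $a_1 < a_2 < a_3 < \cdots$ built in the proof of Theorem A3.3 amounts to scanning the listing of S and keeping the positions whose terms lie in A. A minimal Python sketch follows, assuming S is supplied as a function `s` on the positive integers and A as a membership test `in_A`; both names, and the sample sets, are hypothetical.

```python
from itertools import count, islice

def enumerate_subset(s, in_A):
    """Yield pairs (a_k, s(a_k)) for k = 1, 2, 3, ...

    s    : function n -> s_n listing S as a sequence of distinct terms
    in_A : membership test for the subset A of S
    a_k is the k-th smallest index n with s_n in A, matching
    a_{k+1} = min{n : n > a_k and s_n in A} from the proof.
    The generator runs forever; if A is finite it stops producing
    new pairs after the last element of A has been found.
    """
    for n in count(1):
        if in_A(s(n)):
            yield n, s(n)

# Hypothetical instance: S = Z+ listed as s_n = n, A = multiples of 3.
pairs = list(islice(enumerate_subset(lambda n: n, lambda x: x % 3 == 0), 5))
indices = [a for a, _ in pairs]   # a_1, ..., a_5
values = [v for _, v in pairs]    # F(1), ..., F(5)
```

Here `values[k - 1]` plays the role of $F(k) = s_{a_k}$; the strictly increasing indices are what make F one-to-one.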
Up to this point every infinite set we have examined has turned out to be countable. Could it be
that all infinite sets are countable— and that for all infinite sets A, B we have |A| = |B|? The next
result settles this issue.
THEOREM A3.4 The set $(0, 1] = \{x \mid x \in \mathbf{R} \text{ and } 0 < x \le 1\}$ is not a countable set.
Proof: If (0, 1] were countable, then (by Theorem A3.2) we could write this set as a sequence of
distinct terms: $(0, 1] = \{r_1, r_2, r_3, \ldots\}$. To avoid two representations we agree to write real numbers
in (0, 1] such as 0.5 as 0.499..., so no element in (0, 1] is represented by a decimal expansion that
terminates. Writing such decimal expansions for $r_1, r_2, r_3, \ldots$, we get
$r_1 = 0.a_{11}a_{12}a_{13}a_{14}\cdots$
$r_2 = 0.a_{21}a_{22}a_{23}a_{24}\cdots$
$r_3 = 0.a_{31}a_{32}a_{33}a_{34}\cdots$
$\vdots$
$r_n = 0.a_{n1}a_{n2}a_{n3}a_{n4}\cdots$
$\vdots$
where $a_{ij} \in \{0, 1, 2, 3, \ldots, 8, 9\}$ for all $i, j \in \mathbf{Z}^+$.
Now consider the real number $r = 0.b_1b_2b_3\cdots$, where for each $k \in \mathbf{Z}^+$,
$b_k = 3$ if $a_{kk} \ne 3$, and $b_k = 7$ if $a_{kk} = 3$.
Then $r \in (0, 1]$, but for every $k \in \mathbf{Z}^+$ we have $r \ne r_k$ (the two expansions differ in the kth decimal place), so $r \notin \{r_1, r_2, r_3, r_4, \ldots\}$. This contradicts
our assumption that $(0, 1] = \{r_1, r_2, r_3, r_4, \ldots\}$.
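The rule used to build r in the proof above can be tried on a finite list of digit strings. This is only a sketch under stated assumptions: each string holds the digits after the decimal point, the list has no more rows than each string has digits, and the sample rows are made up for illustration.

```python
def diagonal_digits(expansions):
    """Return the first len(expansions) digits b_1 b_2 ... of r, where
    b_k = 3 if the k-th digit of the k-th expansion is not 3,
    b_k = 7 if that digit is 3.
    Each expansion is a string of digits after the decimal point."""
    return "".join(
        "7" if row[k] == "3" else "3"
        for k, row in enumerate(expansions)
    )

# Made-up finite sample standing in for r_1, r_2, r_3, r_4.
rows = ["4999", "3333", "1415", "7183"]
r_digits = diagonal_digits(rows)

# r differs from the k-th row in the k-th decimal place.
differs_on_diagonal = all(r_digits[k] != rows[k][k] for k in range(len(rows)))
```

Because every digit of r is 3 or 7, the expansion of r neither terminates nor ends in repeating 9s, so the agreed-upon representation is respected.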
The technique employed in this proof (of Theorem A3.4) is generally known as Cantor's Diagonal
Construction in honor of the (Russian-born) German mathematician Georg Cantor (1845-1918), who
introduced the idea in December of 1873.
When a set is not countable it is termed uncountable. So (0, 1] is uncountable. When a set A is
uncountable, $\mathbf{Z}^+$ and A do not have the same size, or cardinality, so $\mathbf{Z}^+ \not\sim A$ and the cardinality
of A is greater than that of $\mathbf{Z}^+$; that is, $|A| > |\mathbf{Z}^+|$, even though both A and $\mathbf{Z}^+$ are infinite sets.
The following corollary provides another example of an uncountable set.
COROLLARY A3.1 The set R (of all real numbers) is an uncountable set.
Proof: If R were countable, then by Theorem A3.3 the subset (0, 1] of R would be countable.
Before continuing with anything new let us say a few more words about this notion of an uncount-
able set.
1) First and foremost we realize that Corollary A3.1 is a special case of the general result: For all
sets A, B, if A is uncountable and $A \subseteq B$, then B is uncountable.
2) Unlike the result in Theorem A3.3 we do not find in general that nonempty subsets of uncount-
able sets are uncountable. We may even have an infinite subset A of an uncountable set B
where A is countable — for instance, let A = Z and B = R.
3) Following Theorem A3.3 we remarked that whenever we had a set A and could find a one-to-one
function $f\colon A \to \mathbf{Z}^+$, then the set A had to be countable. We cannot reverse the roles of
A and $\mathbf{Z}^+$ for the function f. If there is a one-to-one function $g\colon \mathbf{Z}^+ \to A$, the set A could be
uncountable. Just consider $g\colon \mathbf{Z}^+ \to \mathbf{R}$ where $g(x) = x$ for each $x \in \mathbf{Z}^+$.
4) Consider the points in the Cartesian plane on the unit circle $x^2 + (y - 1)^2 = 1$. How large is
this set $S = \{(x, y) \mid x, y \in \mathbf{R} \text{ and } x^2 + (y - 1)^2 = 1\}$; that is, is S countable or uncountable?
In Fig. A3.1 we have a unit circle (in the plane) centered at C(0, 1). This circle is tangent to
the real number line (or x-axis) at the point where x = 0. The point P, on the circumference,
has coordinates (0, 2).
Figure A.3.1
Let (x, y) be any point on the circumference of the unit circle, other than the point P (0, 2).
For example, point Q is one such point, and R is another. Draw the line determined by P and
Q. This line intersects the x-axis at Q’. Likewise the line determined by P and R intersects the
x-axis at R’. Conversely, consider the points on the x-axis — except for the point where x = 0.
Two such points are T’ and U’. The line through P and T’ intersects the unit circle at T. Point U
is the point of intersection (on S) determined by the line through P and U’. Finally, correspond
P with P’ (on the x-axis where x = 0). In this way we obtain a one-to-one correspondence
between the elements of S and the set $\mathbf{R}$. Hence $|S| = |\mathbf{R}|$, so S is another uncountable set.
Summarizing what we now know about |Z| and |R| — namely, that |Z| < |R| —we now want
to determine whether |Q| = |Z| or |Q| = |R| or, perhaps, |Z| < |Q| < |R|. In accomplishing this we
shall prove something more general; to do so we start with the following.
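THEOREM A3.5 The set $\mathbf{Z}^+ \times \mathbf{Z}^+$ is countable.
Proof: Define the function $f\colon \mathbf{Z}^+ \times \mathbf{Z}^+ \to \mathbf{Z}^+$ by $f(a, b) = 2^a 3^b$. The result will follow if we can
show that f is one-to-one. For $(m, n), (u, v) \in \mathbf{Z}^+ \times \mathbf{Z}^+$, $f(m, n) = f(u, v) \Rightarrow 2^m 3^n = 2^u 3^v \Rightarrow m = u,\ n = v$,
by the Fundamental Theorem of Arithmetic. Consequently, f is one-to-one and
$\mathbf{Z}^+ \times \mathbf{Z}^+$ is countable.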
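The encoding $f(a, b) = 2^a 3^b$ from the proof of Theorem A3.5 can be inverted by stripping out the factors of 2 and 3, which is the Fundamental Theorem of Arithmetic in action. A minimal Python sketch follows; the names `encode` and `decode` are illustrative and do not come from the text.

```python
def encode(a, b):
    """f(a, b) = 2^a * 3^b for positive integers a, b."""
    if a < 1 or b < 1:
        raise ValueError("a and b must be positive integers")
    return 2 ** a * 3 ** b

def decode(n):
    """Recover (a, b) from n = 2^a * 3^b; raise if n is not in the range of f."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    a = b = 0
    while n % 2 == 0:      # count the factors of 2
        n //= 2
        a += 1
    while n % 3 == 0:      # count the factors of 3
        n //= 3
        b += 1
    if n != 1 or a == 0 or b == 0:
        raise ValueError("not of the form 2^a * 3^b with a, b >= 1")
    return a, b

# Check injectivity on a finite grid of pairs.
pairs = [(a, b) for a in range(1, 8) for b in range(1, 8)]
codes = [encode(a, b) for a, b in pairs]
```

Note that f is one-to-one but not onto: an integer such as 5 or 2 has no preimage, which is all the theorem needs.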
Before any statements can be made about the size, or cardinality, of $\mathbf{Q}$, we first need to consider
the subset $\mathbf{Q} \cap (0, 1] = \{s \mid s \in \mathbf{Q} \text{ and } 0 < s \le 1\}$ of $\mathbf{Q}$.
THEOREM A3.6 The set $\mathbf{Q} \cap (0, 1]$ is countable.
Proof: First we must agree that each s in $\mathbf{Q} \cap (0, 1]$ will be written in the (unique) form p/q, where
$p, q \in \mathbf{Z}^+$ and have no common divisor other than 1. Now define $f\colon \mathbf{Q} \cap (0, 1] \to \mathbf{Z}^+ \times \mathbf{Z}^+$ by
$f(p/q) = (p, q)$, and let K = range f. For $p/q,\ u/v \in \mathbf{Q} \cap (0, 1]$, we find that $f(p/q) = f(u/v) \Rightarrow (p, q) = (u, v) \Rightarrow p = u$ and $q = v \Rightarrow p/q = u/v$,
so f is a one-to-one function. Consequently,
$\mathbf{Q} \cap (0, 1] \sim K$, a subset of the countable set $\mathbf{Z}^+ \times \mathbf{Z}^+$. From Theorem A3.3 it now follows that the
set $\mathbf{Q} \cap (0, 1]$ is countable.
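The map $f(p/q) = (p, q)$ depends on writing each rational in lowest terms. Python's standard `fractions.Fraction` type stores its value in lowest terms with a positive denominator, so it gives a convenient way to sketch f; the function name `to_pair` and the sample range are illustrative.

```python
from fractions import Fraction

def to_pair(s):
    """Map s in Q ∩ (0, 1] to (p, q), where s = p/q in lowest terms."""
    s = Fraction(s)
    if not (0 < s <= 1):
        raise ValueError("s must lie in (0, 1]")
    return s.numerator, s.denominator

# All rationals p/q in (0, 1] with denominator at most 6 (duplicates collapse).
sample = {Fraction(p, q) for q in range(1, 7) for p in range(1, q + 1)}
images = {to_pair(s) for s in sample}
```

Since 2/4 and 1/2 are the same rational, both are sent to (1, 2); distinct rationals always receive distinct pairs.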
As we continue in our efforts to determine |Q| we shall need the next two definitions and theorem.
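Definition A3.7 Let $\mathcal{F}$ be any collection of sets from a universe $\mathcal{U}$. The union of all the sets in $\mathcal{F}$, written $\bigcup_{A \in \mathcal{F}} A$,
is defined as $\{x \mid x \in \mathcal{U} \text{ and } x \in A, \text{ for some } A \in \mathcal{F}\}$.
When $\mathcal{F}$ is a countable collection, that is, $\mathcal{F} = \{A_1, A_2, A_3, \ldots\}$, we may write
$\bigcup_{A \in \mathcal{F}} A = \bigcup_{n=1}^{\infty} A_n = \bigcup_{n \in \mathbf{Z}^+} A_n$.
EXAMPLE A3.7
In each of the following the universe $\mathcal{U}$ is $\mathbf{R}$.
a) For each $n \in \mathbf{Z}^+$ let $A_n = [n - 1, n)$. Then, for example, $A_1 = [0, 1)$, $A_2 = [1, 2)$, and $A_3 = [2, 3)$.
For $\mathcal{F} = \{A_1, A_2, A_3, \ldots\} = \{A_i \mid i \in \mathbf{Z}^+\}$ we find that
$\bigcup_{A \in \mathcal{F}} A = \bigcup_{n=1}^{\infty} A_n = \bigcup_{n \in \mathbf{Z}^+} A_n = [0, +\infty)$.
b) Given any $q \in \mathbf{Q}^+$ let $A_q = (q - 1/2,\ q + 1/2)$. Here, for instance, $A_{1/2} = (0, 1)$, $A_4 = (7/2, 9/2)$,
and $A_{11/3} = (19/6, 25/6)$. If $\mathcal{F} = \{A_q \mid q \in \mathbf{Q}^+\}$, then
$\bigcup_{A \in \mathcal{F}} A = \bigcup_{q \in \mathbf{Q}^+} A_q = (-1/2, +\infty)$.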
Definition A3.8 Let $\mathcal{F}$ be a collection of sets each taken from a universe $\mathcal{U}$. The collection $\mathcal{F}$ is called a disjoint
collection if for all A, B in $\mathcal{F}$, when $A \ne B$ then $A \cap B = \emptyset$.
EXAMPLE A3.8
When we reexamine the two collections in Example A3.7 we find that the collection in
part (a) is the only disjoint collection.
The concepts of a countable set and a disjoint collection of sets now come together in our next
result.
THEOREM A3.7 Let $\mathcal{F}$ be a countable disjoint collection of sets, each of which is countable. Then $\bigcup_{A \in \mathcal{F}} A$ is also a
countable set.
Proof: Since $\mathcal{F}$ is a countable disjoint collection, we may write $\mathcal{F} = \{A_1, A_2, A_3, \ldots\}$, where
$A_i \cap A_j = \emptyset$ for all $i, j \in \mathbf{Z}^+$, when $i \ne j$. Furthermore, for each $n \in \mathbf{Z}^+$, $A_n$ is countable and can
be expressed as $\{a_{n1}, a_{n2}, a_{n3}, \ldots\}$, a sequence of distinct terms. In order to show that $\bigcup_{A \in \mathcal{F}} A$ is
countable, consider each $x \in \bigcup_{A \in \mathcal{F}} A$.
Since $\bigcup_{A \in \mathcal{F}} A = \bigcup_{n=1}^{\infty} A_n$, we have $x \in A_n$ for some (fixed) $n \in \mathbf{Z}^+$, and this n is unique because
$\mathcal{F}$ is a disjoint collection. In addition, $x \in A_n \Rightarrow x = a_{nk}$ for some $k \in \mathbf{Z}^+$ (where k is fixed and unique).
Now define $f\colon \bigcup_{A \in \mathcal{F}} A \to \mathbf{Z}^+ \times \mathbf{Z}^+$ by $f(x) = f(a_{nk}) = (n, k)$. From Theorem A3.5 we know that
$\mathbf{Z}^+ \times \mathbf{Z}^+$ is countable, so the range of f is countable. Consequently, the result will be established
once we show that f is one-to-one. This readily follows, for if $x = a_{nk}$, $y = a_{pq} \in \bigcup_{A \in \mathcal{F}} A$ with
$f(x) = f(y)$, then $f(a_{nk}) = f(a_{pq}) \Rightarrow (n, k) = (p, q) \Rightarrow n = p,\ k = q \Rightarrow a_{nk} = a_{pq} \Rightarrow x = y$.
Note that the proof of Theorem A3.7 is valid if $\mathcal{F}$ is finite (and $\infty$ is replaced by $|\mathcal{F}|$) or if one or
more of the sets $A_i$, $i \in \mathbf{Z}^+$, is finite.
As aresult of Theorem A3.7 we can now deal with the cardinality of Q.
THEOREM A3.8 The set Q (of all rational numbers) is countable.
Proof: We start by recalling that $A_0 = \mathbf{Q} \cap (0, 1]$ is countable, from Theorem A3.6. Now for each
nonzero integer n, let $A_n = \mathbf{Q} \cap (n, n + 1]$ and define $f_n\colon A_n \to A_0$ by $f_n(q) = q - n$. Then
$f_n(q_1) = f_n(q_2) \Rightarrow q_1 - n = q_2 - n \Rightarrow q_1 = q_2$, so $f_n$ is one-to-one. Consequently, $A_n \sim f_n(A_n) \subseteq A_0$, and by
Theorem A3.3 we have $A_n$ countable. In addition, for all $m, n \in \mathbf{Z}$, $m \ne n \Rightarrow A_m \cap A_n = \emptyset$. From
Example A3.3 we know that $\mathbf{Z}$ is countable, so $\mathcal{F} = \{A_0, A_1, A_{-1}, A_2, A_{-2}, \ldots\}$ is a countable
disjoint collection of countable sets. Therefore, by virtue of Theorem A3.7, it follows that
$\bigcup_{A \in \mathcal{F}} A = \bigcup_{n \in \mathbf{Z}} A_n = \mathbf{Q}$ is countable.
So now we know that $\mathbf{Z}^+$, $\mathbf{Z}$, and $\mathbf{Q}$ are all infinite and $\mathbf{Z}^+ \sim \mathbf{Z} \sim \mathbf{Q}$, while $\mathbf{R}$ is infinite and
$\mathbf{R} \not\sim \mathbf{Z}^+$. Recall that any infinite set A, where $A \sim \mathbf{Z}^+$, is said to be countably infinite, and we shall
now denote the cardinality of such a set A by writing $|A| = \aleph_0$, using the Hebrew letter aleph, with
the subscripted 0, to designate the first level of infinity. The cardinality of $\mathbf{R}$ is greater than $\aleph_0$ and is
usually denoted by c, for the continuum.
In our next theorem we shall improve upon the result in Theorem A3.7. The following lemma
helps with the improvement.
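LEMMA A3.1 Let $\mathcal{F} = \{A_1, A_2, A_3, \ldots\}$ be any countable collection of sets (from a universe $\mathcal{U}$). Let $\mathcal{G} = \{B_1, B_2, B_3, \ldots\}$
be the countable collection of sets where $B_1 = A_1$ and $B_n = A_n - \bigcup_{k=1}^{n-1} A_k$ for $n \ge 2$. Then
$\mathcal{G}$ is a countable disjoint collection and $\bigcup_{k=1}^{\infty} A_k = \bigcup_{k=1}^{\infty} B_k$.
Proof: First we establish that the countable collection $\mathcal{G}$ is disjoint. To do so we must show that for
all $i, j \in \mathbf{Z}^+$, where $i \ne j$, we have $B_i \cap B_j = \emptyset$. If not, let $i < j$ with $B_i \cap B_j \ne \emptyset$. For $x \in B_i \cap B_j$
we find that $x \in B_j = A_j - \bigcup_{k=1}^{j-1} A_k \Rightarrow x \notin A_i$, because $1 \le i \le j - 1$. But it also happens that
$x \in B_i = A_i - \bigcup_{k=1}^{i-1} A_k \Rightarrow x \in A_i$ because $A_i - \bigcup_{k=1}^{i-1} A_k \subseteq A_i$. (Note: $\bigcup_{k=1}^{i-1} A_k = \emptyset$ when $i = 1$.)
The contradiction, $x \notin A_i$ and $x \in A_i$, tells us that $B_i \cap B_j = \emptyset$ for all $i, j \in \mathbf{Z}^+$, where $i \ne j$. So
$\mathcal{G}$ is a disjoint countable collection of sets.
For the second part, namely, that $\bigcup_{k=1}^{\infty} A_k = \bigcup_{k=1}^{\infty} B_k$, start with $x \in \bigcup_{k=1}^{\infty} A_k$. Then $x \in A_n$
for some $n \in \mathbf{Z}^+$, and let m denote the smallest such n. If $m = 1$, then $x \in A_1 = B_1 \subseteq \bigcup_{k=1}^{\infty} B_k$. If
$m > 1$, then $x \notin A_j$ for all $1 \le j \le m - 1$, and so $x \in A_m - \bigcup_{k=1}^{m-1} A_k = B_m \subseteq \bigcup_{k=1}^{\infty} B_k$. In either
case $x \in \bigcup_{k=1}^{\infty} B_k$ and $\bigcup_{k=1}^{\infty} A_k \subseteq \bigcup_{k=1}^{\infty} B_k$. For the opposite inclusion we find that $y \in \bigcup_{k=1}^{\infty} B_k \Rightarrow y \in B_n$,
for some (unique) $n \in \mathbf{Z}^+ \Rightarrow y \in A_n$, for this same $n \in \mathbf{Z}^+$, because $B_1 = A_1$ and
$B_i = A_i - \bigcup_{k=1}^{i-1} A_k \subseteq A_i$, for all $i \ge 2$. Then $y \in A_n \Rightarrow y \in \bigcup_{k=1}^{\infty} A_k$, so $\bigcup_{k=1}^{\infty} B_k \subseteq \bigcup_{k=1}^{\infty} A_k$. Consequently,
$\bigcup_{k=1}^{\infty} A_k = \bigcup_{k=1}^{\infty} B_k$.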
As in the case of Theorem A3.7, the proof of Lemma A3.1 is valid if $\mathcal{F}$ is finite (and $\infty$ is then replaced
by $|\mathcal{F}|$).
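For a finite collection, the construction $B_1 = A_1$, $B_n = A_n - \bigcup_{k=1}^{n-1} A_k$ of Lemma A3.1 can be carried out with a running union. A minimal Python sketch follows; the function name `disjointify` and the sample sets are illustrative only.

```python
def disjointify(sets):
    """Given [A_1, ..., A_n], return [B_1, ..., B_n] with
    B_1 = A_1 and B_k = A_k minus (A_1 ∪ ... ∪ A_{k-1})."""
    seen = set()          # running union A_1 ∪ ... ∪ A_{k-1}
    result = []
    for a in sets:
        a = set(a)
        result.append(a - seen)
        seen |= a
    return result

A = [{1, 2, 3}, {2, 3, 4}, {1, 5}, {4, 5}]
B = disjointify(A)

union_A = set().union(*A)
union_B = set().union(*B)
pairwise_disjoint = all(
    B[i].isdisjoint(B[j]) for i in range(len(B)) for j in range(i + 1, len(B))
)
```

Some of the sets $B_k$ may come out empty, as the last one does here; the lemma does not rule this out.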
From Lemma A3.1 we learn that the hypothesis of Theorem A3.7 can be weakened: the countable
collection $\mathcal{F}$ need not be disjoint. This is formally established as follows.
THEOREM A3.9 The union of any countable collection of countable sets is countable.
Proof: If $\mathcal{F} = \{A_1, A_2, A_3, \ldots\}$ is a countable collection of countable sets, construct the countable
collection $\mathcal{G} = \{B_1, B_2, B_3, \ldots\}$ as in Lemma A3.1. For each $k \in \mathbf{Z}^+$, $B_k \subseteq A_k$, so by Theorem A3.3
each $B_k$ is countable. Lemma A3.1 tells us that $\bigcup_{k=1}^{\infty} A_k = \bigcup_{k=1}^{\infty} B_k$, and from Theorem A3.7 we
know that $\bigcup_{k=1}^{\infty} B_k$ is countable. Hence $\bigcup_{A \in \mathcal{F}} A = \bigcup_{k=1}^{\infty} A_k$ is countable.
Once again, should $\mathcal{F}$ be finite, the proof of Theorem A3.9 remains valid (upon replacing each
occurrence of $\infty$ by $|\mathcal{F}|$).
Following Theorem A3.8 we mentioned that $|\mathbf{Z}^+| = \aleph_0$ and $|\mathbf{R}| = c$, where $\aleph_0 < c$. Although there
is still a great deal more that can be said about infinite sets, we shall close this appendix by showing
that these are not the only infinite cardinal numbers. In fact, there are infinitely many infinite cardinal
numbers.
THEOREM A3.10 If A is any set, then $|A| < |\mathcal{P}(A)|$.
Proof: If $A = \emptyset$, then $|A| = 0$ and $|\mathcal{P}(A)| = |\{\emptyset\}| = 1$, so the result is true in this case. If
$A \ne \emptyset$, let $f\colon A \to \mathcal{P}(A)$ be defined by $f(a) = \{a\}$ for each $a \in A$. The function f is a one-to-one
function and it follows that $|A| = |f(A)| \le |\mathcal{P}(A)|$. To show that $|A| \ne |\mathcal{P}(A)|$ we must prove that no
function $g\colon A \to \mathcal{P}(A)$ can be onto. So let $g\colon A \to \mathcal{P}(A)$ and consider $B = \{a \mid a \in A \text{ and } a \notin g(a)\}$.
Remember that $g(a) \subseteq A$ and that $B \subseteq A$. With $B \in \mathcal{P}(A)$, if g is to be an onto function there must
exist $a' \in A$ such that $g(a') = B$. Now do we have $a' \in g(a')$ or $a' \notin g(a')$? Exactly one of these two
results must be true.
If $a' \in g(a') = B$, then from the definition of B we have $a' \notin g(a')$, and the contradiction:
$a' \in g(a')$ and $a' \notin g(a')$. On the other hand, when $a' \notin g(a')$ then $a' \in B$, but $B = g(a')$. Once
again we get the same contradiction.
Therefore, there is no $a' \in A$ with $g(a') = B$, so g cannot be onto, and hence $|A| < |\mathcal{P}(A)|$.
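For a small finite set the argument in the proof of Theorem A3.10 can be checked exhaustively: for every function $g\colon A \to \mathcal{P}(A)$, the set $B = \{a \in A \mid a \notin g(a)\}$ is missing from the range of g. The sketch below is illustrative only; the name `cantor_set` and the choice A = {0, 1, 2} are assumptions made for the example.

```python
from itertools import product

def cantor_set(A, g):
    """Return B = {a in A : a not in g(a)}, where g is a dict from A to subsets of A."""
    return frozenset(a for a in A if a not in g[a])

A = [0, 1, 2]

# All 2^3 = 8 subsets of A, as frozensets.
subsets = [
    frozenset(x for x, keep in zip(A, bits) if keep)
    for bits in product([0, 1], repeat=len(A))
]

# Every function g: A -> P(A) is a choice of one subset per element (8^3 = 512 functions).
all_missed = all(
    cantor_set(A, dict(zip(A, images))) not in images
    for images in product(subsets, repeat=len(A))
)

# One concrete g for inspection.
g_example = {0: frozenset({0}), 1: frozenset(), 2: frozenset({0, 1})}
B_example = cantor_set(A, g_example)
```

For finite A the inequality also follows from counting ($n < 2^n$); the value of the diagonal argument is that it works unchanged for infinite sets.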
As a consequence of Theorem A3.10 we find that there is no largest infinite cardinal number. For
if A is any infinite set, then $|A| < |\mathcal{P}(A)| < |\mathcal{P}(\mathcal{P}(A))| < \cdots$. However, there is a smallest infinite
cardinal number. As we mentioned earlier, this is $\aleph_0$.
REFERENCES
Since there is still more that can be said about countable and uncountable sets, the interested reader
may want to examine one of the following for further information.
1. Enderton, Herbert B. Elements of Set Theory. New York: Academic Press, 1977.
2. Halmos, Paul R. Naive Set Theory. New York: Van Nostrand, 1960.
3. Henle, James M. An Outline of Set Theory. New York: Springer-Verlag, 1986.
EXERCISES A.3
1. Determine whether each of the following statements is true or false. For parts (d)-(g) provide a counterexample if the statement is false.
a) The set $\mathbf{Q}^+$ is countable.
b) The set $\mathbf{R}^+$ is countable.
c) There is a one-to-one correspondence between the sets $\mathbf{N}$ and $2\mathbf{Z} = \{2k \mid k \in \mathbf{Z}\}$.
d) If A, B are countable sets, then $A \cup B$ is countable.
e) If A, B are uncountable sets, then $A \cap B$ is uncountable.
f) If A, B are countable sets, then $A - B$ is countable.
g) If A, B are uncountable sets, then $A - B$ is uncountable.
2. a) Let $A = \{n^2 \mid n \in \mathbf{Z}^+\}$. Find a one-to-one correspondence between $\mathbf{Z}^+$ and A.
b) Find a one-to-one correspondence between $\mathbf{Z}^+$ and $\{2, 6, 10, 14, \ldots\}$.
3. Let A, B be sets with A uncountable. If $A \subseteq B$, prove that B is uncountable.
4. Let $I = \{r \in \mathbf{R} \mid r \text{ is irrational}\} = \mathbf{R} - \mathbf{Q}$. Is I countable or uncountable? Prove your assertion.
5. If S, T are infinite and countable, prove that $S \times T$ is countable.
6. Prove that $\mathbf{Z}^+ \times \mathbf{Z}^+ \times \mathbf{Z}^+ = \{(a, b, c) \mid a, b, c \in \mathbf{Z}^+\}$ is countable.
7. Prove that the set of all real solutions of the quadratic equations $ax^2 + bx + c = 0$, where $a, b, c \in \mathbf{Z}$, $a \ne 0$, is a countable set.
8. Determine a one-to-one correspondence between the open interval (0, 1) and the open intervals (a) (0, 3); (b) (2, 7); and (c) (a, b), where $a, b \in \mathbf{R}$ and $a < b$.
Solutions
Chapter 1
Fundamental Principles of Counting
Sections 1.1 and 1.2, p. 11
1. a) 13 b) 40 c) The rule of sum in part (a); the rule of product in part (b)
3. a) 288 b) 24
5. $2 \times 2 \times 1 \times 10 \times 10 \times 2 = 800$ different license plates
7. 2° 9. a) (14)(12) = 168 — b) (14)(12)(6)(18) = 18,144 — ce): 73,156,608
11. a) 124+2=14 b) 14x 14=196- ec) 182
13. a) P(8,8)=8! b) 7! 6! 15. 4! = 24
17. Class A: $(2^7 - 2)(2^{24} - 2) = 2{,}113{,}928{,}964$
Class B: $2^{14}(2^{16} - 2) = 1{,}073{,}709{,}056$
Class C: $2^{12}(2^8 - 2) = 1{,}040{,}384$
19, a) 7!= 5040 ib) (4')(3'!) = 144 © ce) (5935 =720 ~~ d) 288
21. a) 12!/(3!2!2!2!) — b) 2[11!/(3'2!2!29] oe) [7!/(21 2D) [6!/GB! 2)]
23. 12!/(4!3! 2! 3!) = 277,200 25.a)n=10 b)n=5 oc) n=5
27. a) $(10!)/(2!\,7!) = 360$ b) 360
c) Let x, y, and z be any real numbers and let m, n, and p be any nonnegative integers.
The number of paths from (x, y, z) to (x + m, y + n, z + p), as described in part (a), is
$(m + n + p)!/(m!\,n!\,p!)$.
29. a) 576 b) The rule of product
31. a) $9 \times 9 \times 8 \times 7 \times 6 \times 5 = 136{,}080$ b) $9 \times 10^5$
(i) (a) 68,880 (b) 450,000
(ii) (a) 28,560 (b) 180,000
(iii) (a) 33,600 (b) 225,000
33. a) 2!° pb) 3° 35. a) 6! ~——b) 2(5!) = 240
37. (]§)9! 5! = 348,713, 164,800
Section 1.3—p. 24 1. (5) = 6!/(2! 4!) = 15. The selections of size 2 are ab, ac, ad, ae, af , bc, bd, be, bf, cd, ce, cf,
de,df,andef.
3. a) C(10, 4) = 10!/(416'!) = 210 —b) (7) = 12!/(7! 5!) = 792
ec) C(14,12)=91 — d) ({3) = 3003
- a) P(S, 3) = 60
b) af.m af,r af,t a,m,r a,m,t
a, r,t frm,r f,m,t f,rt m, r,t
a) (75) = 125,970 b) (P)(2) = 44,100 e) D0%_, (10!89,)
(27)
d) ee (°°) 12” ,) e) yes C22)
.a) (§)=28 b) 70 c) (3)=28 dd) 37
11. a) 120 b) 56~ ce) 100
13.
15.
(;) x)=
a) (3) =105 — b) (%)
= 2300; (9); ) = 12,650 |
17. ® Diag 8 Diaepee=DLicnewe a Lig
19. (8) + CIVG) + () = 220, (2) + (2) + (PG) + (VG) = 705
21° (Yeo G))
21, (3) (5) -n—n(n—4),n>4
23. a) (§) —b) (F)23)_—e)- (2) (2%)(-3)
25. a) (f2)=12 bi 12 ¢) (,45)(2)(—1)(-1)? = —24
d) —216 ee) (,.°,,)(2°)(—1)?(3)(—2)? = 161,280
27. a) 2° b) 2!° c) 3!° d) 4 e) 4!°
m+n\ (m+n)! (m+n)! _ (m+n)!
29. n( m ) —" int oman PG te pond@ dD!
ome
~ (+ DT iim 1)! mann!
m+n
31. Consider the expansions of (a) [(] + x) — x]"; (b) [(2 + x) — (x + 1)]"; and
(c) [(2+x)—x]".
33. 1
Section 1.4—p. 34 La) DQ 9() 3@ Sa wD
7a) (8) b GH) o &) #1 e& (8) 9 G)-©
9n=7 Ia) (f) by (#) +3(2) +38) + @)
13. a) (7) by) 2. G23) 5. G8)(24— «7. a) (18) 5?
19. (7?) 21. 24,310=)0"_,i [forn = (3)]
23. a) Place one of the m identical objects into each of the n distinct containers. This leaves m — n
identical objects to be placed into the n distinct containers, resulting in
(" rm ') = ("=") = ("=}) distributions.
25. a) 2° _b) 24
27. a) Cty-')=4 b) 10 c) 48 d) CT DCB YF Oty CTE!) = 96
e) 180 f) 420
2n _ 2n \ _ (2n)! _ (2n)! — (Qn)!+ 1) (2n)\n
Section 1.5-p. 40 1.
(7) (,",)- nin! (n—Iin+1! | (n+ 1)!n! nln +1)!
(2n)'[(n+1)—n] _ 1 (Qn)! 1 2n
(n+ 1)tn! = cepa (GH (*")
» a) 5 (= b3); 14 (= by)
1 72
b) For n > 0 there are b, (= ——(“" ) ) such paths from (0, 0) to (n, 2).
(n4+1)\n
c) For n > 0 the first move is U and the last is R.
. Using the results in the third column of Table 1.10 we have:
111000 110010 101010
123 125 135
456 346 246
. There are bs(= 42) ways.
. (1) When n = 4 there are 14 (= 64) such diagrams.
(11) For each n > 0, there are b, different drawings of n semicircles on and above a horizontal
line, with no two semicircles intersecting. Consider, for instance, the diagram in part (f) of
Fig. 1.10. Going from left to right, write 1 the first time you encounter a semicircle and write 0
the second time that semicircle is encountered. Here we get the list 110100. The list 110010
corresponds with the drawing in part (g). This correspondence shows that the number of such
drawings for n semicircles is the same as the number of lists of » 1’s and n 0’s where, as the list
is read from left to right, the number of 0’s never exceeds the number of 1’s.
11. (;) (;) (6!)(6!) = (5) (12!) = 68,428,800
7 6 o 7 a
Supplementary 1. ()G) + G)@ + ()@)
Exercises—p. 43 3. Select any four of these twelve points (on the circumference). As seen in the figure, these points
determine a pair of chords that intersect. Consequently, the largest number of points of
intersection for all possible chords is $\binom{12}{4} = 495$.
a) 10% =
--- 4)(12)
b) (10)(11) 34/9!) (25 (3)
7, a) C(12,8) b) P12, 8) 9. a) 12 b) 49
11. (1/11) [11!/(5! 3! 34]
3.9 OH+OO+O GH+OO+O GH+OO+O-
b OG) +GG) Gi and Gi) ()(G) + GG)
15. a) 2(4)+ (@) = 343 ~~ b) [2(7) — 9] + (2)— 1] = 1200
17. a) (5)(Q!) — b) (3)(8!)
9. a) (7b) 20) + OA
21. 0= (1+ (—1)" = (t) — (7) +) — G) +--+ CDG), 80
(+ G+
@ t= O+Q+@)+--
23. a) P(20, 12) = 201/8! —b) (7) (12/)
25. a) (1) + (3) +--+ 09) + G9) = Veo (ae) BY Vo Cx")
c) n=2k+1,k>0: 0%, FUT)
n= 2k k= TK, MF)
27. a) (oP') = a)
= (021)
b) ra Gi =Co dt Cp pte + G2) = 2
29. a) 11!/(7!49) by) [IL/(7! 49] — F41/(2! 2 1141/3! 1)
¢) [11/71 4D] + [10!/(6! 3! 1]+ [9!/(S! 2! 2!) + (81/4! 1139]+ (71/03! 49] [in part (a)]
(L111/(7! 49] + [101/(6! 3! 1] + [91/(S! 2! 2] + (81/4! 1139] + (71/8! 491}
— [{[4!/(2! 29] + B/C! 1! 1] + [2!/2']} & {[41/G! 1D] + (31/2! 1D1}]
[in part (b)]
31. (3)(8) =540 33. ($)(12)(11)
(10) 9) = 178,200
Chapter 2
Fundamentals of Logic
Section 2.1, p. 54
1. The sentences in parts (a), (c), (d), and (f) are statements. The other two sentences are not.
3. a) 0 b) 0 c) 1 d) 0
5. a) If triangle ABC is equilateral, then it is isosceles.
b) If triangle ABC is not isosceles, then it is not equilateral.
d) Triangle ABC is isosceles, but it is not equilateral.
7. a) If Darci practices her serve daily then she will have a good chance of winning the tennis
tournament.
b) If you do not fix my air conditioner, then I shall not pay the rent.
c) If Mary is to be allowed on Larry’s motorcycle, then she must wear her helmet.
9. Statements (a), (e), (f), and (h) are tautologies.
11. a) $2^5 = 32$ b) $2^n$
13. p: 0; r: 0; s: 0
15. a) m = 3, n = 6 b) m = 3, n = 9 c) m = 18, n = 9 d) m = 4, n = 9
e) m = 4, n = 9
17. Dawn
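Section 2.2, p. 66
1. a) (i)
p | q | r | $q \land r$ | $p \to (q \land r)$ | $p \to q$ | $p \to r$ | $(p \to q) \land (p \to r)$
0 | 0 | 0 | 0 | 1 | 1 | 1 | 1
0 | 0 | 1 | 0 | 1 | 1 | 1 | 1
0 | 1 | 0 | 0 | 1 | 1 | 1 | 1
0 | 1 | 1 | 1 | 1 | 1 | 1 | 1
1 | 0 | 0 | 0 | 0 | 0 | 0 | 0
1 | 0 | 1 | 0 | 0 | 0 | 1 | 0
1 | 1 | 0 | 0 | 0 | 1 | 0 | 0
1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
(iii)
p | q | r | $q \lor r$ | $p \to (q \lor r)$ | $p \to q$ | $\lnot r \to (p \to q)$
0 | 0 | 0 | 0 | 1 | 1 | 1
0 | 0 | 1 | 1 | 1 | 1 | 1
0 | 1 | 0 | 1 | 1 | 1 | 1
0 | 1 | 1 | 1 | 1 | 1 | 1
1 | 0 | 0 | 0 | 0 | 0 | 0
1 | 0 | 1 | 1 | 1 | 0 | 1
1 | 1 | 0 | 1 | 1 | 1 | 1
1 | 1 | 1 | 1 | 1 | 1 | 1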
b) $[p \to (q \lor r)] \iff [\lnot r \to (p \to q)]$ From part (iii) of part (a)
$\iff [\lnot r \to (\lnot p \lor q)]$ By the 2nd Substitution Rule, and $(p \to q) \iff (\lnot p \lor q)$
$\iff [\lnot(\lnot p \lor q) \to \lnot\lnot r]$ By the 1st Substitution Rule, and $(s \to t) \iff (\lnot t \to \lnot s)$ for any primitive statements s, t
$\iff [(\lnot\lnot p \land \lnot q) \to r]$ By DeMorgan's Law, Double Negation, and the 2nd Substitution Rule
$\iff [(p \land \lnot q) \to r]$ By Double Negation and the 2nd Substitution Rule
3. a) For any primitive statement s, $s \lor \lnot s \iff T_0$. Replace each occurrence of s by $p \lor (q \land r)$,
and the result follows by the 1st Substitution Rule.
b) For any primitive statements s, t, we have $(s \to t) \iff (\lnot t \to \lnot s)$. Replace each
occurrence of s by $p \lor q$, and each occurrence of t by r, and the result is a consequence of the
1st Substitution Rule.
5. a) Kelsey placed her studies before her interest in cheerleading, but she (still) did not get a
good education.
b) Norma is not doing her mathematics homework or Karen is not practicing her piano lesson.
c) Harold did pass his C++ course and he did finish his data structures project, but he did not
graduate at the end of the semester.
7. a)
P|qd | pVqA(pA(pag@)) | pag
0] 0 0 0
0 | 1 0 0
1 |0 0 0
1] 1 1 1
b) (=pAq)Vv
(py (pv q)) = pvg
9. a) If 0 + 0 = 0, then 1 + 1 = 1. (FALSE)
Contrapositive: If $1 + 1 \ne 1$, then $0 + 0 \ne 0$. (FALSE)
Converse: If 1 + 1 = 1, then 0 + 0 = 0. (TRUE)
Inverse: If $0 + 0 \ne 0$, then $1 + 1 \ne 1$. (TRUE)
b) If —1 <3 and3+7 = 10, then sin (2) = —1. (TRUE)
Converse: If sin (3) = —1, then —1 <3 and 3+ 7 = 10. (TRUE)
Inverse: If —1 > 3 or3 +7 # 10, then sin (=) # —1. (TRUE)
Contrapositive: If sin (=) # —1, then -1 > 30r3+7 4 10. (TRUE)
11. a) ¢q>r)Vv—p_ b) (-qvr)Vvo7p
13.
[pe ga@qer)aAtreop)) | (pr gQagoryatr—
p))
ss
™
|}
>
OCOoOm
Oooocooo
HB rPoorHco
Or
coo
oF
Oooo
rRrer
Or
SR
Fe
Se
e
15. a) (ptp) by) (ptp)t@tga oo wt@atrt@g 4d) pt@t”
e) (r ts) t (r ts), where r stands for p ¢ (g 7g) ands forg t+ (p ft p)
17.
P{q|~Pl@) | opto | -Oot®@ | Gpel-9
0 | 0 0 0 0 0
0 | 1 1 1 0 0
1] 0 1 1 0 0
1} 1 ] 1 1 1
19. a) pV[pA(pVv4q)] Reasons
<> pvp Absorption Law
<= p Idempotent Law of Vv
ce) [((-pVv 79g) > (pAgAr)] Reasons
= -(4-pV 7g) V(pPAGATr) soreactvl
>} (4p Am) V(pAGANr) DeMorgan’s Laws
S(PAGV(PAGATr) Law of Double Negation
= PpAg Absorption Law
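Section 2.3, p. 84
1. a)
p | q | r | $p \to q$ | $p \lor q$ | $(p \lor q) \to r$
0 | 0 | 0 | 1 | 0 | 1
0 | 0 | 1 | 1 | 0 | 1
0 | 1 | 0 | 1 | 1 | 0
0 | 1 | 1 | 1 | 1 | 1
1 | 0 | 0 | 0 | 1 | 0
1 | 0 | 1 | 0 | 1 | 1
1 | 1 | 0 | 1 | 1 | 0
1 | 1 | 1 | 1 | 1 | 1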
The validity of the argument follows from the results in the last row. (The first seven rows may
be ignored.)
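c)
p | q | r | $q \lor r$ | $p \lor (q \lor r)$ | $\lnot q$ | $p \lor r$
0 | 0 | 0 | 0 | 0 | 1 | 0
0 | 0 | 1 | 1 | 1 | 1 | 1
0 | 1 | 0 | 1 | 1 | 0 | 0
0 | 1 | 1 | 1 | 1 | 0 | 1
1 | 0 | 0 | 0 | 1 | 1 | 1
1 | 0 | 1 | 1 | 1 | 1 | 1
1 | 1 | 0 | 1 | 1 | 0 | 1
1 | 1 | 1 | 1 | 1 | 0 | 1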
The results in rows 2, 5, and 6 establish the validity of the given argument. (The results in the
other five rows of the table may be disregarded.)
3. a) If p has the truth value 0, then so does $p \land q$.
b) When $p \lor q$ has the truth value 0, then the truth value of p (and that of q) is 0.
c) If q has truth value 0, then the truth value of $[(p \lor q) \land \lnot p]$ is 0, regardless of the truth
value of p.
d) The statement $q \lor s$ has truth value 0 only when each of q, s has truth value 0. Then
$(p \to q)$ has truth value 1 when p has truth value 0; $(r \to s)$ has truth value 1 when r has truth
value 0. But then $(p \lor r)$ must have truth value 0, not 1.
5. a) Rule of Conjunctive Simplification
b) Invalid: attempt to argue by the converse
c) Modus Tollens
d) Rule of Disjunctive Syllogism
e) Invalid: attempt to argue by the inverse
7. 1) and 2) Premise
3) Steps (1) and (2) and the Rule of Detachment
4) Premise
5) Step (4) and $(r \to \lnot q) \iff (\lnot\lnot q \to \lnot r) \iff (q \to \lnot r)$
6) Steps (3) and (5) and the Rule of Detachment
7) Premise
8) Steps (6) and (7) and the Rule of Disjunctive Syllogism
9) Step (8) and the Rule of Disjunctive Amplification
9. a)
1) Premise (The Negation of the Conclusion)
2) Step (1) and $\lnot(\lnot q \to s) \iff \lnot(\lnot\lnot q \lor s) \iff \lnot(q \lor s) \iff \lnot q \land \lnot s$
3) Step (2) and the Rule of Conjunctive Simplification
4) Premise
5) Steps (3) and (4) and the Rule of Disjunctive Syllogism
6) Premise
7) Step (2) and the Rule of Conjunctive Simplification
8) Steps (6) and (7) and Modus Tollens
9) Premise
10) Steps (8) and (9) and the Rule of Disjunctive Syllogism
11) Steps (5) and (10) and the Rule of Conjunction
12) Step (11) and the Method of Proof by Contradiction
b) 1) $p \to q$ Premise
2) $\lnot q \to \lnot p$ Step (1) and $(p \to q) \iff (\lnot q \to \lnot p)$
3) $p \lor r$ Premise
4) $\lnot p \to r$ Step (3) and $(p \lor r) \iff (\lnot p \to r)$
5) $\lnot q \to r$ Steps (2) and (4) and the Law of the Syllogism
6) $\lnot r \lor s$ Premise
7) $r \to s$ Step (6) and $(\lnot r \lor s) \iff (r \to s)$
8) $\therefore \lnot q \to s$ Steps (5) and (7) and the Law of the Syllogism
11. a) p:1 ¢:0 rl c) p,g.r 5:0
b) p:0 ¢:0 r:Oorl d) p.g.r:1 s:0
p:0 glo ord
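13. a)
p | q | r | $p \lor q$ | $\lnot p \lor r$ | $(p \lor q) \land (\lnot p \lor r)$ | $q \lor r$ | $[(p \lor q) \land (\lnot p \lor r)] \to (q \lor r)$
0 | 0 | 0 | 0 | 1 | 0 | 0 | 1
0 | 0 | 1 | 0 | 1 | 0 | 1 | 1
0 | 1 | 0 | 1 | 1 | 1 | 1 | 1
0 | 1 | 1 | 1 | 1 | 1 | 1 | 1
1 | 0 | 0 | 1 | 0 | 0 | 0 | 1
1 | 0 | 1 | 1 | 1 | 1 | 1 | 1
1 | 1 | 0 | 1 | 0 | 0 | 1 | 1
1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
From the last column of the truth table it follows that $[(p \lor q) \land (\lnot p \lor r)] \to (q \lor r)$ is a
tautology.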
b) (i) Steps Reasons
1) pv(q@aAr) Premise
2) (pVqQA@vr) Step (1) and the Distributive Law of Vv over A
3) pvr Step (2) and the Rule of Conjunctive Simplification
4) pos Premise
5) -pVs Step (4), p> s@rpvs
6) rvs Steps (3), (5), the Rule of Conjunction, and Resolution
(iii) Steps Reasons
1) pvq Premise
2) por Premise
3) a=pvr Step (2), p>rqa-rpvr
4) [(pVg)ACpyvr)] Steps (1), (3), and the Rule of Conjunction
5) qvr Step (4) and Resolution
6) rs Premise
7) —rVs Step (6), r->s<@q-7rvs
8) [7 v@gA(rrvs)] Steps (5), (7), the Commutative Law of v, and the Rule of
Conjunction
9) avs Step (8) and Resolution
(iv) Steps Reasons
1) -~pvVqvr Premise
2) gqV(—pvr) Step (1) and the Commutative and
Associative Laws of V
3) ~—g Premise
4) -qgvV(-pvr) Step (3) and the Rule of Disjunctive
Amplification
5) [IgV (sp Vvr)JAl-¢ Vv (Apvr)]] Steps (2), (4), and the Rule of Conjunction
6) (7pvr) Step (5), Resolution, and the Idempotent
Law of A
7) —r Premise
8) -—rv-p Step (7) and the Rule of Disjunctive
Amplification
9) [(r Vap)A(-rv mp) Steps (6), (8), the Commutative Law of v,
and the Rule of Conjunction
10) ..-=p Step (9), Resolution, and the Idempotent
Law of v
c) Consider the following assignments.
p: Jonathan has his driver’s license.
q: Jonathan’s new car is out of gas.
r: Jonathan likes to drive his new car.
Then the given argument can be written in symbolic form as
—“pVd
pV-7r
—qVv-r
or
Steps Reasons
1) ~pv@q Premise
2) pv-r Premise
3) (pv -7r) A (-p Vv q) Steps (2), (1), and the Rule of Conjunction
4) -rvq Step (3) and Resolution
5) qv-r Step (4) and the Commutative Law of v
6) —~gV-r Premise
7) (qV~mr) A (-q¢ Vv 7r) Steps (5), (6), and the Rule of Conjunction
8) =r Vv -r Step (7) and Resolution
9) olor Step (8) and the Idempotent Law of v
Section 2.4, p. 100
1. a) False b) False c) False d) True e) False f) False
3. Statements (a), (c), and (e) are true, and statements (b), (d), and (f) are false.
a) dx [m(x) A c(x) A j(x)] True
b) Ax [s(x) A c(x) A >m(x)] True
c) Vx [c(x) > (n(x) ¥ p(x))] False
d) Wx [(g(4) A e(x)) > p(x), or True
Vx [(p(x) A c(x)) > mg(x)], or
Vx [(g(x) A p(x)) > me(x)]
e) Wx [(c(x) A s(x)) > (p(x) Y e(x))] True
- a) (i) Ax g(x)
(ii) Ax [p(x) Aqg(x)]
(iii) Vx (g(x) > =F)
(iv) Wx [¢(x) > 71(x)]
(v) Ax [gQx) At@)]
(vi) Vx [(¢(x) Ar(x)) > s(x)]
b) Statements (i), (ii), (v), and (vi) are true. Statements (iii) and (iv) are false; x = 10 provides
a counterexample for either statement.
c) (i) Ifx is a perfect square, then x > 0.
(ii) Ifx is divisible by 4, then x is even.
(iii) If x is divisible by 4, then x is not divisible by 5.
(iv) There exists an integer that is divisible by 4, but it is not a perfect square.
d) (i) Letx = 0. (iii) Let x = 20.
- a) (i) True __ (ii) False Considerx = 3,
(iii) True (iv) True
c) (i) True (ii) True
(iii) True (iv) False
For x = 2 or S, the truth value of p(x) is 1
while that of r(x) is 0.
11. a) In this case the variable x is free, while the variables y, z are bound.
b) Here the variables x, y are bound; the variable z is free.
13. a) p(2, 3) A p(3, 3) A p65, 3)
b) [p(2, 2) V p(2. 3) v p(2, 5)] Vv [pG, 2) v pG, 3) Vv pG, 5)] Vv [p65, 2) v pt, 3) Vv pG. 5)]
15. a) The proposed negation is correct and is a true statement.
b) The proposed negation is wrong. A correct version of the negation is: For all rational
numbers x, y, the sum x + y is rational. This correct version of the negation is a true statement.
d) The proposed negation is wrong. A correct version of the negation is: For all integers x, y, if
x, y are both odd, then xy is even. The (original) statement is true.
17. a) There exists an integer n such that n is not divisible by 2 but n is even (that is, not odd).
b) There exist integers k, m, n such that k — m and m — n are odd, and k — n is odd.
d) There exists a real number x such that |x — 3| < 7 and either x < —4 or x > 10.
19. a) Statement: For all positive integers m, n, if $m > n$, then $m^2 > n^2$. (TRUE)
Converse: For all positive integers m, n, if $m^2 > n^2$, then $m > n$. (TRUE)
Inverse: For all positive integers m, n, if $m \le n$, then $m^2 \le n^2$. (TRUE)
Contrapositive: For all positive integers m, n, if $m^2 \le n^2$, then $m \le n$. (TRUE)
b) Statement: For all integers a, b, if $a > b$, then $a^2 > b^2$. (FALSE: let a = 1 and b = −2.)
Converse: For all integers a, b, if $a^2 > b^2$, then $a > b$. (FALSE: let a = −5 and b = 3.)
Inverse: For all integers a, b, if $a \le b$, then $a^2 \le b^2$. (FALSE: let a = −5 and b = 3.)
Contrapositive: For all integers a, b, if $a^2 \le b^2$, then $a \le b$. (FALSE: let a = 1 and
b = −2.)
c) Statement: For all integers m, n, and p, if m divides n and n divides p, then m divides p.
(TRUE)
Converse: For all integers m and p, if m divides p, then for each integer n it follows that m
divides n and n divides p. (FALSE: let m = 1, n = 2, and p = 3.)
Inverse: For all integers m, n, and p, if m does not divide n or n does not divide p, then m
does not divide p. (FALSE: let m = 1, n = 2, and p = 3.)
Contrapositive: For all integers m and p, if m does not divide p, then for each integer n it
follows that m does not divide n or n does not divide p. (TRUE)
e) Statement: $\forall x\,[(x^2 + 4x - 21 > 0) \to [(x > 3) \lor (x < -7)]]$ (TRUE)
Converse: $\forall x\,[[(x > 3) \lor (x < -7)] \to (x^2 + 4x - 21 > 0)]$ (TRUE)
Inverse: $\forall x\,[(x^2 + 4x - 21 \le 0) \to [(x \le 3) \land (x \ge -7)]]$, or $\forall x\,[(x^2 + 4x - 21 \le 0) \to (-7 \le x \le 3)]$ (TRUE)
Contrapositive: $\forall x\,[[(x \le 3) \land (x \ge -7)] \to (x^2 + 4x - 21 \le 0)]$, or $\forall x\,[(-7 \le x \le 3) \to (x^2 + 4x - 21 \le 0)]$ (TRUE)
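21. a) True b) False c) False d) True e) False
23. a) $\forall a\, \exists b\,[a + b = b + a = 0]$ b) $\exists u\, \forall a\,[au = ua = a]$ c) $\forall a \ne 0\ \exists b\,[ab = ba = 1]$
d) The statement in part (b) remains true, but the statement in part (c) is no longer true for this
new universe.
25. a) $\exists x\, \exists y\,[(x > y) \land (x - y \le 0)]$ b) $\exists x\, \exists y\,[(x < y) \land \forall z\,[x \ge z \lor z \ge y]]$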
Section 2.5—p. 116 . Although we may write 28 = 25 + 1 + 1 + 1 = 16 + 4 + 4 + 4, there is no way to express 28
as the sum of at most three perfect squares.
30 = 25 + 4 + 1    40 = 36 + 4         50 = 25 + 25
32 = 16 + 16       42 = 25 + 16 + 1    52 = 36 + 16
34 = 25 + 9        44 = 36 + 4 + 4     54 = 25 + 25 + 4
36 = 36            46 = 36 + 9 + 1     56 = 36 + 16 + 4
38 = 36 + 1 + 1    48 = 16 + 16 + 16   58 = 49 + 9
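The claims about 28 and about the even integers from 30 to 58 can be confirmed by exhaustive search. This is a small sketch under the assumption that "perfect square" means the square of a positive integer; the helper name `three_square_sums` is hypothetical.

```python
from itertools import combinations_with_replacement
from math import isqrt

def three_square_sums(n):
    """List every way to write n as a sum of one, two, or three positive perfect squares."""
    squares = [i * i for i in range(1, isqrt(n) + 1)]
    found = []
    for count in (1, 2, 3):
        # combinations_with_replacement yields nondecreasing tuples, so each sum appears once
        for combo in combinations_with_replacement(squares, count):
            if sum(combo) == n:
                found.append(combo)
    return found

print(three_square_sums(28))   # no representation exists
print(three_square_sums(48))   # includes (16, 16, 16)
```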
. a) The real number $\pi$ is not an integer.
c) All administrative directors know how to delegate authority.
d) Quadrilateral MN PQ is not equiangular.
. a) When the statement $\exists x\,[p(x) \lor q(x)]$ is true, there is at least one element c in the
prescribed universe where $p(c) \lor q(c)$ is true. Hence at least one of the statements p(c), q(c)
has the truth value 1, so at least one of the statements $\exists x\, p(x)$ and $\exists x\, q(x)$ is true. Therefore, it
follows that $\exists x\, p(x) \lor \exists x\, q(x)$ is true, and $\exists x\,[p(x) \lor q(x)] \Rightarrow \exists x\, p(x) \lor \exists x\, q(x)$.
Conversely, if $\exists x\, p(x) \lor \exists x\, q(x)$ is true, then at least one of p(a), q(b) has the truth value 1,
for some a, b in the prescribed universe. Assume without loss of generality that it is p(a). Then
$p(a) \lor q(a)$ has truth value 1, so $\exists x\,[p(x) \lor q(x)]$ is a true statement, and
$\exists x\, p(x) \lor \exists x\, q(x) \Rightarrow \exists x\,[p(x) \lor q(x)]$.
b) First consider when the statement $\forall x\,[p(x) \land q(x)]$ is true. This occurs when $p(a) \land q(a)$ is
true for each a in the prescribed universe. Then p(a) is true [as is q(a)] for all a in the universe,
so the statements $\forall x\, p(x)$ and $\forall x\, q(x)$ are true. Therefore, the statement $\forall x\, p(x) \land \forall x\, q(x)$ is
true and $\forall x\,[p(x) \land q(x)] \Rightarrow \forall x\, p(x) \land \forall x\, q(x)$. Conversely, suppose that $\forall x\, p(x) \land \forall x\, q(x)$
is a true statement. Then $\forall x\, p(x)$, $\forall x\, q(x)$ are both true. So now let c be any element in the
prescribed universe. Then p(c), q(c), and $p(c) \land q(c)$ are all true. And since c was chosen
arbitrarily, it follows that the statement $\forall x\,[p(x) \land q(x)]$ is true, and
$\forall x\, p(x) \land \forall x\, q(x) \Rightarrow \forall x\,[p(x) \land q(x)]$.
9. 1) Premise
2) Premise
3) Step (1) and the Rule of Universal Specification
4) Step (2) and the Rule of Universal Specification
5) Step (4) and the Rule of Conjunctive Simplification
6) Steps (5) and (3) and Modus Ponens
7) Step (6) and the Rule of Conjunctive Simplification
8) Step (4) and the Rule of Conjunctive Simplification
9) Steps (7) and (8) and the Rule of Conjunction
10) Step (9) and the Rule of Universal Generalization
11. Consider the open statements
w(x): x works for the credit union
$\ell(x)$: x writes loan applications
c(x): x knows COBOL
q(x): x knows Excel
and let r represent Roxe and i represent Imogene.
In symbolic form the given argument is as follows:
$\forall x\,[w(x) \to c(x)]$
$\forall x\,[(w(x) \land \ell(x)) \to q(x)]$
$w(r) \land \neg q(r)$
$q(i) \land \neg c(i)$
$\therefore \neg \ell(r) \land \neg w(i)$
The steps (and reasons) needed to verify this argument can now be presented.
Steps Reasons
1) $\forall x\,[w(x) \to c(x)]$   Premise
2) $q(i) \land \neg c(i)$   Premise
3) $\neg c(i)$   Step (2) and the Rule of Conjunctive Simplification
4) $w(i) \to c(i)$   Step (1) and the Rule of Universal Specification
5) $\neg w(i)$   Steps (3) and (4) and Modus Tollens
6) $\forall x\,[(w(x) \land \ell(x)) \to q(x)]$   Premise
7) $w(r) \land \neg q(r)$   Premise
8) $\neg q(r)$   Step (7) and the Rule of Conjunctive Simplification
9) $(w(r) \land \ell(r)) \to q(r)$   Step (6) and the Rule of Universal Specification
10) $\neg(w(r) \land \ell(r))$   Steps (8) and (9) and Modus Tollens
11) $w(r)$   Step (7) and the Rule of Conjunctive Simplification
12) $\neg w(r) \lor \neg \ell(r)$   Step (10) and DeMorgan's Law
13) $\neg \ell(r)$   Steps (11) and (12) and the Rule of Disjunctive Syllogism
14) $\therefore \neg \ell(r) \land \neg w(i)$   Steps (13) and (5) and the Rule of Conjunction
13. a) Contrapositive: For all integers k and $\ell$, if k, $\ell$ are not both odd, then $k\ell$ is not odd. OR,
For all integers k and $\ell$, if at least one of k, $\ell$ is even, then $k\ell$ is even.
Proof: Let us assume (without loss of generality) that k is even. Then k = 2c for some
integer c, because of Definition 2.8. Then $k\ell = (2c)\ell = 2(c\ell)$, by the associative law of
multiplication for integers, and $c\ell$ is an integer. Consequently, $k\ell$ is even, once again by
Definition 2.8. (Note that this result does not require anything about the integer $\ell$.)
15. Proof: Assume that for some integer n, $n^2$ is odd while n is not odd. Then n is even and we may
write n = 2a, for some integer a, by Definition 2.8. Consequently, $n^2 = (2a)^2 = (2a)(2a) =
(2 \cdot 2)(a \cdot a)$, by the commutative and associative laws of multiplication for integers. Hence, we
may write $n^2 = 2(2a^2)$, with $2a^2$ an integer, and this means that $n^2$ is even. Thus we have
arrived at a contradiction, since we now have $n^2$ both odd (at the start) and even. This
contradiction came about from the false assumption that n is not odd. Therefore, for every
integer n, it follows that $n^2$ odd $\Rightarrow$ n odd.
17. Proof:
(1) Since n is odd, we have n = 2a + 1 for some integer a. Then n + 11 = (2a + 1) + 11 =
2a + 12 = 2(a + 6), where a + 6 is an integer. So by Definition 2.8 it follows that
n + 11 is even.
(2) If n + 11 is not even, then it is odd and we have n + 11 = 2b + 1, for some integer b. So
n = (2b + 1) − 11 = 2b − 10 = 2(b − 5), where b − 5 is an integer, and it follows from
Definition 2.8 that n is even, that is, not odd.
(3) In this case we stay with the hypothesis (that n is odd) and also assume that n + 11 is
not even, hence odd. So we may write n + 11 = 2b + 1, for some integer b. This then
implies that n = 2(b − 5), for the integer b − 5. So by Definition 2.8 it follows that n is
even. But with n both even (as shown) and odd (as in the hypothesis), we have arrived at
a contradiction. So our assumption was wrong, and it now follows that n + 11 is even for
every odd integer n.
19. This result is not true, in general. For example, $m = 4 = 2^2$ and $n = 1 = 1^2$ are two positive
integers that are perfect squares, but $m + n = 2^2 + 1^2 = 5$ is not a perfect square.
21. Proof:
We shall prove the given result by establishing the truth of its (logically equivalent)
contrapositive.
Let us consider the negation of the conclusion — that is, x < 50 and y < 50. Then with
x < 50 and y < 50 it follows that x + y < 50 + 50 = 100, and we have the negation of the
hypothesis. The given result now follows by this indirect method of proof (by the
contrapositive).
23. Proof: If n is odd, then n = 2k + 1 for some (particular) integer k. Then 7n + 8 = 7(2k + 1) +
8 = 14k + 7 + 8 = 14k + 15 = 14k + 14 + 1 = 2(7k + 7) + 1. It then follows from Definition
2.8 that 7n + 8 is odd.
To establish the converse, suppose that n is not odd. Then n is even, so we can write n = 2t,
for some (particular) integer t. But then 7n + 8 = 7(2t) + 8 = 14t + 8 = 2(7t + 4), so it
follows from Definition 2.8 that 7n + 8 is even, that is, 7n + 8 is not odd. Consequently, the
converse follows by contraposition.
Supplementary Exercises—p. 120
$p$ | $q$ | $r$ | $s$ | $q \land r$ | $\neg(s \lor r)$ | $t$: $(q \land r) \to \neg(s \lor r)$ | $p \leftrightarrow t$
0 | 0 | 0 | 0 | 0 | 1 | 1 | 0
0 | 0 | 0 | 1 | 0 | 0 | 1 | 0
0 | 0 | 1 | 0 | 0 | 0 | 1 | 0
0 | 0 | 1 | 1 | 0 | 0 | 1 | 0
0 | 1 | 0 | 0 | 0 | 1 | 1 | 0
0 | 1 | 0 | 1 | 0 | 0 | 1 | 0
0 | 1 | 1 | 0 | 1 | 0 | 0 | 1
0 | 1 | 1 | 1 | 1 | 0 | 0 | 1
1 | 0 | 0 | 0 | 0 | 1 | 1 | 1
1 | 0 | 0 | 1 | 0 | 0 | 1 | 1
1 | 0 | 1 | 0 | 0 | 0 | 1 | 1
1 | 0 | 1 | 1 | 0 | 0 | 1 | 1
1 | 1 | 0 | 0 | 0 | 1 | 1 | 1
1 | 1 | 0 | 1 | 0 | 0 | 1 | 1
1 | 1 | 1 | 0 | 1 | 0 | 0 | 0
1 | 1 | 1 | 1 | 1 | 0 | 0 | 0
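3. a)
$p$ | $q$ | $r$ | $q \leftrightarrow r$ | $p \leftrightarrow (q \leftrightarrow r)$ | $p \leftrightarrow q$ | $(p \leftrightarrow q) \leftrightarrow r$
0 | 0 | 0 | 1 | 0 | 1 | 0
0 | 0 | 1 | 0 | 1 | 1 | 1
0 | 1 | 0 | 0 | 1 | 0 | 1
0 | 1 | 1 | 1 | 0 | 0 | 0
1 | 0 | 0 | 1 | 1 | 0 | 1
1 | 0 | 1 | 0 | 0 | 0 | 0
1 | 1 | 0 | 0 | 0 | 1 | 0
1 | 1 | 1 | 1 | 1 | 1 | 1
It follows from the results in columns 5 and 7 that $[p \leftrightarrow (q \leftrightarrow r)] \Leftrightarrow [(p \leftrightarrow q) \leftrightarrow r]$.
b) The truth value assignments p: 0; q: 0; r: 0 result in the truth value 1 for $[p \to (q \to r)]$ and
the truth value 0 for $[(p \to q) \to r]$. Consequently, these statements are not logically equivalent.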
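Both parts of this exercise can be checked mechanically over the eight truth assignments. The sketch below uses helper names (`iff`, `implies`) chosen for illustration; they are not notation from the text.

```python
from itertools import product

def iff(a, b):
    """Biconditional: true exactly when both sides have the same truth value."""
    return a == b

def implies(a, b):
    """Material implication."""
    return (not a) or b

rows = list(product([False, True], repeat=3))

# Part (a): the biconditional is associative on every row.
biconditional_associative = all(
    iff(p, iff(q, r)) == iff(iff(p, q), r) for p, q, r in rows
)

# Part (b): rows where the two groupings of implication disagree.
implication_counterexamples = [
    (p, q, r) for p, q, r in rows
    if implies(p, implies(q, r)) != implies(implies(p, q), r)
]

print(biconditional_associative)
print(implication_counterexamples)
```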
5. (1) If Kaylyn does not practice her piano lessons, then she cannot go to the movies.
(2) If Kaylyn is to go to the movies, then she will have to practice her piano lessons.
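7. a) $(\neg p \lor \neg q) \land (F_0 \lor p) \land p$
b) $(\neg p \lor \neg q) \land (F_0 \lor p) \land p$
$\Leftrightarrow (\neg p \lor \neg q) \land (p \land p)$   $F_0 \lor p \Leftrightarrow p$
$\Leftrightarrow (\neg p \lor \neg q) \land p$   Idempotent Law of $\land$
$\Leftrightarrow p \land (\neg p \lor \neg q)$   Commutative Law of $\land$
$\Leftrightarrow (p \land \neg p) \lor (p \land \neg q)$   Distributive Law of $\land$ over $\lor$
$\Leftrightarrow F_0 \lor (p \land \neg q)$   $p \land \neg p \Leftrightarrow F_0$
$\Leftrightarrow p \land \neg q$   $F_0$ is the identity for $\lor$.
9. a) contrapositive b) inverse c) contrapositive d) inverse e) converse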
11. a)
$p$ | $q$ | $r$ | $p \veebar q$ | $(p \veebar q) \veebar r$ | $q \veebar r$ | $p \veebar (q \veebar r)$
0 | 0 | 0 | 0 | 0 | 0 | 0
0 | 0 | 1 | 0 | 1 | 1 | 1
0 | 1 | 0 | 1 | 1 | 1 | 1
0 | 1 | 1 | 1 | 0 | 0 | 0
1 | 0 | 0 | 1 | 1 | 0 | 1
1 | 0 | 1 | 1 | 0 | 1 | 0
1 | 1 | 0 | 0 | 0 | 1 | 0
1 | 1 | 1 | 0 | 1 | 0 | 1
It follows from the results in columns 5 and 7 that $[(p \veebar q) \veebar r] \Leftrightarrow [p \veebar (q \veebar r)]$.
b) The given statements are not logically equivalent. The truth value assignments p: 1; q: 0;
r: 0 provide a counterexample.
13. a) True b) False c) True d) True e) False f) False g) False h) True
15. Suppose that the 62 squares in this $8 \times 8$ chessboard (with two opposite missing corners) can be
covered with 31 dominos. The chessboard contains 30 blue squares and 32 white ones. Each
domino covers one blue and one white square, for a total of 31 blue squares and 31 white ones.
This contradiction tells us that we cannot cover this 62-square chessboard with the 31 dominos.
Chapter 3
Set Theory
Section 3.1—p. 134 1. They are all the same set.
3. Parts (b) and (d) are false; the remaining parts are true.
5. a) {0,2} b) {2,23.33,55,75} ¢) {0, 2, 12, 36, 80}
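7. a) $\forall x\,[x \in A \to x \in B] \land \exists x\,[x \in B \land x \notin A]$
b) $\exists x\,[x \in A \land x \notin B] \lor \forall x\,[x \notin B \lor x \in A]$
OR, $\exists x\,[x \in A \land x \notin B] \lor \forall x\,[x \in B \to x \in A]$
. a) $|A| = 6$ b) $|B| = 7$ c) If B has $2^n$ subsets of odd cardinality, then $|B| = n + 1$.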
11. a) 31 b) 30 ¢) 28 13. a) (2) b) @) 9 @+A4+Q4+@)
15. Let W = {1}, X = {{1}, 2}, and Y = {X, 3}.
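17. c) If $x \in A$, then $A \subseteq B \Rightarrow x \in B$, and $B \subset C \Rightarrow x \in C$. Hence $A \subseteq C$. Since $B \subset C$, there
exists $y \in C$ with $y \notin B$. Also, $A \subseteq B$ and $y \notin B \Rightarrow y \notin A$. Consequently, $A \subseteq C$ and $y \in C$
with $y \notin A \Rightarrow A \subset C$.
d) Since $A \subset B$, it follows that $A \subseteq B$. The result then follows from part (c).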
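19. a) For $n, k \in \mathbf{Z}^+$ with $n \ge k + 1$, consider the hexagon centered at $\binom{n}{k}$. This has the form
$\binom{n-1}{k-1}$   $\binom{n-1}{k}$
$\binom{n}{k-1}$   $\binom{n}{k}$   $\binom{n}{k+1}$
$\binom{n+1}{k}$   $\binom{n+1}{k+1}$
where the two alternating triples, namely $\binom{n-1}{k-1}, \binom{n}{k+1}, \binom{n+1}{k}$ and
$\binom{n-1}{k}, \binom{n+1}{k+1}, \binom{n}{k-1}$, satisfy $\binom{n-1}{k-1}\binom{n}{k+1}\binom{n+1}{k} =
\binom{n-1}{k}\binom{n+1}{k+1}\binom{n}{k-1}$.
b) For $n, k \in \mathbf{Z}^+$ with $n \ge k + 1$,
$\binom{n-1}{k-1}\binom{n}{k+1}\binom{n+1}{k} = \left[\frac{(n-1)!}{(k-1)!\,(n-k)!}\right]\left[\frac{n!}{(k+1)!\,(n-k-1)!}\right]\left[\frac{(n+1)!}{k!\,(n+1-k)!}\right]$
$= \left[\frac{(n-1)!}{k!\,(n-1-k)!}\right]\left[\frac{(n+1)!}{(k+1)!\,(n-k)!}\right]\left[\frac{n!}{(k-1)!\,(n-k+1)!}\right] = \binom{n-1}{k}\binom{n+1}{k+1}\binom{n}{k-1}$.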
21. n= 20
23. The fifth, sixth, and seventh entries in the row for n = 14 provide the unique solution.
25. As an ordered set, A = {x, v, w, z, y}.
27. a) If $S \in S$, then since $S = \{A \mid A \notin A\}$ we have $S \notin S$.
b) If $S \notin S$, then by the definition of S it follows that $S \in S$.
Section 3.2—p. 146
1. a) {1,2,3,5} b) A ce) andd) U-{2} e) {4,8}
f) {1,2,3,4,5,8} 9) - h) {2,4,8} i {1,3,4,5, 8}
3. a) A= (1,3.4,7,9, 1} = (2,4, 6,8, 9}
b) C = {1, 2, 4.5, 9} 6185)
5. a) True b) True > True dd) False’ e) True
f) True g) True’ h) False _ 1) False
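. a) Let U = {1, 2, 3}, A = {1}, B = {2}, and C = {3}. Then $A \cap C = B \cap C = \emptyset$ but $A \ne B$.
b) For U = {1, 2}, A = {1}, B = {2}, and C = U, we have $A \cup C = B \cup C$ but $A \ne B$. [From
parts (a) and (b) we see that we do not have cancellation laws for $\cap$ or $\cup$. This differs from what
we know about $\mathbf{R}$, where for $a, b, c \in \mathbf{R}$, (i) $ab = ac$ and $a \ne 0 \Rightarrow b = c$; and (ii) $a + b =
a + c \Rightarrow b = c$.]
c) $x \in A \Rightarrow x \in A \cup C \Rightarrow x \in B \cup C$. So $x \in B$ or $x \in C$. If $x \in B$, then we are finished. If
$x \in C$, then $x \in A \cap C = B \cap C$ and $x \in B$. In either case, $x \in B$ so $A \subseteq B$. Likewise,
$y \in B \Rightarrow y \in B \cup C = A \cup C$, so $y \in A$ or $y \in C$. If $y \in C$, then $y \in B \cap C = A \cap C$. In either
case, $y \in A$ and $B \subseteq A$. Hence A = B.
d) Let $x \in A$. Consider two cases: (1) $x \in C \Rightarrow x \notin A \,\triangle\, C \Rightarrow x \notin B \,\triangle\, C \Rightarrow x \in B$.
(2) $x \notin C \Rightarrow x \in A \,\triangle\, C \Rightarrow x \in B \,\triangle\, C \Rightarrow x \in B$ (because $x \notin C$). In either case, $x \in B$, so
$A \subseteq B$. In a similar way it follows that $B \subseteq A$ and A = B.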
7.1
~a) B=(AUB)N(AUB)N(AUB)N(AUB) b) A=AU(ANB)
ce) AN B=(AUB)N(AUB)N(AUB) @ A=(ANB)U(ANY)
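13. a) Let U = {1, 2, 3}, A = {1}, and B = {2}. Then $\{1, 2\} \in \mathcal{P}(A \cup B)$ but
$\{1, 2\} \notin \mathcal{P}(A) \cup \mathcal{P}(B)$.
b) $X \in \mathcal{P}(A \cap B) \Leftrightarrow X \subseteq A \cap B \Leftrightarrow X \subseteq A$ and $X \subseteq B \Leftrightarrow X \in \mathcal{P}(A)$ and
$X \in \mathcal{P}(B) \Leftrightarrow X \in \mathcal{P}(A) \cap \mathcal{P}(B)$, so $\mathcal{P}(A \cap B) = \mathcal{P}(A) \cap \mathcal{P}(B)$.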
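The counterexample in part (a) and the identity in part (b) can be illustrated with Python sets. In this sketch `power_set` is a hypothetical helper, and the sets used for part (b) are arbitrary sample values, not taken from the exercise.

```python
from itertools import combinations

def power_set(s):
    """Return the power set of s as a set of frozensets."""
    items = sorted(s)
    return {
        frozenset(c)
        for size in range(len(items) + 1)
        for c in combinations(items, size)
    }

# Part (a): {1, 2} is a subset of A union B, but of neither A nor B alone.
A, B = {1}, {2}
witness = frozenset({1, 2})
print(witness in power_set(A | B))              # True
print(witness in power_set(A) | power_set(B))   # False

# Part (b): sample sets (chosen here only for illustration).
C, D = {1, 2, 3}, {2, 3, 4}
print(power_set(C & D) == power_set(C) & power_set(D))  # True
```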
15. a) 2° b) 2"
c) Inthe membership table, A C B if the columns for A, B are such that whenever a | occurs
in the column for A, there is a corresponding | in the column for B.
a) A\|BIC AUB (ANB)U(BNC)
0!10|0 1 1
o0})/o]1 1 l
0 })1 | 0 0 1
0)1 4/1 0 0
110)0 1 1
1/o)/1 1 1
1/1 /0 1 1
1]1 {1 1 1
17. a) $A \cap (B - A) = A \cap (B \cap \overline{A}) = B \cap (A \cap \overline{A}) = B \cap \emptyset = \emptyset$
b) [AN B)U(AN BNCND)U (ANB) = (ANB) U(ANB) by the Absorption
Law
=(AUVUA)NB=UNB=B
d) AUBU(AN BNC) = (ANB) U[(ANB)NC]=
(ANB) U(AN B)IN[(AN B) UC] =[(ANB) UC] =AUBUC
19, a) [-6,9] ec) 4 e) A, gR
Section 3.3—p. 150
. 55 3. $2^9 + 2^8 - 2^5 = 736$ 5. $9! + 9! - 8! = 685,440$
. a) 241424'—22! b) 26! — [2414 24! — 233]
~~]
. (131/(2)3] — 3f12!/(2!)7] + 3(11!/2!) — 10!
Section 3.4—p. 156
. a) 3/8 b) 1/2 c) 1/4 d) 5/8 e) 5/8 f) 7/8 g) 1/8
. 6 5. a) (8)/(7) =5/22 b) 7/22 7. 49/99
. a) 1/64 b) 3/32 c) 15/64 d) 1/2 e) 11/32 11. a) 55/216 b) 5/54
ay = 1p) 2/15) 3/35
mk
dh
. $Pr(A) = 1/3$, $Pr(B) = 7/15$, $Pr(A \cap B) = 2/15$, $Pr(A \cup B) = 2/3$; $Pr(A \cup B) = 2/3 =
1/3 + 7/15 - 2/15 = Pr(A) + Pr(B) - Pr(A \cap B)$
Section 3.5—p. 164
. Pr(A) = 0.6; Pr(B) =0.7; Pr(A U B) = 0.5; Pr(AU B) = 0.5; Pr(AN B) = 0.2;
Pr(A NB) = 0.1; Pr(AU B) = 0.9; Pr(AU B) = 0.8
. a) S={x, y)lx, ye {l,2,3,...,10},% Ay} b) 1/2 oc) 5/9
. 0.4 7. a) 11/21 b) 12/21) 9/21 9. 3/16
11. a) (i) 27/38 (ii) 27/38 ~—b) «() 81/361 — ii) 18/361
13. 11/14 1. (7) /(*) = 330/3, 176,716,400
17. Since $A \cup B \subseteq \mathcal{S}$, it follows from the result of the preceding exercise that
$Pr(A \cup B) \le Pr(\mathcal{S}) = 1$. So $1 \ge Pr(A \cup B) = Pr(A) + Pr(B) - Pr(A \cap B)$, and
$Pr(A \cap B) \ge Pr(A) + Pr(B) - 1 = 0.7 + 0.5 - 1 = 0.2$.
Section 3.6—p. 173 1. 1/4 3. (0.80)(0.75) = 0.60
5. In general, $Pr(A \cup B) = Pr(A) + Pr(B) - Pr(A \cap B)$. Since A, B are independent,
$Pr(A \cap B) = Pr(A)Pr(B)$. So
$Pr(A \cup B) = Pr(A) + Pr(B) - Pr(A)Pr(B) = Pr(A) + [1 - Pr(A)]Pr(B)
= Pr(A) + Pr(\overline{A})Pr(B)$.
The proof for $Pr(B) + Pr(\overline{B})Pr(A)$ is similar.
7. a) 52/85 b) 11/26 9. 3/7
11. $Pr(A \cap B) = 1/4 = (1/2)(1/2) = Pr(A)Pr(B)$, so the events A, B are independent.
13. 1/5 15. (0.05)(0.02) = 0.001 17. 5/21
19. Any two of the events are independent. However, $Pr(A \cap B \cap C) = 1/4 \ne 1/8 =
(1/2)(1/2)(1/2) = Pr(A)Pr(B)Pr(C)$, so the events A, B, C are not independent.
21. a) 5/16 b) 11/32 c) 11/32 23. 0.6
25. a) $2^5 - \binom{5}{0} - \binom{5}{1} = 26$ b) $2^n - \binom{n}{0} - \binom{n}{1} = 2^n - (n + 1)$ 27. 30/77 29. 0.15
Section 3.7—p. 185
la) 1/4 bye) 7/8 a) 3/4 e) 2/7 f) 1/2
_110
3. a) Pr(X =x) = CHS")
— ,x =0,1,2,3,4,5.
‘ll
b) Pr(X =4)= ‘) = 275/2,268,786
ce) 139/1,134,393 5h 2675/8796
§, a) 2/3 b) 2/3 ce) 1/4 +d) 7/2 e) 35/12
-a)c=1/15 b) 3/5 ec) 7/3-~—s dd) 14/9 9. n = 200, p = 0.35
~]
11. a) (0.75)®§ = 0.100113 ib) (§) (0.25)3 (0.75)° = 0.207642
c) 5°8_, (8)(0.25)* (0.75) = 0.004227
d) 0.037139 (approximately) e) 2 f) 1.5
13. c= 10 15. a) Pr(X = 1) = 1/5; Pr(X = 2) = 16/95; Pr(X = 3) = 12/19
b) 7/19 e¢) 19/35 = d) 231/95 = 2.431579 ee) 5824/9025= 0.645319
17. a) $E(X(X-1)) = \sum_{x=0}^{n} x(x-1)\,Pr(X = x) = \sum_{x=2}^{n} x(x-1)\,Pr(X = x)$
$= \sum_{x=2}^{n} x(x-1)\binom{n}{x} p^x q^{n-x} = \sum_{x=2}^{n} x(x-1)\,\frac{n!}{x!\,(n-x)!}\, p^x q^{n-x}$
$= \sum_{x=2}^{n} \frac{n!}{(x-2)!\,(n-x)!}\, p^x q^{n-x} = p^2 n(n-1) \sum_{x=2}^{n} \frac{(n-2)!}{(x-2)!\,(n-x)!}\, p^{x-2} q^{n-x}$
$= p^2 n(n-1) \sum_{y=0}^{n-2} \frac{(n-2)!}{y!\,(n-(y+2))!}\, p^{y} q^{n-(y+2)}$, substituting $x - 2 = y$,
$= p^2 n(n-1) \sum_{y=0}^{n-2} \frac{(n-2)!}{y!\,[(n-2)-y]!}\, p^{y} q^{(n-2)-y}$
$= p^2 n(n-1)(p+q)^{n-2}$, by the Binomial Theorem
$= p^2 n(n-1)(1)^{n-2} = p^2 n(n-1) = n^2 p^2 - np^2$
b) $Var(X) = E(X^2) - [E(X)]^2 = [E(X(X-1)) + E(X)] - [E(X)]^2 =
[(n^2 p^2 - np^2) + np] - (np)^2 = n^2 p^2 - np^2 + np - n^2 p^2 = np - np^2 = np(1 - p) = npq$.
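The two results, $E(X(X-1)) = n^2p^2 - np^2$ and $Var(X) = npq$, can be checked exactly for a particular binomial distribution by summing over its probability mass function. The sketch below uses exact rational arithmetic; the function name `binomial_moments` and the sample values n = 8, p = 1/4 are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb

def binomial_moments(n, p):
    """Return (E(X), E(X(X-1)), Var(X)) for X binomial(n, p), computed from the pmf."""
    q = 1 - p
    pmf = [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]
    mean = sum(x * pr for x, pr in enumerate(pmf))
    factorial_moment = sum(x * (x - 1) * pr for x, pr in enumerate(pmf))
    # Var(X) = E(X(X-1)) + E(X) - [E(X)]^2, as in part (b)
    variance = factorial_moment + mean - mean**2
    return mean, factorial_moment, variance

n, p = 8, Fraction(1, 4)
mean, factorial_moment, variance = binomial_moments(n, p)
print(mean, factorial_moment, variance)  # 2 7/2 3/2
```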
19. a) Pr(X =2) = 1/4; Pr(X = 3) = 1/8; Pr(X = 4) = 1/4; Pr(X =5) = 1/4;
Pr(X =6)=1/8 b) 31/8 e) 119/64
21. E(X) =4; 0x =1
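Supplementary Exercises—p. 189
1. Suppose that $(A - B) \subseteq C$ and $x \in A - C$. Then $x \in A$ but $x \notin C$. If $x \notin B$, then
$[x \in A \land x \notin B] \Rightarrow x \in (A - B) \subseteq C$. So now we have $x \notin C$ and $x \in C$. This contradiction
gives us $x \in B$, so $(A - C) \subseteq B$.
Conversely, if $(A - C) \subseteq B$, let $y \in A - B$. Then $y \in A$ but $y \notin B$. If $y \notin C$, then
$[y \in A \land y \notin C] \Rightarrow y \in (A - C) \subseteq B$. This contradiction (that is, $y \notin B$ and $y \in B$) yields
$y \in C$, so $(A - B) \subseteq C$.
3. a) The sets U = {1, 2, 3}, A = {1, 2}, B = {1}, and C = {2} provide a counterexample.
b) $A = A \cap U = A \cap (C \cup \overline{C}) = (A \cap C) \cup (A \cap \overline{C})$
$= (A \cap C) \cup (A - C)$
$= (B \cap C) \cup (B - C) = (B \cap C) \cup (B \cap \overline{C}) = B \cap (C \cup \overline{C}) = B \cap U = B$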
a) 126 (if teams wear different uniforms); 63 (if teams are not distinguishable)
Sa
112 (if teams wear different uniforms); 56 (if teams are not distinguishable)
b) 2” — 2; (1/2)(2” — 2). 2” —2 —2n; (1/2)(2" —2—2n),
. a) 128 ~~ b) JA =8
e-~l
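. Suppose that $(A \cap B) \cup C = A \cap (B \cup C)$ and that $x \in C$. Then
$x \in C \Rightarrow x \in (A \cap B) \cup C \Rightarrow x \in A \cap (B \cup C) \subseteq A$, so $x \in A$, and $C \subseteq A$.
Conversely, suppose that $C \subseteq A$.
(1) If $y \in (A \cap B) \cup C$, then $y \in A \cap B$ or $y \in C$.
(i) $y \in A \cap B \Rightarrow y \in (A \cap B) \cup (A \cap C) \Rightarrow y \in A \cap (B \cup C)$.
(ii) $y \in C \Rightarrow y \in A$, because $C \subseteq A$. Also, $y \in C \Rightarrow y \in B \cup C$. So $y \in A \cap (B \cup C)$.
In either case (i) or case (ii), we have $y \in A \cap (B \cup C)$, so $(A \cap B) \cup C \subseteq A \cap (B \cup C)$.
(2) Now let $z \in A \cap (B \cup C)$.
Then $z \in A \cap (B \cup C) = (A \cap B) \cup (A \cap C) \subseteq (A \cap B) \cup C$,
since $A \cap C \subseteq C$.
From parts (1) and (2) it follows that $(A \cap B) \cup C = A \cap (B \cup C)$.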
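The equivalence just proved, that $(A \cap B) \cup C = A \cap (B \cup C)$ holds exactly when $C \subseteq A$, can be tested exhaustively on a small universe. This is an illustrative sketch; the universe {1, 2, 3} and the helper names are assumptions, and a finite check does not replace the proof.

```python
from itertools import combinations

def subsets(universe):
    """Return all subsets of the universe as frozensets."""
    items = sorted(universe)
    return [
        frozenset(c)
        for size in range(len(items) + 1)
        for c in combinations(items, size)
    ]

def equivalence_holds(universe):
    """Check that the set identity holds if and only if C is a subset of A."""
    all_sets = subsets(universe)
    for A in all_sets:
        for B in all_sets:
            for C in all_sets:
                identity = ((A & B) | C) == (A & (B | C))
                if identity != (C <= A):
                    return False
    return True

print(equivalence_holds({1, 2, 3}))  # True
```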
11. a) [0, 14/3] _b) {0}U(6,12] c¢) [0,+00) d) &@
13. a) A|B ANB Since A C B, consider only rows 1,
2, and 4. For these rows, AN B= A.
0 0 0
0 1 0
1 0 0
1 1 1
Ola) BC} (AnB)u(BnG ANG — ForCC BCA, consider only
7 0 0 0 0 rows 1, 5, 7, and 8. Here
0 0 i 0 0 (AN B)U(BNC)=ANC.
0 1 0 1 0
0 1 1 0 0
1 0 0} 1 1
1 0 1 1 0
1 1 0 1 ]
1 1 l 0 0
d) A|BIC AAB AAC BAC When A A B = C, we consider
rows 1, 4, 6, and 7. In these cases,
00 )0;0 | 90 1 00) 01 01 AAC=BandBAC=A.
0o};1 |0 1 0 1
0 | 1 l 1 1 0
1 |0 | 0 ] l 0
1 0 l 1 0 1
I 1/0 0 | 1
l 1 1 0 0 0
15, a) (*') (m<rtl) b) (ET!) kent
17. a) 23. b) 8 19, 7'5 — 3(3'5) +3 24. (7) (2)/(G) = 0.3483
23. a) Dito (7) (6°)= Lede ie
bd (GV Lhe E)GT)] @ (QM) Leo O67]
cit) [(9) + (AV) + OC) + OO) + OO] / Loko GG)]
25. AU B = [-2, 4], AN B = {3} 27. 135/512 = 0.263672
29. $Pr(A \cap (B \cup C)) = Pr((A \cap B) \cup (A \cap C)) =
Pr(A \cap B) + Pr(A \cap C) - Pr((A \cap B) \cap (A \cap C))$. Since A, B, C are independent and
$(A \cap B) \cap (A \cap C) = (A \cap A) \cap (B \cap C) = A \cap B \cap C$, $Pr(A \cap (B \cup C)) = Pr(A)Pr(B) +
Pr(A)Pr(C) - Pr(A)Pr(B)Pr(C) = Pr(A)[Pr(B) + Pr(C) - Pr(B)Pr(C)] =
Pr(A)[Pr(B) + Pr(C) - Pr(B \cap C)] = Pr(A)Pr(B \cup C)$, so A and $B \cup C$ are independent.
31. a) 0.99 b) $(0.99)^3 = 0.970299$ 33. 3/5
35. $\binom{5}{3}(0.8)^3(0.2)^2 + \binom{5}{4}(0.8)^4(0.2) + \binom{5}{5}(0.8)^5 = 0.94208$
37. 675 /2048 39. a) c= 1/50 b) 0.82 c¢) 13/41 d) 2.8 e) 1.64
41. a) 3/(%) ») [(")-3]/@) 9 BAM -3/@)
43. 2/[m(m + 1)]
45. a) Pr(X = 1) =7/16; Pr(X = 2) = 3/8; Pr(X = 3) = 3/16
b) 7/4 ¢) ox = 3/4
Chapter 4
Properties of the Integers: Mathematical Induction
Section 4.1—p. 208
1. b) Since $1 \cdot 3 = (1)(2)(9)/6$, the result is true for n = 1. Assume the result is true for n =
k ($\ge 1$): $1 \cdot 3 + 2 \cdot 4 + 3 \cdot 5 + \cdots + k(k + 2) = k(k + 1)(2k + 7)/6$. Then consider the case
for n = k + 1: $[1 \cdot 3 + 2 \cdot 4 + \cdots + k(k + 2)] + (k + 1)(k + 3) = [k(k + 1)(2k + 7)/6] +
(k + 1)(k + 3) = [(k + 1)/6][k(2k + 7) + 6(k + 3)] = (k + 1)(2k^2 + 13k + 18)/6 =
(k + 1)(k + 2)(2k + 9)/6$. Hence the result follows for all $n \in \mathbf{Z}^+$ by the Principle of
Mathematical Induction.
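c) $S(n)$: $\sum_{i=1}^{n} \frac{1}{i(i+1)} = \frac{n}{n+1}$
$S(1)$: $\sum_{i=1}^{1} \frac{1}{i(i+1)} = \frac{1}{1(2)} = \frac{1}{1+1}$, so $S(1)$ is true.
Assume $S(k)$: $\sum_{i=1}^{k} \frac{1}{i(i+1)} = \frac{k}{k+1}$. Consider $S(k + 1)$.
$\sum_{i=1}^{k+1} \frac{1}{i(i+1)} = \sum_{i=1}^{k} \frac{1}{i(i+1)} + \frac{1}{(k+1)(k+2)} = \frac{k}{k+1} + \frac{1}{(k+1)(k+2)}$
$= [k(k + 2) + 1]/[(k + 1)(k + 2)] = (k + 1)/(k + 2)$,
so $S(k) \Rightarrow S(k + 1)$ and the result follows for all $n \in \mathbf{Z}^+$ by the Principle of Mathematical
Induction.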
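. a) From $\sum_{i=1}^{n} i^3 + (n + 1)^3 = \sum_{i=0}^{n} (i + 1)^3 = \sum_{i=0}^{n} (i^3 + 3i^2 + 3i + 1) = \sum_{i=1}^{n} i^3 + 3\sum_{i=1}^{n} i^2 +
3\sum_{i=1}^{n} i + \sum_{i=0}^{n} 1$, we have
$(n + 1)^3 = 3\sum_{i=1}^{n} i^2 + 3\sum_{i=1}^{n} i + (n + 1)$. Consequently,
$3\sum_{i=1}^{n} i^2 = (n^3 + 3n^2 + 3n + 1) - 3[n(n + 1)/2] - n - 1$
$= n^3 + (3/2)n^2 + (1/2)n$
$= (1/2)[2n^3 + 3n^2 + n] = (1/2)n(2n^2 + 3n + 1)$
$= (1/2)n(n + 1)(2n + 1)$, so
$\sum_{i=1}^{n} i^2 = (1/6)n(n + 1)(2n + 1)$ (as shown in Example 4.4).
b) From $\sum_{i=1}^{n} i^4 + (n + 1)^4 = \sum_{i=0}^{n} (i + 1)^4 = \sum_{i=0}^{n} (i^4 + 4i^3 + 6i^2 + 4i + 1) =
\sum_{i=1}^{n} i^4 + 4\sum_{i=1}^{n} i^3 + 6\sum_{i=1}^{n} i^2 + 4\sum_{i=1}^{n} i + \sum_{i=0}^{n} 1$, it follows that $(n + 1)^4 =
4\sum_{i=1}^{n} i^3 + 6\sum_{i=1}^{n} i^2 + 4\sum_{i=1}^{n} i + \sum_{i=0}^{n} 1$. Consequently,
$4\sum_{i=1}^{n} i^3 = (n + 1)^4 - 6[n(n + 1)(2n + 1)/6] - 4[n(n + 1)/2] - (n + 1)$
$= n^4 + 4n^3 + 6n^2 + 4n + 1 - (2n^3 + 3n^2 + n) - (2n^2 + 2n) - (n + 1)$
$= n^4 + 2n^3 + n^2 = n^2(n^2 + 2n + 1) = n^2(n + 1)^2$.
So $\sum_{i=1}^{n} i^3 = (1/4)n^2(n + 1)^2$ [as shown in part (d) of Exercise 1 for this section].
c) From $\sum_{i=1}^{n} i^5 + (n + 1)^5 = \sum_{i=0}^{n} (i + 1)^5 = \sum_{i=0}^{n} (i^5 + 5i^4 + 10i^3 + 10i^2 + 5i + 1) =
\sum_{i=1}^{n} i^5 + 5\sum_{i=1}^{n} i^4 + 10\sum_{i=1}^{n} i^3 + 10\sum_{i=1}^{n} i^2 + 5\sum_{i=1}^{n} i + \sum_{i=0}^{n} 1$, we have
$5\sum_{i=1}^{n} i^4 = (n + 1)^5 - (10/4)n^2(n + 1)^2 - (10/6)n(n + 1)(2n + 1) - (5/2)n(n + 1) -
(n + 1)$. So
$5\sum_{i=1}^{n} i^4 = n^5 + 5n^4 + 10n^3 + 10n^2 + 5n + 1 - (5/2)n^4
- 5n^3 - (5/2)n^2 - (10/3)n^3 - 5n^2 - (5/3)n - (5/2)n^2 - (5/2)n - n - 1$
$= n^5 + (5/2)n^4 + (5/3)n^3 - (1/6)n$.
Consequently, $\sum_{i=1}^{n} i^4 = (1/30)n(n + 1)(6n^3 + 9n^2 + n - 1)$.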
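The three closed forms obtained above can be compared against direct summation. The following sketch uses hypothetical helper names (`power_sum`, `closed_forms`) and integer arithmetic throughout.

```python
def power_sum(n, exponent):
    """Compute 1^e + 2^e + ... + n^e directly."""
    return sum(i**exponent for i in range(1, n + 1))

def closed_forms(n):
    """Closed forms for the sums of squares, cubes, and fourth powers."""
    return {
        2: n * (n + 1) * (2 * n + 1) // 6,
        3: n * n * (n + 1) ** 2 // 4,
        4: n * (n + 1) * (6 * n**3 + 9 * n**2 + n - 1) // 30,
    }

for n in (1, 5, 10):
    print(n, [power_sum(n, e) for e in (2, 3, 4)], closed_forms(n))
```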
. a) 7626 b) 627,874 7. n = 10 9. a) 506 b) 12,144
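11. a) $\sum_{i=1}^{n} t_{2i} = \sum_{i=1}^{n} \frac{(2i)(2i+1)}{2} = \sum_{i=1}^{n} (2i^2 + i) = 2\sum_{i=1}^{n} i^2 + \sum_{i=1}^{n} i =
2[n(n + 1)(2n + 1)/6] + [n(n + 1)/2] = [n(n + 1)(2n + 1)/3] + [n(n + 1)/2] =
n(n + 1)\left[\frac{2n+1}{3} + \frac{1}{2}\right] = n(n + 1)\left[\frac{4n+5}{6}\right] = n(n + 1)(4n + 5)/6$.
b) $\sum_{i=1}^{100} t_{2i} = 100(101)(405)/6 = 681,750$.
c) begin
     sum := 0
     for i := 1 to 100 do
       sum := sum + (2 * i) * (2 * i + 1) / 2
     print sum
   end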
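For readers who want to run the program segment of part (c), here is one possible Python rendering together with the closed form from part (a). The function names are introduced here for illustration only.

```python
def sum_even_indexed_triangular(n):
    """Add the triangular numbers t_2, t_4, ..., t_(2n), where t_m = m(m + 1)/2."""
    total = 0
    for i in range(1, n + 1):
        total += (2 * i) * (2 * i + 1) // 2   # (2i)(2i + 1) is always even
    return total

def closed_form(n):
    """The formula n(n + 1)(4n + 5)/6 derived in part (a)."""
    return n * (n + 1) * (4 * n + 5) // 6

print(sum_even_indexed_triangular(100))  # 681750
print(closed_form(100))                  # 681750
```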
13. a) There are 49 ($= 7^2$) $2 \times 2$ squares and 36 ($= 6^2$) $3 \times 3$ squares. In total there are
$1^2 + 2^2 + 3^2 + \cdots + 8^2 = (8)(8 + 1)(2 \cdot 8 + 1)/6 = (8)(9)(17)/6 = 204$ squares.
b) For each $1 \le k \le n$ the $n \times n$ chessboard contains $(n - k + 1)^2$ $k \times k$ squares. In total there
are $1^2 + 2^2 + 3^2 + \cdots + n^2 = n(n + 1)(2n + 1)/6$ squares.
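The count in part (b) translates directly into a loop over the side length k. This is a minimal sketch; `count_squares` is a hypothetical name.

```python
def count_squares(n):
    """Count all k x k squares on an n x n board, for k = 1, ..., n."""
    total = 0
    for k in range(1, n + 1):
        positions_per_axis = n - k + 1     # possible rows (or columns) for the top-left corner
        total += positions_per_axis ** 2
    return total

print(count_squares(8))  # 204
```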
15. For n = 5, $2^5 = 32 > 25 = 5^2$. Assume the result for n = k ($\ge 5$): $2^k > k^2$. For $k > 3$,
$k(k - 2) > 1$, or $k^2 > 2k + 1$. Then $2^k > k^2 \Rightarrow 2^k + 2^k > k^2 + k^2 \Rightarrow 2^{k+1} > k^2 + k^2 > k^2 + (2k + 1)
= (k + 1)^2$. Hence the result is true for $n \ge 5$ by the Principle of Mathematical Induction.
17. b) Starting with n = 1 we find that
$\sum_{j=1}^{1} jH_j = H_1 = 1 = [(2)(1)/2](3/2) - [(2)(1)/4] = [(2)(1)/2]H_2 - [(2)(1)/4]$.
Assuming the truth of the given (open) statement for n = k, we have
$\sum_{j=1}^{k} jH_j = [(k + 1)(k)/2]H_{k+1} - [(k + 1)(k)/4]$.
For n = k + 1 we now find that
$\sum_{j=1}^{k+1} jH_j = \sum_{j=1}^{k} jH_j + (k + 1)H_{k+1}$
$= [(k + 1)(k)/2]H_{k+1} - [(k + 1)(k)/4] + (k + 1)H_{k+1}$
$= (k + 1)[1 + (k/2)]H_{k+1} - [(k + 1)(k)/4]$
$= (k + 1)[1 + (k/2)][H_{k+2} - (1/(k + 2))] - [(k + 1)(k)/4]$
$= [(k + 2)(k + 1)/2]H_{k+2} - [(k + 1)(k + 2)]/[2(k + 2)] - [(k + 1)(k)/4]$
$= [(k + 2)(k + 1)/2]H_{k+2} - (1/4)[2(k + 1) + k(k + 1)]$
$= [(k + 2)(k + 1)/2]H_{k+2} - [(k + 2)(k + 1)/4]$.
Consequently, by the Principle of Mathematical Induction, it follows that the given (open)
statement is true for all $n \in \mathbf{Z}^+$.
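The identity for the weighted sum of harmonic numbers can be confirmed with exact fractions for small n. In this sketch the helper names are hypothetical and `harmonic(n)` denotes $H_n = 1 + 1/2 + \cdots + 1/n$.

```python
from fractions import Fraction

def harmonic(n):
    """Return the n-th harmonic number as an exact fraction."""
    return sum(Fraction(1, j) for j in range(1, n + 1))

def weighted_harmonic_sum(n):
    """Left side: 1*H_1 + 2*H_2 + ... + n*H_n."""
    return sum(j * harmonic(j) for j in range(1, n + 1))

def closed_form(n):
    """Right side: [(n + 1)n/2] H_(n+1) - (n + 1)n/4."""
    return Fraction((n + 1) * n, 2) * harmonic(n + 1) - Fraction((n + 1) * n, 4)

print(weighted_harmonic_sum(1), closed_form(1))  # 1 1
```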
19. Assume S(k). For S(k + 1), we find that $\sum_{i=1}^{k+1} i = ([k + (1/2)]^2/2) + (k + 1) =
(k^2 + k + (1/4) + 2k + 2)/2 = [(k + 1)^2 + (k + 1) + (1/4)]/2 = [(k + 1) + (1/2)]^2/2$. So
$S(k) \Rightarrow S(k + 1)$. However, we have no first value of k where S(k) is true: for all
$k \ge 1$, $\sum_{i=1}^{k} i = (k)(k + 1)/2$ and $(k)(k + 1)/2 = [k + (1/2)]^2/2 \Rightarrow 0 = 1/4$.
21. Let S(n) denote the following (open) statement: For $x, n \in \mathbf{Z}^+$, if the program reaches the top
of the while loop, after the two loop instructions are executed n (> 0) times, then the value of
the integer variable answer is x(n!).
First consider S(1), the statement for the case where n = 1. Here the program (if it reaches
the top of the while loop) will result in one execution of the while loop: x will be assigned the
value $x \cdot 1 = x(1!)$, and the value of n will be decreased to 0. With the value of n equal to 0, the
loop is not processed again and the value of the variable answer is x(1!). Hence S(1) is true.
Now assume the truth for n = k ($\ge 1$): For $x, k \in \mathbf{Z}^+$, if the program reaches the top of the
while loop, then upon exiting the loop, the value of the variable answer is x(k!). To establish
the truth of S(k + 1), if the program reaches the top of the while loop, then the following occur
during the first execution:
The value assigned to the variable x is x(k + 1).
The value of n is decreased to (k + 1) − 1 = k.
But then we can apply the induction hypothesis to the integers x(k + 1) and k, and upon exiting
the while loop for these values, the value of the variable answer is (x(k + 1))(k!) = x((k + 1)!).
Consequently, S(n) is true for all $n \ge 1$, and we have verified the correctness of this program
segment by using the Principle of Mathematical Induction.
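The program segment itself is not reproduced in this solution, so the following is only a hypothetical reconstruction inferred from the proof: a while loop whose two instructions multiply x by n and then decrease n by 1, followed by an assignment to answer. The name `run_segment` is likewise an assumption.

```python
from math import factorial

def run_segment(x, n):
    """Hypothetical reconstruction of the segment analyzed in the proof."""
    while n != 0:
        x = x * n      # first loop instruction: x is assigned x * n
        n = n - 1      # second loop instruction: n is decreased by 1
    answer = x
    return answer

print(run_segment(3, 5))  # 360, that is, 3 * 5!
```

Under this reading, S(n) says that `run_segment(x, n)` returns x(n!) for all positive integers x and n.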
23. b) 24 = 5 + 5 + 7 + 7   25 = 5 + 5 + 5 + 5 + 5   26 = 5 + 7 + 7 + 7
27 = 5 + 5 + 5 + 5 + 7   28 = 7 + 7 + 7 + 7
Hence the result is true for all $24 \le n \le 28$. Assume the result true for 24, 25, 26, 27, 28, ..., k,
and consider n = k + 1. Since $k + 1 \ge 29$, we may write k + 1 = [(k + 1) − 5] + 5 =
(k − 4) + 5, where k − 4 can be expressed as a sum of 5's and 7's. Hence k + 1 can be
expressed as such a sum and the result follows for all $n \ge 24$ by the alternative form of the
Principle of Mathematical Induction.
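A short search makes the result concrete: it finds, for a given n, how many 5's and 7's to use, or reports that no such sum exists. The function name `fives_and_sevens` is hypothetical.

```python
def fives_and_sevens(n):
    """Return (number of 5's, number of 7's) summing to n, or None if impossible."""
    for sevens in range(n // 7 + 1):
        remainder = n - 7 * sevens
        if remainder % 5 == 0:
            return remainder // 5, sevens
    return None

print(fives_and_sevens(24))  # (2, 2)
print(fives_and_sevens(23))  # None
```

The value 23 shows why the induction must start at 24: it is the largest integer that cannot be written as a sum of 5's and 7's.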
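25. $E(X) = \sum_{x=1}^{n} x\,Pr(X = x) = \sum_{x=1}^{n} x\left(\frac{1}{n}\right) = \left(\frac{1}{n}\right)\sum_{x=1}^{n} x = \left(\frac{1}{n}\right)\left[\frac{n(n+1)}{2}\right] = \frac{n+1}{2}$
$E(X^2) = \sum_{x=1}^{n} x^2\,Pr(X = x) = \sum_{x=1}^{n} x^2\left(\frac{1}{n}\right) = \left(\frac{1}{n}\right)\sum_{x=1}^{n} x^2 = \left(\frac{1}{n}\right)\left[\frac{n(n+1)(2n+1)}{6}\right] = \frac{(n+1)(2n+1)}{6}$
$Var(X) = E(X^2) - [E(X)]^2 = \frac{(n+1)(2n+1)}{6} - \frac{(n+1)^2}{4} = (n+1)\left[\frac{2n+1}{6} - \frac{n+1}{4}\right]$
$= \frac{(n+1)[4n + 2 - (3n + 3)]}{12} = \frac{(n+1)(n-1)}{12} = \frac{n^2 - 1}{12}$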
27. Let $T = \{n \in \mathbf{Z}^+ \mid n \ge n_0$ and S(n) is false$\}$. Since $S(n_0), S(n_0 + 1), S(n_0 + 2), \ldots, S(n_1)$ are
true, we know that $n_0, n_0 + 1, n_0 + 2, \ldots, n_1 \notin T$. If $T \ne \emptyset$, then T has a least element r,
because $T \subseteq \mathbf{Z}^+$. However, since $S(n_0), S(n_0 + 1), \ldots, S(r - 1)$ are true, it follows that S(r)
is true. Hence $T = \emptyset$ and the result follows.
Section 4.2—p. 219
1. a) $c_1 = 7$; $c_{n+1} = c_n + 7$, for $n \ge 1$. b) $c_1 = 7$; $c_{n+1} = 7c_n$, for $n \ge 1$.
c) $c_1 = 10$; $c_{n+1} = c_n + 3$, for $n \ge 1$. d) $c_1 = 7$; $c_{n+1} = c_n$, for $n \ge 1$.
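3. Let T(n) denote the following statement: For $n \in \mathbf{Z}^+$, $n \ge 2$, and the statements
$p, q_1, q_2, \ldots, q_n$,
$p \lor (q_1 \land q_2 \land \cdots \land q_n) \Leftrightarrow (p \lor q_1) \land (p \lor q_2) \land \cdots \land (p \lor q_n)$.
The statement T(2) is true by virtue of the Distributive Law of $\lor$ over $\land$. Assuming T(k), for
some $k \ge 2$, we now examine the situation for the statements $p, q_1, q_2, \ldots, q_k, q_{k+1}$. We find
that
$p \lor (q_1 \land q_2 \land \cdots \land q_k \land q_{k+1})$
$\Leftrightarrow p \lor [(q_1 \land q_2 \land \cdots \land q_k) \land q_{k+1}]$
$\Leftrightarrow [p \lor (q_1 \land q_2 \land \cdots \land q_k)] \land (p \lor q_{k+1})$
$\Leftrightarrow [(p \lor q_1) \land (p \lor q_2) \land \cdots \land (p \lor q_k)] \land (p \lor q_{k+1})$
$\Leftrightarrow (p \lor q_1) \land (p \lor q_2) \land \cdots \land (p \lor q_k) \land (p \lor q_{k+1})$.
It then follows by the Principle of Mathematical Induction that the statement T(n) is true for all
$n \ge 2$.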
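5. a) (i) The intersection of $A_1, A_2$ is $A_1 \cap A_2$.
(ii) The intersection of $A_1, A_2, \ldots, A_n, A_{n+1}$ is given by $A_1 \cap A_2 \cap \cdots \cap A_n \cap A_{n+1} =
(A_1 \cap A_2 \cap \cdots \cap A_n) \cap A_{n+1}$, the intersection of the two sets $A_1 \cap A_2 \cap \cdots \cap A_n$ and $A_{n+1}$.
b) Let S(n) denote the given (open) statement. Then the truth of S(3) follows from the
Associative Law of $\cap$. Assuming S(k) true for some $k \ge 3$, consider the case for k + 1 sets.
(1) If r = k, then
$(A_1 \cap A_2 \cap \cdots \cap A_k) \cap A_{k+1} = A_1 \cap A_2 \cap \cdots \cap A_k \cap A_{k+1}$,
from the recursive definition given in part (a).
(2) For $1 \le r < k$, we have
$(A_1 \cap A_2 \cap \cdots \cap A_r) \cap (A_{r+1} \cap \cdots \cap A_k \cap A_{k+1})$
$= (A_1 \cap A_2 \cap \cdots \cap A_r) \cap [(A_{r+1} \cap \cdots \cap A_k) \cap A_{k+1}]$
$= [(A_1 \cap A_2 \cap \cdots \cap A_r) \cap (A_{r+1} \cap \cdots \cap A_k)] \cap A_{k+1}$
$= (A_1 \cap A_2 \cap \cdots \cap A_r \cap A_{r+1} \cap \cdots \cap A_k) \cap A_{k+1}$
$= A_1 \cap A_2 \cap \cdots \cap A_r \cap A_{r+1} \cap \cdots \cap A_k \cap A_{k+1}$,
and by the Principle of Mathematical Induction, S(n) is true for all $n \ge 3$ and all $1 \le r < n$.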
7. For n = 2, the truth of the result $A \cap (B_1 \cup B_2) = (A \cap B_1) \cup (A \cap B_2)$ follows by virtue of the
Distributive Law of $\cap$ over $\cup$. Assuming the result for n = k, let us examine the case for the sets
$A, B_1, B_2, \ldots, B_k, B_{k+1}$. We have $A \cap (B_1 \cup B_2 \cup \cdots \cup B_k \cup B_{k+1}) = A \cap [(B_1 \cup B_2 \cup \cdots
\cup B_k) \cup B_{k+1}] = [A \cap (B_1 \cup B_2 \cup \cdots \cup B_k)] \cup (A \cap B_{k+1}) = [(A \cap B_1) \cup (A \cap B_2) \cup \cdots \cup
(A \cap B_k)] \cup (A \cap B_{k+1}) = (A \cap B_1) \cup (A \cap B_2) \cup \cdots \cup (A \cap B_k) \cup (A \cap B_{k+1})$. So the result
is true for all $n \ge 2$, by the Principle of Mathematical Induction.
. a) (i) For n = 2, the expression $x_1 x_2$ denotes the ordinary product of the real numbers $x_1$ and $x_2$.
(ii) Let $n \in \mathbf{Z}^+$ with $n \ge 2$. For the real numbers $x_1, x_2, \ldots, x_n, x_{n+1}$, we define
$x_1 x_2 \cdots x_n x_{n+1} = (x_1 x_2 \cdots x_n) x_{n+1}$,
the product of the two real numbers $x_1 x_2 \cdots x_n$ and $x_{n+1}$.
b) The result holds for n = 3 by the Associative Law of Multiplication (for real numbers). So
$x_1 (x_2 x_3) = (x_1 x_2) x_3$, and there is no ambiguity in writing $x_1 x_2 x_3$. Assuming the result true for
some $k \ge 3$ and all $1 \le r < k$, let us examine the case for k + 1 ($\ge 4$) real numbers. We find that
(1) if r = k, then $(x_1 x_2 \cdots x_k) x_{k+1} = x_1 x_2 \cdots x_k x_{k+1}$ by the recursive definition given in part
(a); and (2) if $1 \le r < k$, then $(x_1 x_2 \cdots x_r)(x_{r+1} \cdots x_k x_{k+1}) = (x_1 x_2 \cdots x_r)((x_{r+1} \cdots x_k) x_{k+1})
= ((x_1 x_2 \cdots x_r)(x_{r+1} \cdots x_k)) x_{k+1} = (x_1 x_2 \cdots x_k) x_{k+1} = x_1 x_2 \cdots x_k x_{k+1}$,
so the result is true for all $n \ge 3$ and all $1 \le r < n$, by the Principle of Mathematical Induction.
11. Proof (By the Alternative Form of the Principle of Mathematical Induction): For n = 0, 1, 2 we
have
(n = 0) $a_{0+2} = a_2 = 1 \ge (\sqrt{2})^0$;
(n = 1) $a_{1+2} = a_3 = a_2 + a_0 = 2 \ge \sqrt{2} = (\sqrt{2})^1$; and
(n = 2) $a_{2+2} = a_4 = a_3 + a_1 = 2 + 1 = 3 \ge 2 = (\sqrt{2})^2$.
Therefore, the result is true for these first three cases, and this gives us the basis step for the
proof.
Next, for some $k \ge 2$, we assume the result true for all n = 0, 1, 2, ..., k. When n = k + 1
we find that
$a_{(k+1)+2} = a_{k+3} = a_{k+2} + a_k \ge (\sqrt{2})^k + (\sqrt{2})^{k-2} = [(\sqrt{2})^2 + 1](\sqrt{2})^{k-2}$
$= 3(\sqrt{2})^{k-2} = (3/2)(2)(\sqrt{2})^{k-2} = (3/2)(\sqrt{2})^k \ge (\sqrt{2})^{k+1}$,
because $(3/2) = 1.5 > \sqrt{2}$ ($\approx 1.414$). This provides the inductive step for the proof.
From the basis and inductive steps it now follows by the alternative form of the Principle of
Mathematical Induction that $a_{n+2} \ge (\sqrt{2})^n$ for all $n \in \mathbf{N}$.
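The bound can be checked numerically once the sequence is pinned down. The definition used below, $a_0 = a_1 = a_2 = 1$ and $a_n = a_{n-1} + a_{n-3}$ for $n \ge 3$, is an assumption inferred from the basis cases and the inductive step of the proof, since the exercise statement is not reproduced here. Squaring both sides avoids floating-point square roots.

```python
def sequence_a(count):
    """First `count` terms of the assumed sequence a_0 = a_1 = a_2 = 1, a_n = a_(n-1) + a_(n-3)."""
    terms = [1, 1, 1]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-3])
    return terms[:count]

def bound_holds(n, terms):
    """a_(n+2) >= (sqrt 2)^n is equivalent to a_(n+2)^2 >= 2^n for positive terms."""
    return terms[n + 2] ** 2 >= 2 ** n

terms = sequence_a(103)
print(terms[:7])                                        # [1, 1, 1, 2, 3, 4, 6]
print(all(bound_holds(n, terms) for n in range(101)))   # True
```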
13. Proof (By Mathematical Induction):
Basis Step: When n = 1 we find that
$\sum_{i=1}^{1} \frac{F_{i-1}}{2^i} = F_0/2 = 0 = 1 - (2/2) = 1 - \frac{F_3}{2} = 1 - \frac{F_{1+2}}{2^1}$,
so the result holds in the first case.
Inductive Step: Assuming the given (open) statement true for n = k, we have
$\sum_{i=1}^{k} \frac{F_{i-1}}{2^i} = 1 - \frac{F_{k+2}}{2^k}$. When n = k + 1, we find that
$\sum_{i=1}^{k+1} \frac{F_{i-1}}{2^i} = \sum_{i=1}^{k} \frac{F_{i-1}}{2^i} + \frac{F_k}{2^{k+1}} = 1 - \frac{F_{k+2}}{2^k} + \frac{F_k}{2^{k+1}}$
$= 1 + (1/2^{k+1})[F_k - 2F_{k+2}] = 1 + (1/2^{k+1})[(F_k - F_{k+2}) - F_{k+2}]$
$= 1 + (1/2^{k+1})[-F_{k+1} - F_{k+2}] = 1 - (1/2^{k+1})(F_{k+1} + F_{k+2}) = 1 - (F_{k+3}/2^{k+1})$.
From the basis and inductive steps it follows from the Principle of Mathematical Induction that
$\forall n \in \mathbf{Z}^+ \ \sum_{i=1}^{n} (F_{i-1}/2^i) = 1 - (F_{n+2}/2^n)$.
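The Fibonacci sum can be verified with exact fractions. This sketch assumes the usual indexing $F_0 = 0$, $F_1 = 1$, which is consistent with the basis step above; the helper names are hypothetical.

```python
from fractions import Fraction

def fibonacci(count):
    """First `count` Fibonacci numbers, starting from F_0 = 0, F_1 = 1."""
    terms = [0, 1]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2])
    return terms[:count]

def left_side(n, F):
    """Sum of F_(i-1) / 2^i for i = 1, ..., n."""
    return sum(Fraction(F[i - 1], 2**i) for i in range(1, n + 1))

def right_side(n, F):
    """1 - F_(n+2) / 2^n."""
    return 1 - Fraction(F[n + 2], 2**n)

F = fibonacci(45)
print(left_side(2, F), right_side(2, F))  # 1/4 1/4
```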
15. Proof (By the Alternative Form of the Principle of Mathematical Induction): The result holds
for n = 0 and n = 1 because
(n = 0) $5F_{0+2} = 5F_2 = 5(1) = 5 = 7 - 2 = L_4 - L_0 = L_{0+4} - L_0$; and
(n = 1) $5F_{1+2} = 5F_3 = 5(2) = 10 = 11 - 1 = L_5 - L_1 = L_{1+4} - L_1$.
This establishes the basis step for the proof.
Next we assume the induction hypothesis, that is, for some k ($\ge 1$), $5F_{n+2} = L_{n+4} - L_n$
for all n = 0, 1, 2, ..., k − 1, k. It then follows that for n = k + 1,
$5F_{(k+1)+2} = 5F_{k+3} = 5(F_{k+2} + F_{k+1}) = 5(F_{k+2} + F_{(k-1)+2}) = 5F_{k+2} + 5F_{(k-1)+2}$
$= (L_{k+4} - L_k) + (L_{(k-1)+4} - L_{k-1}) = (L_{k+4} - L_k) + (L_{k+3} - L_{k-1})$
$= (L_{k+4} + L_{k+3}) - (L_k + L_{k-1}) = L_{k+5} - L_{k+1} = L_{(k+1)+4} - L_{k+1}$,
where we have used the recursive definitions of the Fibonacci numbers and the Lucas numbers
to establish the second and eighth equalities.
It then follows by the alternative form of the Principle of Mathematical Induction that
$\forall n \in \mathbf{N} \ 5F_{n+2} = L_{n+4} - L_n$.
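A quick numerical check of the Fibonacci and Lucas identity follows. It assumes the standard initial values $F_0 = 0$, $F_1 = 1$, $L_0 = 2$, $L_1 = 1$, which agree with the basis step above; the function name is hypothetical.

```python
def fibonacci_and_lucas(count):
    """Return the first `count` Fibonacci numbers and Lucas numbers."""
    F, L = [0, 1], [2, 1]
    while len(F) < count:
        F.append(F[-1] + F[-2])
        L.append(L[-1] + L[-2])
    return F[:count], L[:count]

F, L = fibonacci_and_lucas(50)
print(all(5 * F[n + 2] == L[n + 4] - L[n] for n in range(41)))  # True
```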
17. a) Steps Reasons
1) $p, q, r, T_0$   Part (1) of the definition
2) $(p \lor q)$   Step (1) and part (2-ii) of the definition
3) $(\neg r)$   Step (1) and part (2-i) of the definition
4) $(T_0 \land (\neg r))$   Steps (1) and (3) and part (2-iii) of the definition
5) $((p \lor q) \to (T_0 \land (\neg r)))$   Steps (2) and (4) and part (2-iv) of the definition
19. a) $\binom{k}{2} + \binom{k+1}{2} = [k(k-1)/2] + [(k+1)k/2] = (k^2 - k + k^2 + k)/2 = k^2$.
c) $\binom{k}{3} + 4\binom{k+1}{3} + \binom{k+2}{3} = [k(k-1)(k-2)/6] + 4[(k+1)(k)(k-1)/6] + [(k+2)(k+1)(k)/6]$
$= (k/6)[(k-1)(k-2) + 4(k+1)(k-1) + (k+2)(k+1)] = (k/6)[6k^2] = k^3$.
e) $k^4 = \binom{k}{4} + 11\binom{k+1}{4} + 11\binom{k+2}{4} + \binom{k+3}{4}$
In general, $k^t = \sum_{r=0}^{t-1} a_{t,r}\binom{k+r}{t}$, where the $a_{t,r}$'s are the Eulerian numbers of Example 4.21.
(The given summation formula is known as Worpitzky's identity.)
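Worpitzky's identity can be tested for small exponents. The sketch below assumes the Eulerian number $a_{t,r}$ counts permutations of $t$ items with $r$ ascents and uses the standard recurrence for those numbers; whether this indexing matches Example 4.21 exactly is an assumption, and the function names are illustrative.

```python
from math import comb

def eulerian_row(t):
    """Eulerian numbers A(t, 0..t-1): permutations of t items with r ascents."""
    row = [1]  # the row for t = 1
    for n in range(2, t + 1):
        prev = row + [0]
        # A(n, r) = (r + 1) * A(n-1, r) + (n - r) * A(n-1, r-1)
        row = [(r + 1) * prev[r] + (n - r) * (prev[r - 1] if r > 0 else 0)
               for r in range(n)]
    return row

def worpitzky_rhs(k, t):
    """Sum of A(t, r) * C(k + r, t) over r = 0, ..., t - 1."""
    return sum(a * comb(k + r, t) for r, a in enumerate(eulerian_row(t)))
```

For $t = 4$ the row is 1, 11, 11, 1, which reproduces the coefficients in part (e).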
Section 4.3—p. 230
1. e) If $a \mid x$ and $a \mid y$, then $x = ac$ and $y = ad$ for some $c, d \in \mathbf{Z}$. So $z = x - y = a(c - d)$, and
$a \mid z$. The proofs for the other cases are similar.
g) Follows from part (f) by the Principle of Mathematical Induction.
3. Since $q$ is prime, its only positive divisors are 1 and $q$. With $p$ a prime, it follows that $p > 1$.
Hence $p \mid q \Rightarrow p = q$.
5. Proof (By the Contrapositive): Suppose that $a \mid b$ or $a \mid c$. If $a \mid b$, then $ak = b$ for some $k \in \mathbf{Z}$. But
$ak = b \Rightarrow (ak)c = a(kc) = bc \Rightarrow a \mid bc$. A similar result is obtained if $a \mid c$.
7. a) Let $a = 1$, $b = 5$, $c = 2$. Another example is $a = b = 5$, $c = 3$.
b) Proof: $31 \mid (5a + 7b + 11c) \Rightarrow 31 \mid (10a + 14b + 22c)$. Also, $31 \mid (31a + 31b + 31c)$, so
$31 \mid [(31a + 31b + 31c) - (10a + 14b + 22c)]$. Hence $31 \mid (21a + 17b + 9c)$.
9. $[b \mid a$ and $b \mid (a + 2)] \Rightarrow b \mid [ax + (a + 2)y]$ for all $x, y \in \mathbf{Z}$. Let $x = -1$, $y = 1$. Then $b > 0$ and
$b \mid 2$, so $b = 1$ or 2.
11. Let $a = 2m + 1$ and $b = 2n + 1$, for some $m, n \in \mathbf{N}$. Then $a^2 + b^2 = 4(m^2 + m + n^2 + n) + 2$,
so $2 \mid (a^2 + b^2)$ but $4 \nmid (a^2 + b^2)$.
13. For $n = 0$ we have $7^n - 4^n = 7^0 - 4^0 = 1 - 1 = 0$, and $3 \mid 0$. So the result is true for this first
case. Assuming the truth for $n = k$ ($\ge 0$), we have $3 \mid (7^k - 4^k)$. Turning to the case for
$n = k + 1$, we find that $7^{k+1} - 4^{k+1} = 7(7^k) - 4(4^k) = (3 + 4)(7^k) - 4(4^k) = 3(7^k) + 4(7^k - 4^k)$.
Since $3 \mid 3$ and $3 \mid (7^k - 4^k)$ (by the induction hypothesis), it follows from part (f) of
Theorem 4.3 that $3 \mid [3(7^k) + 4(7^k - 4^k)]$, that is, $3 \mid (7^{k+1} - 4^{k+1})$. It now follows by the
Principle of Mathematical Induction that $3 \mid (7^n - 4^n)$ for all $n \in \mathbf{N}$.
15.     Base 10   Base 2           Base 16
    a)  22        10110            16
    b)  527       1000001111       20F
    c)  1234      10011010010      4D2
    d)  6923      1101100001011    1B0B
17.     Base 2     Base 10   Base 16
    a)  11001110   206       CE
    b)  00110001   49        31
    c)  11110000   240       F0
    d)  01010111   87        57
19. n = 1, 2, 3, 6, 9, 18
21.     Largest Integer      Smallest Integer
    a)  $7 = 2^3 - 1$        $-8 = -(2^3)$
    b)  $127 = 2^7 - 1$      $-128 = -(2^7)$
    c)  $2^{15} - 1$         $-(2^{15})$
    d)  $2^{31} - 1$         $-(2^{31})$
    e)  $2^{n-1} - 1$        $-(2^{n-1})$
23. $ax = ay \Rightarrow ax - ay = 0 \Rightarrow a(x - y) = 0$. In the system of integers, if $b, c \in \mathbf{Z}$ and $bc = 0$,
then $b = 0$ or $c = 0$. Since $a(x - y) = 0$ and $a \ne 0$, it follows that $(x - y) = 0$ and $x = y$.
29. a) Since $2 \mid 10^t$ for all $t \in \mathbf{Z}^+$, $2 \mid n$ if and only if $2 \mid r_0$.
b) Follows from the fact that $4 \mid 10^t$ for $t \ge 2$.
c) Follows from the fact that $8 \mid 10^t$ for $t \ge 3$. In general,
$2^{t+1} \mid n$ if and only if $2^{t+1} \mid (r_t \cdot 10^t + \cdots + r_1 \cdot 10 + r_0)$.
Section 4.4-p. 236
1. a) gcd(1820, 231) = 7 = 1820(8) + 231(−63)
b) gcd(2597, 1369) = 1 = 2597(534) + 1369(−1013)
c) gcd(4001, 2689) = 1 = 4001(−1117) + 2689(1662)
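The linear combinations in Exercise 1 can be produced mechanically with the extended Euclidean algorithm. The following is a minimal sketch (the function name `extended_gcd` is chosen for this illustration); the coefficients it returns satisfy $g = ax + by$ but are not unique, so they need not match the printed ones in every case.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and g == a*x + b*y."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        # Each remainder stays an integer combination of a and b.
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y
```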
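3. $\gcd(a, b) = d \Rightarrow d = ax + by$, for some $x, y \in \mathbf{Z}$
$\gcd(a, b) = d \Rightarrow a/d, b/d \in \mathbf{Z}$
$1 = (a/d)x + (b/d)y \Rightarrow \gcd(a/d, b/d) = 1$.
5. Proof: Since $c = \gcd(a, b)$ we have $a = cx$, $b = cy$ for some $x, y \in \mathbf{Z}^+$. So $ab = (cx)(cy) = c^2(xy)$, and $c^2$ divides $ab$.
7. Let $\gcd(a, b) = h$ and $\gcd(b, d) = g$.
$\gcd(a, b) = h \Rightarrow [h \mid a$ and $h \mid b] \Rightarrow h \mid (a \cdot 1 + bc) \Rightarrow h \mid d$.
$[h \mid b$ and $h \mid d] \Rightarrow h \mid g$.
$\gcd(b, d) = g \Rightarrow [g \mid b$ and $g \mid d] \Rightarrow g \mid (d \cdot 1 + b(-c)) \Rightarrow g \mid a$.
$[g \mid b$, $g \mid a$, and $h = \gcd(a, b)] \Rightarrow g \mid h$. $h \mid g$, $g \mid h$, with $g, h \in \mathbf{Z}^+ \Rightarrow g = h$.
9. a) If $c \in \mathbf{Z}^+$, then $c = \gcd(a, b)$ if (and only if)
(1) $c \mid a$ and $c \mid b$; and
(2) $\forall d \in \mathbf{Z} \; [(d \mid a) \land (d \mid b)] \Rightarrow d \mid c$.
b) If $c \in \mathbf{Z}^+$, then $c \ne \gcd(a, b)$ if (and only if)
(1) $c \nmid a$ or $c \nmid b$; or
(2) $\exists d \in \mathbf{Z} \; [(d \mid a) \land (d \mid b) \land (d \nmid c)]$.
11. $\gcd(a, b) = 1 \Rightarrow ax + by = 1$, for some $x, y \in \mathbf{Z}$. Then $acx + bcy = c$. $a \mid acx$, $a \mid bcy$ (because $a \mid bc$) $\Rightarrow a \mid c$.
13. We find that for any $n \in \mathbf{Z}^+$, $(5n + 3)(7) + (7n + 4)(-5) = (35n + 21) - (35n + 20) = 1$.
Consequently, it follows that $\gcd(5n + 3, 7n + 4) = 1$, or $5n + 3$ and $7n + 4$ are relatively prime.
15. One $20 and 20 $50 chips; six $20 and 18 $50 chips; eleven $20 and 16 $50 chips.
17. There is no solution for $c \ne 12, 18$. For $c = 12$, the solutions are $x = 118 - 165k$, $y = -10 + 14k$, $k \in \mathbf{Z}$. For $c = 18$, the solutions are $x = 177 - 165k$, $y = -15 + 14k$, $k \in \mathbf{Z}$.
19. b = 40,425
21. $\gcd(n, n + 1) = 1$; $\mathrm{lcm}(n, n + 1) = n(n + 1)$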
Section 4.5—p. 240
1. a) $2^2 \cdot 3^3 \cdot 5^3 \cdot 11$   b) $2^4 \cdot 3 \cdot 5^2 \cdot 7^2 \cdot 11^2$   c) $3^2 \cdot 5^3 \cdot 7^2 \cdot 11 \cdot 13$
3. a) $m^2 = p_1^{2e_1} p_2^{2e_2} p_3^{2e_3} \cdots p_t^{2e_t}$   b) $m^3 = p_1^{3e_1} p_2^{3e_2} p_3^{3e_3} \cdots p_t^{3e_t}$
5. (The proof is similar to that given in Example 4.41.) If not, we have $\sqrt{p} = a/b$, where
$a, b \in \mathbf{Z}^+$ and $\gcd(a, b) = 1$. Then $\sqrt{p} = a/b \Rightarrow p = a^2/b^2 \Rightarrow pb^2 = a^2 \Rightarrow p \mid a^2 \Rightarrow p \mid a$ (by
Lemma 4.2). Since $p \mid a$ we know that $a = pk$ for some $k \in \mathbf{Z}^+$, and $pb^2 = a^2 = (pk)^2 = p^2k^2$,
or $b^2 = pk^2$. Hence $p \mid b^2$ and so $p \mid b$. But if $p \mid a$ and $p \mid b$, then $\gcd(a, b) \ge p > 1$,
contradicting our earlier claim that $\gcd(a, b) = 1$.
7. a) 96 b) 270 c) 144
9. 660
11. There are 252 possible values for n.
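13. a) Proof: (i) Since $10 \mid a^2$ we have $5 \mid a^2$ and $2 \mid a^2$. Then by Lemma 4.2 it follows that $5 \mid a$ and
$2 \mid a$. So $a = 5b$ for some $b \in \mathbf{Z}^+$. Further, since $2 \mid 5b$ we have $2 \mid 5$ or $2 \mid b$ (by Lemma 4.2).
Since $2 \nmid 5$, we have $2 \mid b$, so $b = 2c$ for some $c \in \mathbf{Z}^+$. Consequently, $a = 5b = 5(2c) = 10c$, and 10 divides $a$.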
(ii) This result is false —let a = 2,
b) We can generalize section (i) of part (a) by replacing 10 by an integer $n$ of the form
$p_1 p_2 \cdots p_t$, a product of $t$ distinct primes. (So $n$ is a square-free integer, that is, no square
greater than 1 divides $n$.)
15. 176,400
17. $n = 2 \cdot 3 \cdot 5^2 \cdot 7^2 = 7350$
19. a) 5 b) 7 c) 32 d) 7 + 7 + 5 + 25 + 20 + 20 = 84 e) 84
21. 1061 (= 512 + 256 + 293)
23. a) From the Fundamental Theorem of Arithmetic $88{,}200 = 2^3 \cdot 3^2 \cdot 5^2 \cdot 7^2$. Consider the set
$F = \{2^3, 3^2, 5^2, 7^2\}$. Each subset of $F$ determines a factorization $ab$ where $\gcd(a, b) = 1$.
There are $2^4$ subsets, hence $2^4$ factorizations. Since order is not relevant, this number (of
factorizations) reduces to $(1/2)2^4 = 2^3$. And since $1 < a < n$, $1 < b < n$, we remove the case
for the empty subset of $F$ (or the subset $F$ itself). This yields $2^3 - 1$ such factorizations.
b) Here $n = 2^3 \cdot 3^2 \cdot 5^2 \cdot 7^2 \cdot 11$ and there are $2^4 - 1$ such factorizations.
c) Suppose that $n = p_1^{n_1} p_2^{n_2} \cdots p_k^{n_k}$, where $p_1, p_2, \ldots, p_k$ are $k$ distinct primes and
$n_1, n_2, \ldots, n_k \ge 1$. The number of unordered factorizations of $n$ as $ab$, where
$1 < a < n$, $1 < b < n$, and $\gcd(a, b) = 1$, is $2^{k-1} - 1$.
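The counts in Exercise 23 can be confirmed by brute force. The sketch below simply enumerates divisors up to $\sqrt{n}$; the function name is illustrative, and the approach is only practical for small $n$.

```python
from math import gcd

def coprime_factorizations(n):
    """Count unordered pairs {a, b} with ab = n, gcd(a, b) = 1, 1 < a, b < n."""
    count = 0
    a = 2
    while a * a <= n:
        # Taking a <= sqrt(n) counts each unordered pair exactly once.
        if n % a == 0 and gcd(a, n // a) == 1:
            count += 1
        a += 1
    return count
```

For 88,200 (four distinct primes) this gives $2^3 - 1 = 7$, and for $88{,}200 \cdot 11 = 970{,}200$ (five distinct primes) it gives $2^4 - 1 = 15$.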
25. Proof (By Mathematical Induction): For $n = 2$ we find that
$\prod_{i=2}^{2} (1 - 1/i^2) = (1 - 1/2^2) = (1 - 1/4) = 3/4 = (2 + 1)/(2 \cdot 2)$, so the result is true in this first
case, and this establishes the basis step for our inductive proof. Next we assume the result true
for some $k \in \mathbf{Z}^+$ where $k \ge 2$. This gives us $\prod_{i=2}^{k} (1 - 1/i^2) = (k + 1)/(2k)$. When we consider
the case for $n = k + 1$, we obtain the inductive step, for we find that
$\prod_{i=2}^{k+1} (1 - 1/i^2) = \left[\prod_{i=2}^{k} (1 - 1/i^2)\right]\left(1 - 1/(k+1)^2\right) = [(k + 1)/(2k)]\left[\dfrac{(k+1)^2 - 1}{(k+1)^2}\right]$
$= \left[\dfrac{1}{2k}\right]\left[\dfrac{k^2 + 2k}{k + 1}\right] = (k + 2)/(2(k + 1)) = ((k + 1) + 1)/(2(k + 1))$.
The result now follows for all positive integers $n \ge 2$ by the Principle of Mathematical
Induction.
27. a) The positive divisors of 28 are 1, 2, 4, 7, 14, and 28, and 1 + 2 + 4 + 7 + 14 + 28 = 56 =
2(28), so 28 is a perfect integer. The positive divisors of 496 are 1, 2, 4, 8, 16, 31, 62, 124, 248,
and 496, and 1 + 2 + 4 + 8 + 16 + 31 + 62 + 124 + 248 + 496 = 992 = 2(496), so 496 is a
perfect integer.
b) It follows from the Fundamental Theorem of Arithmetic that the divisors of $2^{m-1}(2^m - 1)$,
for $2^m - 1$ prime, are $1, 2, 2^2, 2^3, \ldots, 2^{m-1}$, and $(2^m - 1), 2(2^m - 1), 2^2(2^m - 1),
2^3(2^m - 1), \ldots$, and $2^{m-1}(2^m - 1)$. These divisors sum to $[1 + 2 + 2^2 + 2^3 + \cdots + 2^{m-1}] +
(2^m - 1)[1 + 2 + 2^2 + 2^3 + \cdots + 2^{m-1}] = (2^m - 1) + (2^m - 1)(2^m - 1) =
(2^m - 1)[1 + (2^m - 1)] = 2^m(2^m - 1) = 2[2^{m-1}(2^m - 1)]$, so $2^{m-1}(2^m - 1)$ is a perfect integer.
Supplementary Exercises-p. 245
1. $a + (a + d) + (a + 2d) + \cdots + (a + (n - 1)d) = na + [(n - 1)nd]/2$. For $n = 1$, $a = a + 0$,
and the result is true in this case. Assuming that
$\sum_{i=1}^{k} [a + (i - 1)d] = ka + [(k - 1)kd]/2$,
we have
$\sum_{i=1}^{k+1} [a + (i - 1)d] = (ka + [(k - 1)kd]/2) + (a + kd) = (k + 1)a + [k(k + 1)d]/2$,
so the result follows for all $n \in \mathbf{Z}^+$ by the Principle of Mathematical Induction.
3. Conjecture: $\sum_{i=1}^{n} (-1)^{i+1} i^2 = (-1)^{n+1} \sum_{i=1}^{n} i$, for all $n \in \mathbf{Z}^+$.
Proof (By the Principle of Mathematical Induction): If $n = 1$ the conjecture provides
$\sum_{i=1}^{1} (-1)^{i+1} i^2 = (-1)^{1+1}(1)^2 = 1 = (-1)^{1+1}(1) = (-1)^{1+1} \sum_{i=1}^{1} i$, which is a true
statement. And this establishes the basis step of the proof. To confirm the inductive step, we
shall assume the truth of the result
$\sum_{i=1}^{k} (-1)^{i+1} i^2 = (-1)^{k+1} \sum_{i=1}^{k} i$
for some $k \ge 1$. When $n = k + 1$ we find that
$\sum_{i=1}^{k+1} (-1)^{i+1} i^2 = \left(\sum_{i=1}^{k} (-1)^{i+1} i^2\right) + (-1)^{(k+1)+1}(k + 1)^2 = (-1)^{k+1} \sum_{i=1}^{k} i + (-1)^{k+2}(k + 1)^2$
$= (-1)^{k+1}(k)(k + 1)/2 + (-1)^{k+2}(k + 1)^2 = (-1)^{k+2}[(k + 1)^2 - k(k + 1)/2]$
$= (-1)^{k+2}(1/2)[2(k + 1)^2 - k(k + 1)] = (-1)^{k+2}(1/2)[2k^2 + 4k + 2 - k^2 - k]$
$= (-1)^{k+2}(1/2)[k^2 + 3k + 2] = (-1)^{k+2}(1/2)(k + 1)(k + 2)$
$= (-1)^{k+2} \sum_{i=1}^{k+1} i$,
so the truth of the result at $n = k$ implies the truth at $n = k + 1$, and we have the inductive
step. It then follows by the Principle of Mathematical Induction that
$\sum_{i=1}^{n} (-1)^{i+1} i^2 = (-1)^{n+1} \sum_{i=1}^{n} i$,
for all $n \in \mathbf{Z}^+$.
5. a)  n   n^2+n+41     n   n^2+n+41     n   n^2+n+41
       1   43           4   61           7   97
       2   47           5   71           8   113
       3   53           6   83           9   131
b) For $n = 39$, $n^2 + n + 41 = 1601$, a prime. But for $n = 40$, $n^2 + n + 41 = (41)^2$, so
$S(39) \not\Rightarrow S(40)$.
7. a) For $n = 0$, $2^{2n+1} + 1 = 2 + 1 = 3$, so the result is true in this first case. Assuming that 3
divides $2^{2n+1} + 1$ for $n = k$ ($\ge 0$) $\in \mathbf{N}$, consider the case of $n = k + 1$. Since $2^{2(k+1)+1} + 1 =
2^{2k+3} + 1 = 4(2^{2k+1}) + 1 = 4(2^{2k+1} + 1) - 3$, and 3 divides both $2^{2k+1} + 1$ and 3, it follows
that 3 divides $2^{2(k+1)+1} + 1$. Consequently, the result is true for $n = k + 1$ whenever it is true for
$n = k$. So by the Principle of Mathematical Induction, the result follows for all $n \in \mathbf{N}$.
9. x = y = z = 0 and x = 2, y = 5, z = 5
11. For $n = 2$ we find that $2^2 = 4 < 6 = \binom{4}{2} < 16 = 4^2$, so the (open) statement is true in this first
case. Assuming the result true for $n = k \ge 2$, that is, $2^k < \binom{2k}{k} < 4^k$, we now consider what
happens for $n = k + 1$. Here we find that
$\binom{2(k+1)}{k+1} = \binom{2k+2}{k+1} = \left[\dfrac{(2k + 2)(2k + 1)}{(k + 1)(k + 1)}\right]\binom{2k}{k} = 2[(2k + 1)/(k + 1)]\binom{2k}{k}$
$> 2[(2k + 1)/(k + 1)]2^k > 2^{k+1}$,
since $(2k + 1)/(k + 1) = [(k + 1) + k]/(k + 1) > 1$. In addition, $[(k + 1) + k]/(k + 1) < 2$, so
$\binom{2k+2}{k+1} = 2[(2k + 1)/(k + 1)]\binom{2k}{k} < (2)(2)\binom{2k}{k} < 4^{k+1}$. Consequently, the result is true for all
$n \ge 2$, by the Principle of Mathematical Induction.
13. First we observe that the result is true for all $n \in \mathbf{Z}^+$ where $64 \le n \le 68$. This follows from the
calculations
64 = 2(17) + 6(5)    65 = 13(5)    66 = 3(17) + 3(5)
67 = 1(17) + 10(5)   68 = 4(17)
Now assume the result is true for all $n$ where $64 \le n \le k$, for some $k \ge 68$, and consider the integer $k + 1$. Then
$k + 1 = (k - 4) + 5$, and since $64 \le k - 4 \le k$, we can write $k - 4 = a(17) + b(5)$ for some
$a, b \in \mathbf{N}$. Consequently, $k + 1 = a(17) + (b + 1)(5)$, and the result follows for all $n \ge 64$, by
the alternative form of the Principle of Mathematical Induction.
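The claim in Exercise 13 is easy to test over a finite range. This sketch (with the illustrative name `representable`) tries every admissible multiple of 17 and checks whether the remainder is a multiple of 5; it only checks finitely many cases, so the induction is still what proves the statement for all $n \ge 64$.

```python
def representable(n, a=17, b=5):
    """True if n = a*i + b*j for some nonnegative integers i, j."""
    return any((n - a * i) % b == 0 for i in range(n // a + 1))
```

The value 63 is not representable in this way, so 64 is the smallest possible starting point.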
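15. a) $r = r_0 + r_1 \cdot 10 + r_2 \cdot 10^2 + \cdots + r_n \cdot 10^n$
$= r_0 + r_1(9) + r_1 + r_2(99) + r_2 + \cdots + r_n(\underbrace{99 \ldots 9}_{n\ 9\text{'s}}) + r_n$
$= [9r_1 + 99r_2 + \cdots + (99 \ldots 9)r_n] + (r_0 + r_1 + r_2 + \cdots + r_n)$.
Hence $9 \mid r$ if and only if $9 \mid (r_n + r_{n-1} + \cdots + r_2 + r_1 + r_0)$.
c) $3 \mid t$ for $x = 1$ or 4 or 7; $9 \mid t$ for $x = 7$.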
17.
a) (5) by (4)
19. a) 1,4,9 b) 1,4,9, 16,..., k, where k is the largest square less than or equal to n.
21. a) For all $n \in \mathbf{Z}^+$, $n \ge 3$, $1 + 2 + 3 + \cdots + n = n(n + 1)/2$. If $\{1, 2, 3, \ldots, n\} = A \cup B$ with
$s_A = s_B$, then $2s_A = n(n + 1)/2$, or $4s_A = n(n + 1)$. Since $4 \mid n(n + 1)$ and $\gcd(n, n + 1) = 1$,
then either $4 \mid n$ or $4 \mid (n + 1)$.
b) Here we are verifying the converse of our result in part (a).
(i) If $4 \mid n$, we write $n = 4k$. Here we have
$\{1, 2, 3, \ldots, k, k + 1, \ldots, 3k, 3k + 1, \ldots, 4k\} = A \cup B$ where $A = \{1, 2, 3, \ldots, k,
3k + 1, 3k + 2, \ldots, 4k - 1, 4k\}$ and $B = \{k + 1, k + 2, \ldots, 2k, 2k + 1, \ldots, 3k - 1, 3k\}$,
with $s_A = (1 + 2 + 3 + \cdots + k) + [(3k + 1) + (3k + 2) + \cdots + (3k + k)] =
[k(k + 1)/2] + k(3k) + [k(k + 1)/2] = k(k + 1) + 3k^2 = 4k^2 + k$, and
$s_B = [(k + 1) + (k + 2) + \cdots + (k + k)] + [(2k + 1) + (2k + 2) + \cdots + (2k + k)]
= k(k) + [k(k + 1)/2] + k(2k) + [k(k + 1)/2] = 3k^2 + k(k + 1) = 4k^2 + k$.
(ii) Now we consider the case where $n + 1 = 4k$. Then $n = 4k - 1$ and we have
$\{1, 2, 3, \ldots, k - 1, k, \ldots, 3k - 1, 3k, \ldots, 4k - 2, 4k - 1\} = A \cup B$, with
$A = \{1, 2, 3, \ldots, k - 1, 3k, 3k + 1, \ldots, 4k - 1\}$ and
$B = \{k, k + 1, \ldots, 2k - 1, 2k, 2k + 1, \ldots, 3k - 1\}$. Here we find
$s_A = [1 + 2 + 3 + \cdots + (k - 1)] + [3k + (3k + 1) + \cdots + (3k + (k - 1))] =
[(k - 1)(k)/2] + k(3k) + [(k - 1)(k)/2] = 3k^2 + k^2 - k = 4k^2 - k$, and
$s_B = [k + (k + 1) + \cdots + (k + (k - 1))] + [2k + (2k + 1) + \cdots + (2k + (k - 1))]
= k^2 + [(k - 1)(k)/2] + k(2k) + [(k - 1)(k)/2] = 3k^2 + (k - 1)k = 4k^2 - k$.
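The two constructions in part (b) translate directly into code. The following sketch builds $A$ and $B$ as described for $n = 4k$ and $n = 4k - 1$; the function name `balanced_partition` is an assumption of this example.

```python
def balanced_partition(n):
    """Split {1, ..., n} into A, B with equal sums, for n = 4k or n = 4k - 1."""
    if n % 4 == 0:
        k = n // 4
        # A = {1, ..., k} together with {3k+1, ..., 4k}
        A = list(range(1, k + 1)) + list(range(3 * k + 1, 4 * k + 1))
    elif n % 4 == 3:
        k = (n + 1) // 4
        # A = {1, ..., k-1} together with {3k, ..., 4k-1}
        A = list(range(1, k)) + list(range(3 * k, 4 * k))
    else:
        raise ValueError("a balanced partition needs 4 | n or 4 | (n + 1)")
    in_a = set(A)
    B = [x for x in range(1, n + 1) if x not in in_a]
    return A, B
```

For example, `balanced_partition(8)` yields `A = [1, 2, 7, 8]` and `B = [3, 4, 5, 6]`, each summing to $4k^2 + k = 18$ with $k = 2$.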
23. a) The result is true for $a = 1$, so consider $a > 1$. From the Fundamental Theorem of Arithmetic
we can write $a = p_1^{e_1} p_2^{e_2} \cdots p_t^{e_t}$, where $p_1, p_2, \ldots, p_t$ are $t$ distinct primes and $e_i > 0$, for all
$1 \le i \le t$. Since $a^2 \mid b^2$ it follows that $p_i^{2e_i} \mid b^2$ for all $1 \le i \le t$. So $b^2 = p_1^{2f_1} p_2^{2f_2} \cdots p_t^{2f_t} c^2$,
where $f_i \ge e_i$ for all $1 \le i \le t$, and $b = p_1^{f_1} p_2^{f_2} \cdots p_t^{f_t} c = a(p_1^{f_1 - e_1} p_2^{f_2 - e_2} \cdots p_t^{f_t - e_t})c$, where
$f_i - e_i \ge 0$ for all $1 \le i \le t$. Consequently, $a \mid b$.
b) This result is not necessarily true! Let $a = 8$ and $b = 4$. Then $a^2$ (= 64) divides $b^3$ (= 64),
but $a$ does not divide $b$.
25. a) Recall that
$a^3 + b^3 = (a + b)(a^2 - ab + b^2)$
$a^5 + b^5 = (a + b)(a^4 - a^3b + a^2b^2 - ab^3 + b^4)$
$a^p + b^p = (a + b)(a^{p-1} - a^{p-2}b + \cdots + b^{p-1}) = (a + b) \sum_{i=1}^{p} a^{p-i}(-b)^{i-1}$,
for $p$ an odd prime.
Since $k$ is not a power of 2 we write $k = r \cdot p$, where $p$ is an odd prime and $r \ge 1$. Then
$a^k + b^k = (a^r)^p + (b^r)^p = (a^r + b^r) \sum_{i=1}^{p} a^{r(p-i)}(-b^r)^{i-1}$, so $a^k + b^k$ is composite.
b) Here $n$ is not a power of 2. If, in addition, $n$ is not prime, then $n = r \cdot p$ where $p$ is an odd
prime. Then $2^n + 1 = 2^n + 1^n = 2^{r \cdot p} + 1^{r \cdot p} = (2^r + 1^r) \sum_{i=1}^{p} 2^{r(p-i)}(-1^r)^{i-1} =
(2^r + 1) \sum_{i=1}^{p} (-1)^{i-1} 2^{r(p-i)}$, so $2^n + 1$ is composite, not prime.
27. Proof: For $n = 0$ we find that $F_0 = 0 < 1 = (5/3)^0$, and for $n = 1$ we have $F_1 = 1 < (5/3) =
(5/3)^1$. Consequently, the given property is true in these first two cases (and this provides the
basis step of the proof).
Assuming that this property is true for $n = 0, 1, 2, \ldots, k - 1, k$, where $k \ge 1$, we now
examine what happens at $n = k + 1$. Here we find that
$F_{k+1} = F_k + F_{k-1} < (5/3)^k + (5/3)^{k-1} = (5/3)^{k-1}[(5/3) + 1] = (5/3)^{k-1}(8/3)$
$= (5/3)^{k-1}(24/9) < (5/3)^{k-1}(25/9) = (5/3)^{k-1}(5/3)^2 = (5/3)^{k+1}$.
It then follows from the alternative form of the Principle of Mathematical Induction that
$F_n < (5/3)^n$ for all $n \in \mathbf{N}$.
29. a) There are $9 \cdot 10 \cdot 10 = 900$ such palindromes and their sum is
$\sum_{a=1}^{9} \sum_{b=0}^{9} \sum_{c=0}^{9} abcba = \sum_{a=1}^{9} \sum_{b=0}^{9} \sum_{c=0}^{9} (10001a + 1010b + 100c)$
$= \sum_{a=1}^{9} \sum_{b=0}^{9} [10(10001a + 1010b) + 100(9 \cdot 10/2)]$
$= \sum_{a=1}^{9} \sum_{b=0}^{9} (100010a + 10100b + 4500)$
$= \sum_{a=1}^{9} [10(100010a) + 10100(9 \cdot 10/2) + 10(4500)]$
$= 1000100 \sum_{a=1}^{9} a + 9(454500) + 9(45000) = 1000100(9 \cdot 10/2) + 4090500 + 405000 = 49{,}500{,}000$.
b) begin
     sum := 0
     for a := 1 to 9 do
       for b := 0 to 9 do
         for c := 0 to 9 do
           sum := sum + 10001 * a + 1010 * b + 100 * c
     print sum
   end
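The pseudocode in part (b) can be run as written once translated. Below is a sketch in Python (the function name `palindrome_sum` is made up for this illustration) that mirrors the three nested loops.

```python
def palindrome_sum():
    """Sum all five-digit palindromes abcba (a = 1..9; b, c = 0..9)."""
    total = 0
    for a in range(1, 10):
        for b in range(10):
            for c in range(10):
                # abcba = 10000a + 1000b + 100c + 10b + a
                total += 10001 * a + 1010 * b + 100 * c
    return total
```

An independent cross-check is to sum every integer from 10000 to 99999 whose decimal string equals its reverse; both approaches should agree with the closed-form total in part (a).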
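31. Proof: Suppose that $7 \mid n$. We see that $7 \mid n \Rightarrow 7 \mid (n - 21u) \Rightarrow 7 \mid [(n - u) - 20u] \Rightarrow
7 \mid [10(\frac{n-u}{10}) - 20u] \Rightarrow 7 \mid [10(\frac{n-u}{10} - 2u)] \Rightarrow 7 \mid (\frac{n-u}{10} - 2u)$, by Lemma 4.2 since $\gcd(7, 10) = 1$.
[Note: $\frac{n-u}{10} \in \mathbf{Z}^+$ since the units digit of $n - u$ is 0.] Conversely, if $7 \mid (\frac{n-u}{10} - 2u)$, then since
$\frac{n-u}{10} - 2u = \frac{n-21u}{10}$ we find that $7 \mid (\frac{n-21u}{10}) \Rightarrow 7 \cdot 10 \cdot x = n - 21u$, for some $x \in \mathbf{Z}^+$. Since $7 \mid 7$
and $7 \mid 21$, it then follows that $7 \mid n$ by part (e) of Theorem 4.3.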
33. If Catrina’s selection includes any of 0, 2, 4, 6, 8, then at least two of the resulting three-digit
integers will have an even unit’s digit, and be even — hence, not prime. Should her selection
include 5, then two of the resulting three-digit integers will have 5 as their unit’s digit; these
three-digit integers are then divisible by 5 and so, they are not prime. Consequently, to
complete the proof we need to consider the four selections of size 3 that Catrina can make from
{1, 3, 7, 9}. The following provides the selections — each with a three-digit integer that is not
prime.
1) {1, 3, 7}: 713 = 23 · 31    2) {1, 3, 9}: 913 = 11 · 83
3) {1, 7, 9}: 917 = 7 · 131    4) {3, 7, 9}: 793 = 13 · 61
35. Let $x$ denote the integer Barbara erased. The sum of the integers $1, 2, 3, \ldots, x - 1, x + 1,
x + 2, \ldots, n$ is $[n(n + 1)/2] - x$, so $[[n(n + 1)/2] - x]/(n - 1) = 35\frac{7}{17}$. Consequently,
$[n(n + 1)/2] - x = (35\frac{7}{17})(n - 1) = (602/17)(n - 1)$. Since $[n(n + 1)/2] - x \in \mathbf{Z}^+$, it
follows that $(602/17)(n - 1) \in \mathbf{Z}^+$. Therefore, from Lemma 4.2, we find that $17 \mid (n - 1)$
because $17 \nmid 602$. For $n = 1, 18, 35, 52$ we have:
    n     x = [n(n + 1)/2] − (602/17)(n − 1)
    1     1
    18    −431
    35    −574
    52    −428
When $n = 69$, we find that $x = 7$ [and $(\sum_{i=1}^{69} i - 7)/68 = 602/17 = 35\frac{7}{17}$].
For $n = 69 + 17k$, $k \ge 1$, we have
$x = [(69 + 17k)(70 + 17k)/2] - (602/17)[68 + 17k]$
$= 7 + (k/2)[1159 + 289k]$
$= [7 + (1159k/2)] + (289k^2)/2 > n$.
Hence the answer is unique: namely, $n = 69$ and $x = 7$.
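The uniqueness claim in Exercise 35 can be checked by a direct search. In this sketch (the name `erased_candidates` and the search bound are assumptions of the example), exact fractions are used so that the test for an integral $x$ is reliable.

```python
from fractions import Fraction

def erased_candidates(n_max):
    """Pairs (n, x), 1 <= x <= n, where the other n - 1 integers average 602/17."""
    target = Fraction(602, 17)
    hits = []
    for n in range(2, n_max + 1):
        total = n * (n + 1) // 2
        x = total - target * (n - 1)  # the value the erased integer must have
        if x.denominator == 1 and 1 <= x <= n:
            hits.append((n, int(x)))
    return hits
```

A finite search cannot rule out larger $n$ by itself; that part rests on the inequality $x > n$ shown above for $n = 69 + 17k$, $k \ge 1$.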
37. $(1 + m_1)(1 + m_2)(1 + m_3)$, where $m_i = \min\{e_i, f_i\}$ for $1 \le i \le 3$.
Chapter 5
Relations and Functions
Section 5.1—p. 252
1. $A \times B = \{(1, 2), (2, 2), (3, 2), (4, 2), (1, 5), (2, 5), (3, 5), (4, 5)\}$
$B \times A = \{(2, 1), (2, 2), (2, 3), (2, 4), (5, 1), (5, 2), (5, 3), (5, 4)\}$
$A \cup (B \times C) = \{1, 2, 3, 4, (2, 3), (2, 4), (2, 7), (5, 3), (5, 4), (5, 7)\}$
$(A \cup B) \times C = \{(1, 3), (2, 3), (3, 3), (4, 3), (5, 3), (1, 4), (2, 4), (3, 4), (4, 4), (5, 4),
(1, 7), (2, 7), (3, 7), (4, 7), (5, 7)\} = (A \times C) \cup (B \times C)$
3a)9 bP PF adr 9 Q H C)+G)+C)
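5. a) Assume that $A \times B \subseteq C \times D$ and let $a \in A$ and $b \in B$. Then $(a, b) \in A \times B$, and since
$A \times B \subseteq C \times D$, we have $(a, b) \in C \times D$. But $(a, b) \in C \times D \Rightarrow a \in C$ and $b \in D$. Hence
$a \in A \Rightarrow a \in C$, so $A \subseteq C$, and $b \in B \Rightarrow b \in D$, so $B \subseteq D$.
Conversely, suppose that $A \subseteq C$ and $B \subseteq D$, and that $(x, y) \in A \times B$. Then
$(x, y) \in A \times B \Rightarrow x \in A$ and $y \in B \Rightarrow x \in C$ (since $A \subseteq C$) and $y \in D$ (since $B \subseteq D$)
$\Rightarrow (x, y) \in C \times D$. Consequently, $A \times B \subseteq C \times D$.
b) Even if any of the sets $A, B, C, D$ is empty, we still find that
$[(A \subseteq C) \land (B \subseteq D)] \Rightarrow [A \times B \subseteq C \times D]$.
However, the converse need not hold. For example, let $A = \emptyset$, $B = \{1, 2\}$, $C = \{1, 2\}$, and
$D = \{1\}$. Then $A \times B = \emptyset$; if not, there exists an ordered pair $(x, y)$ in $A \times B$, and this
means that the empty set $A$ contains an element $x$. And so $A \times B = \emptyset \subseteq C \times D$, but
$B = \{1, 2\} \not\subseteq \{1\} = D$.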
. a) 2” —b) If |A| =m, |B| =n, form, n EN, then there are 2”” elements in P(A X B).
—
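9. c) $(x, y) \in (A \cap B) \times C \Leftrightarrow x \in A \cap B$ and $y \in C \Leftrightarrow (x \in A$ and $x \in B)$ and $y \in C \Leftrightarrow
(x \in A$ and $y \in C)$ and $(x \in B$ and $y \in C) \Leftrightarrow (x, y) \in A \times C$ and $(x, y) \in B \times C \Leftrightarrow
(x, y) \in (A \times C) \cap (B \times C)$
11. $(x, y) \in A \times (B - C) \Leftrightarrow x \in A$ and $y \in B - C \Leftrightarrow x \in A$ and $(y \in B$ and $y \notin C) \Leftrightarrow
(x \in A$ and $y \in B)$ and $(x \in A$ and $y \notin C) \Leftrightarrow (x, y) \in A \times B$ and $(x, y) \notin A \times C \Leftrightarrow
(x, y) \in (A \times B) - (A \times C)$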
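13. a) (1) $(0, 2) \in \mathcal{R}$; and
(2) If $(a, b) \in \mathcal{R}$, then $(a + 1, b + 5) \in \mathcal{R}$.
b) From part (1) of the definition we have $(0, 2) \in \mathcal{R}$. By part (2) of the definition we then find
that
(i) $(0, 2) \in \mathcal{R} \Rightarrow (0 + 1, 2 + 5) = (1, 7) \in \mathcal{R}$;
(ii) $(1, 7) \in \mathcal{R} \Rightarrow (1 + 1, 7 + 5) = (2, 12) \in \mathcal{R}$;
(iii) $(2, 12) \in \mathcal{R} \Rightarrow (2 + 1, 12 + 5) = (3, 17) \in \mathcal{R}$; and
(iv) $(3, 17) \in \mathcal{R} \Rightarrow (3 + 1, 17 + 5) = (4, 22) \in \mathcal{R}$.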
Section 5.2—p. 258
1. a) Function; range = {7, 8, 11, 16, 23, ...}   b) Relation, not a function
c) Function; range = $\mathbf{R}$   d) and e) Relation, not a function
3. a) (1) $\{(1, x), (2, x), (3, x), (4, x)\}$   (2) $\{(1, y), (2, y), (3, y), (4, y)\}$
(3) $\{(1, z), (2, z), (3, z), (4, z)\}$   (4) $\{(1, x), (2, y), (3, x), (4, y)\}$
(5) $\{(1, x), (2, y), (3, z), (4, x)\}$
b) 34 ce) O §=6d) 4 oe) 24S ff) 33s gs) 3?_~—Ss hh) 3?
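5. a) $\{(1, 3)\}$   b) $\{(-7/2, -21/2)\}$
c) $\{(-8, -15)\}$   d) $\mathbf{R}^2 - \{(-7/2, -21/2)\} = \{(x, y) \mid x \ne -7/2$ or $y \ne -21/2\}$
7. a) $\lfloor 2.3 - 1.6 \rfloor = \lfloor 0.7 \rfloor = 0$   b) $\lfloor 2.3 \rfloor - \lfloor 1.6 \rfloor = 2 - 1 = 1$
c) $\lceil 3.4 \rceil \lfloor 6.2 \rfloor = 4 \cdot 6 = 24$   d) $\lfloor 3.4 \rfloor \lceil 6.2 \rceil = 3 \cdot 7 = 21$
e) $\lfloor 2\pi \rfloor = 6$   f) $2\lceil \pi \rceil = 8$
9. a) $\cdots \cup [-1, -6/7) \cup [0, 1/7) \cup [1, 8/7) \cup [2, 15/7) \cup \cdots$
b) $[1, 8/7)$   c) $\mathbf{Z}$   d) $\mathbf{R}$
11. a) $\cdots \cup (-7/3, -2] \cup (-4/3, -1] \cup (-1/3, 0] \cup (2/3, 1] \cup (5/3, 2] \cup \cdots =
\bigcup_{m \in \mathbf{Z}} (m - 1/3, m]$
b) $\cdots \cup ((-2n - 1)/n, -2] \cup ((-n - 1)/n, -1] \cup (-1/n, 0] \cup ((n - 1)/n, 1] \cup
((2n - 1)/n, 2] \cup \cdots = \bigcup_{m \in \mathbf{Z}} (m - 1/n, m]$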
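13. a) Proof (i): If $a \in \mathbf{Z}^+$, then $\lceil a \rceil = a$ and $\lfloor \lceil a \rceil / a \rfloor = \lfloor 1 \rfloor = 1$. If $a \notin \mathbf{Z}^+$, write $a = n + c$,
where $n \in \mathbf{Z}^+$ and $0 < c < 1$. Then $\lceil a \rceil / a = (n + 1)/(n + c) = 1 + (1 - c)/(n + c)$, where
$0 < (1 - c)/(n + c) < 1$. Hence $\lfloor \lceil a \rceil / a \rfloor = \lfloor 1 + (1 - c)/(n + c) \rfloor = 1$.
b) Consider $a = 0.1$. Then
(i) $\lfloor \lceil a \rceil / a \rfloor = \lfloor 1/0.1 \rfloor = \lfloor 10 \rfloor = 10 \ne 1$; and
(ii) $\lceil \lfloor a \rfloor / a \rceil = \lceil 0/0.1 \rceil = 0 \ne 1$.
In fact (ii) is false for all $0 < a < 1$, since $\lceil \lfloor a \rfloor / a \rceil = 0$ for all such values of $a$. In the case of
(i), when $0 < a \le 0.5$, it follows that $\lceil a \rceil / a \ge 2$ and $\lfloor \lceil a \rceil / a \rfloor \ge 2 \ne 1$. However, for
$0.5 < a < 1$, $\lceil a \rceil / a = 1/a$ where $1 < 1/a < 2$, and so $\lfloor \lceil a \rceil / a \rfloor = 1$ for $0.5 < a < 1$.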
15. a) One-to-one; the range is the set of all odd integers.
b) One-to-one; the range is Q.
c) Not one-to-one; the range is $\{0, \pm 6, \pm 24, \pm 60, \ldots\} = \{n^3 - n \mid n \in \mathbf{Z}\}$.
d) One-to-one; the range is $(0, +\infty)$.
e) One-to-one; the range is [—1, 1].
f) Not one-to-one; the range is [0, 1].
17, 4?
19. a) $f(A_1 \cup A_2) = \{y \in B \mid y = f(x), x \in A_1 \cup A_2\} = \{y \in B \mid y = f(x), x \in A_1$ or $x \in A_2\} =
\{y \in B \mid y = f(x), x \in A_1\} \cup \{y \in B \mid y = f(x), x \in A_2\} = f(A_1) \cup f(A_2)$
c) From part (b), $f(A_1 \cap A_2) \subseteq f(A_1) \cap f(A_2)$. Conversely, $y \in f(A_1) \cap f(A_2) \Rightarrow y =
f(x_1) = f(x_2)$, for $x_1 \in A_1$, $x_2 \in A_2 \Rightarrow y = f(x_1)$ and $x_1 = x_2$ (because $f$ is injective)
$\Rightarrow y \in f(A_1 \cap A_2)$. So $f$ injective $\Rightarrow f(A_1 \cap A_2) = f(A_1) \cap f(A_2)$.
21. No. Let $A = \{1, 2\}$, $X = \{1\}$, $Y = \{2\}$, $B = \{3\}$. For $f = \{(1, 3), (2, 3)\}$ we have $f|_X$, $f|_Y$
one-to-one, but $f$ is not one-to-one.
23. a) $f(a_{ij}) = 12(i - 1) + j$   b) $f(a_{ij}) = 10(i - 1) + j$   c) $f(a_{ij}) = 7(i - 1) + j$
25. a) (i) $f(a_{ij}) = n(i - 1) + (k - 1) + j$   (ii) $g(a_{ij}) = m(j - 1) + (k - 1) + i$
b) $k + (mn - 1) \le r$
27. a) $A(1, 3) = A(0, A(1, 2)) = A(1, 2) + 1 = A(0, A(1, 1)) + 1 = [A(1, 1) + 1] + 1 =
A(1, 1) + 2 = A(0, A(1, 0)) + 2 = [A(1, 0) + 1] + 2 = A(1, 0) + 3 = A(0, 1) + 3 =
(1 + 1) + 3 = 5$
$A(2, 3) = A(1, A(2, 2))$
$A(2, 2) = A(1, A(2, 1))$
$A(2, 1) = A(1, A(2, 0)) = A(1, A(1, 1))$
$A(1, 1) = A(0, A(1, 0)) = A(1, 0) + 1 = A(0, 1) + 1 = (1 + 1) + 1 = 3$
$A(2, 1) = A(1, 3) = A(0, A(1, 2)) = A(1, 2) + 1 = A(0, A(1, 1)) + 1
= [A(1, 1) + 1] + 1 = 5$
$A(2, 2) = A(1, 5) = A(0, A(1, 4)) = A(1, 4) + 1 = A(0, A(1, 3)) + 1 = A(1, 3) + 2
= A(0, A(1, 2)) + 2 = A(1, 2) + 3 = A(0, A(1, 1)) + 3 = A(1, 1) + 4 = 7$
$A(2, 3) = A(1, 7) = A(0, A(1, 6)) = A(1, 6) + 1 = A(0, A(1, 5)) + 1
= A(0, 7) + 1 = (7 + 1) + 1 = 9$
b) Since $A(1, 0) = A(0, 1) = 2 = 0 + 2$, the result holds for the case where $n = 0$. Assuming the
truth of the (open) statement for some $k$ ($\ge 0$), we have $A(1, k) = k + 2$. Then we find that
$A(1, k + 1) = A(0, A(1, k)) = A(1, k) + 1 = (k + 2) + 1 = (k + 1) + 2$, so the truth at $n = k$
implies the truth at $n = k + 1$. Consequently, $A(1, n) = n + 2$ for all $n \in \mathbf{N}$ by the Principle of
Mathematical Induction.
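The hand evaluations in Exercise 27 follow the three defining rules of Ackermann's function as they are applied above: $A(0, n) = n + 1$, $A(m, 0) = A(m - 1, 1)$, and $A(m, n) = A(m - 1, A(m, n - 1))$. A memoized sketch is given below; the raised recursion limit is a precaution of this example, and the function is only practical for very small $m$.

```python
import sys
from functools import lru_cache

sys.setrecursionlimit(10_000)

@lru_cache(maxsize=None)
def ackermann(m, n):
    """Ackermann's function A(m, n) for nonnegative integers m, n."""
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann(m - 1, 1)
    return ackermann(m - 1, ackermann(m, n - 1))
```

Evaluating `ackermann(1, n)` for a range of `n` gives a finite check of the formula $A(1, n) = n + 2$ from part (b).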
Section 5.3—p. 265
1. a) $A = \{1, 2, 3, 4\}$, $B = \{v, w, x, y, z\}$, $f = \{(1, v), (2, v), (3, w), (4, x)\}$
b) $A$, $B$ as in (a), $f = \{(1, v), (2, x), (3, z), (4, y)\}$
c) $A = \{1, 2, 3, 4, 5\}$, $B = \{w, x, y, z\}$, $f = \{(1, w), (2, w), (3, x), (4, y), (5, z)\}$
d) $A = \{1, 2, 3, 4\}$, $B = \{w, x, y, z\}$, $f = \{(1, w), (2, x), (3, y), (4, z)\}$
3. a), b), c), and f) are one-to-one and onto.
d) Neither one-to-one nor onto; range = $[0, +\infty)$
e) Neither one-to-one nor onto; range = $[-4, +\infty)$
5. (For the case $n = 5$, $m = 3$):
$\sum_{k=0}^{5} (-1)^k \binom{5}{5-k}(5 - k)^3 = (-1)^0\binom{5}{5}5^3 + (-1)^1\binom{5}{4}4^3 + (-1)^2\binom{5}{3}3^3$
$+ (-1)^3\binom{5}{2}2^3 + (-1)^4\binom{5}{1}1^3 + (-1)^5\binom{5}{0}0^3$
$= 125 - 5(64) + 10(27) - 10(8) + 5 = 0$
7. a) (i) $2!\,S(7, 2)$   (ii) $\binom{5}{2}[2!\,S(7, 2)]$   (iii) $3!\,S(7, 3)$
(iv) $\binom{5}{3}[3!\,S(7, 3)]$   (v) $4!\,S(7, 4)$   (vi) $\binom{5}{4}[4!\,S(7, 4)]$
b) $\binom{n}{k}[k!\,S(m, k)]$
9. For each r € R there is at least one a € R such that a> — 2a* +a — r = 0, because the
polynomial x° — 2x* +x — r has odd degree and real coefficients. Consequently, f is onto.
However, f (0) = 0 = f(1), so f is not one-to-one.
11.  m\n   1    2     3      4      5      6     7    8    9   10
     9     1   255   3025   7770   6951   2646   462   36    1
     10    1   511   9330  34105  42525  22827  5880  750   45    1
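The two rows of the table can be regenerated from the standard recurrence $S(m + 1, n) = S(m, n - 1) + nS(m, n)$ for Stirling numbers of the second kind. The sketch below is a plain memoized implementation; the name `stirling2` is an assumption of this example.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(m, n):
    """S(m, n): ways to put m distinct objects into n identical nonempty containers."""
    if m == n:
        return 1
    if n == 0 or n > m:
        return 0
    # The last object is alone in a container, or joins one of the n existing ones.
    return stirling2(m - 1, n - 1) + n * stirling2(m - 1, n)
```

For instance, `[stirling2(9, n) for n in range(1, 10)]` reproduces the row for $m = 9$.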
13. a) Since $156{,}009 = 3 \times 7 \times 17 \times 19 \times 23$, it follows that there are $S(5, 2) = 15$ two-factor
unordered factorizations of 156,009, where each factor is greater than 1.
b) $\sum_{i=2}^{5} S(5, i) = 15 + 25 + 10 + 1 = 51$   c) $\sum_{i=2}^{n} S(n, i)$
15. a) $n = 4$: $\sum_{i=1}^{4} i!\,S(4, i)$; $n = 5$: $\sum_{i=1}^{5} i!\,S(5, i)$
In general, the answer is $\sum_{i=1}^{n} i!\,S(n, i)$.
b) (3) S012, i812, 2).
17. Let $a_1, a_2, \ldots, a_m, x$ denote the $m + 1$ distinct objects. Then $S_r(m + 1, n)$ counts the number
of ways these objects can be distributed among $n$ identical containers so that each container
receives at least $r$ of the objects.
Each of these distributions falls into exactly one of two categories:
(1) The element $x$ is in a container with $r$ or more other objects: Here we start with $S_r(m, n)$
distributions of $a_1, a_2, \ldots, a_m$ into $n$ identical containers, each container receiving at least $r$
of the objects. Now we have $n$ distinct containers, distinguished by their contents.
Consequently, there are $n$ choices for locating the object $x$. As a result, this category provides
$nS_r(m, n)$ of the distributions.
(2) The element $x$ is in a container with $r - 1$ of the other objects: These other $r - 1$ objects can
be chosen in $\binom{m}{r-1}$ ways, and then these objects, along with $x$, can be placed in one of the $n$
containers. The remaining $m + 1 - r$ distinct objects can then be distributed among the $n - 1$
identical containers, where each container receives at least $r$ of the objects, in
$S_r(m + 1 - r, n - 1)$ ways. Hence this category provides the remaining
$\binom{m}{r-1}S_r(m + 1 - r, n - 1)$ distributions.
19. a) We know that $s(m, n)$ counts the number of ways we can place $m$ people (call them
$p_1, p_2, \ldots, p_m$) around $n$ circular tables, with at least one occupant at each table. These
arrangements fall into two disjoint sets: (1) The arrangements where $p_1$ is alone: There are
$s(m - 1, n - 1)$ such arrangements; and (2) The arrangements where $p_1$ shares a table with at
least one of the other $m - 1$ people: There are $s(m - 1, n)$ ways where $p_2, p_3, \ldots, p_m$ can be
seated around the $n$ tables so that every table is occupied. Each such arrangement determines a
total of $m - 1$ locations (at all the $n$ tables) where $p_1$ can now be seated, this for a total of
$(m - 1)s(m - 1, n)$ arrangements. Consequently, $s(m, n) = (m - 1)s(m - 1, n) +
s(m - 1, n - 1)$, for $m > n > 1$.
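The recurrence just derived can be turned into a small program. This is a sketch under the assumption that the boundary values are $s(m, m) = 1$ and $s(m, n) = 0$ when $n = 0 < m$ or $n > m$ (the usual conventions; the text above states only the recurrence). The name `s_cycles` is illustrative.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def s_cycles(m, n):
    """Ways to seat m people around n identical circular tables, none empty."""
    if m == n:
        return 1
    if n == 0 or n > m:
        return 0
    # p_1 joins an occupied table (m - 1 positions) or sits alone.
    return (m - 1) * s_cycles(m - 1, n) + s_cycles(m - 1, n - 1)
```

As a sanity check, summing `s_cycles(m, n)` over all `n` should give $m!$, since every permutation of the $m$ people decomposes into cycles in exactly one way.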
Section 5.4-p. 272
1. Here we find, for example, that $f(f(a, b), c) = f(a, c) = c$, while $f(a, f(b, c)) =
f(a, b) = a$, so $f$ is not associative.
3. a), b), and d) are commutative and associative; c) is neither commutative nor associative.
.a) 25. by 5% cc) 5% dy 59°
. a) Yes b) Yes c) No 9. a) 1216 b) p?!qg*””
11. By the Well-Ordering Principle, $A$ has a least element and this same element is the identity for
$g$. If $A$ is finite, then $A$ will have a largest element, and this same element will be the identity
for $f$. If $A$ is infinite, then $f$ cannot have an identity.
13. a) 5 b) A3 Ag As c) A}, Az
25 25 6
25 2 4
60 40 20
25 40 10
Section 5.5—p. 277
1. The pigeons are the socks; the pigeonholes are the colors.
3. $26^2 + 1 = 677$
5. a) For each $x \in \{1, 2, 3, \ldots, 300\}$ write $x = 2^n \cdot m$, where $n \ge 0$ and $\gcd(2, m) = 1$. There
are 150 possibilities for $m$: $1, 3, 5, \ldots, 299$. When we select 151 numbers from
$\{1, 2, 3, \ldots, 300\}$, there must be two numbers of the form $x = 2^s \cdot m$, $y = 2^t \cdot m$. If $x < y$,
then $x \mid y$; otherwise $y < x$ and $y \mid x$.
b) If $n + 1$ integers are selected from the set $\{1, 2, 3, \ldots, 2n\}$, then there must be two integers
$x, y$ in the selection where $x \mid y$ or $y \mid x$.
7. a) Here the pigeons are the integers $1, 2, 3, \ldots, 25$ and the pigeonholes are the 13 sets
$\{1, 25\}, \{2, 24\}, \ldots, \{11, 15\}, \{12, 14\}, \{13\}$. In selecting 14 integers, we get the elements in at
least one two-element subset, and these sum to 26.
b) If $S = \{1, 2, 3, \ldots, 2n + 1\}$, for $n$ a positive integer, then any subset of size $n + 2$ from $S$
must contain two elements that sum to $2n + 2$.
9. a) For each $t \in \{1, 2, 3, \ldots, 100\}$, we find that $1 \le \sqrt{t} \le 10$. When we select 11 elements
from $\{1, 2, 3, \ldots, 100\}$ there must be two, say $x$ and $y$, where $\lfloor \sqrt{x} \rfloor = \lfloor \sqrt{y} \rfloor$ so that
$0 < |\sqrt{x} - \sqrt{y}| < 1$.
b) Let $n \in \mathbf{Z}^+$. If $n + 1$ elements are selected from $\{1, 2, 3, \ldots, n^2\}$, then there exist
two, say $x$ and $y$, where $0 < |\sqrt{x} - \sqrt{y}| < 1$.
11. Divide the interior of the square into four smaller congruent squares as shown in the figure.
Each smaller square has diagonal length $1/\sqrt{2}$. Let region $R_1$ be the interior of square AEKH
together with the points on segment EK, excluding point E. Region $R_2$ is the interior of square
EBFK together with the points on segment FK, excluding points F and K. Regions $R_3$ and $R_4$
are defined in a similar way. Then if five points are chosen in the interior of square ABCD, at
least two are in $R_i$ for some $1 \le i \le 4$, and these points are within $1/\sqrt{2}$ (units) of each other.
[Figure: square ABCD divided into four congruent smaller squares.]
13. Consider the subsets $A$ of $S$ where $1 \le |A| \le 3$. Since $|S| = 5$, there are $\binom{5}{1} + \binom{5}{2} + \binom{5}{3} = 25$
such subsets $A$. Let $s_A$ denote the sum of the elements in $A$. Then $1 \le s_A \le 7 + 8 + 9 = 24$. So
by the pigeonhole principle, there are two subsets of $S$ whose elements yield the same sum.
15. For $(\emptyset \ne)\, T \subseteq S$, we have $1 \le s_T \le m + (m - 1) + \cdots + (m - 6) = 7m - 21$. The set $S$ has
$2^7 - 1 = 128 - 1 = 127$ nonempty subsets. So by the pigeonhole principle we need to have
$127 > 7m - 21$ or $148 > 7m$. Hence $7 \le m \le 21$.
17. a) 2, 4, 1, 3   b) 3, 6, 9, 2, 5, 8, 1, 4, 7
c) For $n \ge 2$, there exists a sequence of $n^2$ distinct real numbers with no decreasing or
increasing subsequence of length $n + 1$. For example, consider $n, 2n, 3n, \ldots, (n - 1)n,
n^2, (n - 1), (2n - 1), \ldots, (n^2 - 1), (n - 2), (2n - 2), \ldots, (n^2 - 2), \ldots, 1, (n + 1),
(2n + 1), \ldots, (n - 1)n + 1$.
d) The result in Example 5.49 (for $n \ge 2$) is best possible, in the sense that we cannot reduce
the length of the sequence from $n^2 + 1$ to $n^2$ and still obtain the desired subsequence of length
$n + 1$.
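The extremal sequence in part (c) can be generated and tested for small $n$. The sketch below uses a simple quadratic-time dynamic program for the longest strictly monotone subsequence; both function names are assumptions of this example.

```python
def extremal_sequence(n):
    """n, 2n, ..., n^2, then n-1, 2n-1, ..., n^2-1, ..., ending 1, n+1, ..., (n-1)n+1."""
    return [j * n - r for r in range(n) for j in range(1, n + 1)]

def longest_monotone(seq, increasing=True):
    """Length of the longest strictly increasing (or decreasing) subsequence."""
    best = [1] * len(seq)
    for i in range(len(seq)):
        for j in range(i):
            ok = seq[j] < seq[i] if increasing else seq[j] > seq[i]
            if ok:
                best[i] = max(best[i], best[j] + 1)
    return max(best)
```

For $n = 2$ and $n = 3$ the generator reproduces the sequences in parts (a) and (b), and in each tested case both monotone lengths equal $n$, not $n + 1$.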
19. Proof: If not, each pigeonhole contains at most $k$ pigeons, for a total of at most $kn$ pigeons.
But we have $kn + 1$ pigeons. So we have a contradiction and the result then follows.
21. a) 1001   b) 2001
c) Let $n, k \in \mathbf{Z}^+$. The smallest value for $|S|$ (where $S \subseteq \mathbf{Z}^+$) so that there exist $n$ elements
$x_1, x_2, \ldots, x_n \in S$ where all $n$ of these integers have the same remainder upon division by $k$ is
$k(n - 1) + 1$.
23. Proof: If not, then the number of pigeons roosting in the first pigeonhole is $x_1 \le p_1 - 1$, the
number of pigeons roosting in the second pigeonhole is $x_2 \le p_2 - 1$, ..., and the number
roosting in the $n$th pigeonhole is $x_n \le p_n - 1$. Hence the total number of pigeons is
$x_1 + x_2 + \cdots + x_n \le (p_1 - 1) + (p_2 - 1) + \cdots + (p_n - 1) = p_1 + p_2 + \cdots + p_n - n <
p_1 + p_2 + \cdots + p_n - n + 1$, the number of pigeons we started with. The result now follows
because of this contradiction.
Section 5.6—p. 288
1. a) $7! - 6! = 4320$   b) $n! - (n - 1)! = (n - 1)(n - 1)!$
3. $a = 3$, $b = -1$; $a = -3$, $b = 2$
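5. $g^2(A) = g(T \cap (S \cup A)) = T \cap (S \cup [T \cap (S \cup A)])$
$= T \cap [(S \cup T) \cap (S \cup (S \cup A))] = T \cap [(S \cup T) \cap (S \cup A)]$
$= [T \cap (S \cup T)] \cap (S \cup A) = T \cap (S \cup A) = g(A)$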
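7. a) $(f \circ g)(x) = 3x - 1$; $(g \circ f)(x) = 3(x - 1)$;
$(g \circ h)(x) = 0$ for $x$ even, 3 for $x$ odd; $(h \circ g)(x) = 0$ for $x$ even, 1 for $x$ odd;
$(f \circ (g \circ h))(x) = f((g \circ h)(x)) = -1$ for $x$ even, 2 for $x$ odd;
$((f \circ g) \circ h)(x) = (f \circ g)(0)$ for $x$ even, $(f \circ g)(1)$ for $x$ odd; that is, $-1$ for $x$ even, 2 for $x$ odd.
b) $f^2(x) = f(f(x)) = x - 2$; $f^3(x) = x - 3$; $g^2(x) = 9x$; $g^3(x) = 27x$; $h^2 = h^3 = h$.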
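9. a) $f^{-1}(x) = (1/2)(\ln x - 5)$
b) For $x \in \mathbf{R}^+$,
$(f \circ f^{-1})(x) = f((1/2)(\ln x - 5)) = e^{2((1/2)(\ln x - 5)) + 5} = e^{\ln x - 5 + 5} = e^{\ln x} = x$.
For $x \in \mathbf{R}$,
$(f^{-1} \circ f)(x) = f^{-1}(e^{2x+5}) = (1/2)[\ln(e^{2x+5}) - 5] = (1/2)[2x + 5 - 5] = x$.
[Figure: graph of $y = f(x)$, passing through the point $(0, e^5)$.]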
11. $f, g$ invertible $\Rightarrow$ each of $f, g$ is both one-to-one and onto $\Rightarrow g \circ f$ is one-to-one and onto
$\Rightarrow g \circ f$ invertible. Since $(g \circ f) \circ (f^{-1} \circ g^{-1}) = 1_C$ and $(f^{-1} \circ g^{-1}) \circ (g \circ f) = 1_A$, it
follows that $f^{-1} \circ g^{-1}$ is an inverse of $g \circ f$. By uniqueness of inverses, we have
$f^{-1} \circ g^{-1} = (g \circ f)^{-1}$.
13. a) $f^{-1}(-10) = \{-17\}$   $f^{-1}(0) = \{-7, 5/2\}$
$f^{-1}(4) = \{-3, 1/2, 5\}$   $f^{-1}(6) = \{-1, 7\}$
$f^{-1}(7) = \{0, 8\}$   $f^{-1}(8) = \{9\}$
b) (i) $[-12, -8]$   (ii) $[-12, -7] \cup [5/2, 3)$
(iii) $[-9, -3] \cup [1/2, 5]$   (iv) $(-2, 0] \cup (6, 11)$
(v) $[12, 18)$
15. $3^2 \cdot 4^3 = 576$ functions
17. a) The range of $f$ = $\{2, 3, 4, \ldots\} = \mathbf{Z}^+ - \{1\}$.
b) Since 1 is not in the range of $f$, the function is not onto.
c) For all $x, y \in \mathbf{Z}^+$, $f(x) = f(y) \Rightarrow x + 1 = y + 1 \Rightarrow x = y$, so $f$ is one-to-one.
d) The range of $g$ is $\mathbf{Z}^+$.   e) Since $g(\mathbf{Z}^+) = \mathbf{Z}^+$, the codomain of $g$, this function is onto.
f) Here $g(1) = 1 = g(2)$, and $1 \ne 2$, so $g$ is not one-to-one.
g) For all $x \in \mathbf{Z}^+$, $(g \circ f)(x) = g(f(x)) = g(x + 1) = \max\{1, (x + 1) - 1\} =
\max\{1, x\} = x$, since $x \in \mathbf{Z}^+$. Hence $g \circ f = 1_{\mathbf{Z}^+}$.
h) $(f \circ g)(2) = f(\max\{1, 1\}) = f(1) = 1 + 1 = 2$
$(f \circ g)(3) = f(\max\{1, 2\}) = f(2) = 2 + 1 = 3$
$(f \circ g)(4) = f(\max\{1, 3\}) = f(3) = 3 + 1 = 4$
$(f \circ g)(7) = f(\max\{1, 6\}) = f(6) = 6 + 1 = 7$
$(f \circ g)(12) = f(\max\{1, 11\}) = f(11) = 11 + 1 = 12$
$(f \circ g)(25) = f(\max\{1, 24\}) = f(24) = 24 + 1 = 25$
i) No, because the functions $f, g$ are not inverses of each other. The calculations in part (h)
may suggest that $f \circ g = 1_{\mathbf{Z}^+}$, since $(f \circ g)(x) = x$ for $x \ge 2$. But we also find that
$(f \circ g)(1) = f(\max\{1, 0\}) = f(1) = 2$, so $(f \circ g)(1) \ne 1$, and, consequently, $f \circ g \ne 1_{\mathbf{Z}^+}$.
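19. a) $a \in f^{-1}(B_1 \cap B_2) \Leftrightarrow f(a) \in B_1 \cap B_2 \Leftrightarrow f(a) \in B_1$ and $f(a) \in B_2 \Leftrightarrow a \in f^{-1}(B_1)$ and
$a \in f^{-1}(B_2) \Leftrightarrow a \in f^{-1}(B_1) \cap f^{-1}(B_2)$
c) $a \in f^{-1}(\overline{B_1}) \Leftrightarrow f(a) \in \overline{B_1} \Leftrightarrow f(a) \notin B_1 \Leftrightarrow a \notin f^{-1}(B_1) \Leftrightarrow a \in \overline{f^{-1}(B_1)}$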
21. a) Suppose that $x_1, x_2 \in \mathbf{Z}$ and $f(x_1) = f(x_2)$. Then either $f(x_1), f(x_2)$ are both even or they
are both odd. If they are both even, then $f(x_1) = f(x_2) \Rightarrow -2x_1 = -2x_2 \Rightarrow x_1 = x_2$.
Otherwise, $f(x_1), f(x_2)$ are both odd and $f(x_1) = f(x_2) \Rightarrow 2x_1 - 1 = 2x_2 - 1 \Rightarrow 2x_1 =
2x_2 \Rightarrow x_1 = x_2$. Consequently, the function $f$ is one-to-one.
To prove that $f$ is an onto function, let $n \in \mathbf{N}$. If $n$ is even, then $(-n/2) \in \mathbf{Z}$ and $(-n/2) \le 0$,
and $f(-n/2) = -2(-n/2) = n$. For the case where $n$ is odd we find that $(n + 1)/2 \in \mathbf{Z}$ and
$(n + 1)/2 > 0$, and $f((n + 1)/2) = 2[(n + 1)/2] - 1 = (n + 1) - 1 = n$. Hence $f$ is onto.
b) $f^{-1}: \mathbf{N} \to \mathbf{Z}$, where
$f^{-1}(x) = (1/2)(x + 1)$ for $x = 1, 3, 5, 7, \ldots$, and $f^{-1}(x) = -x/2$ for $x = 0, 2, 4, 6, \ldots$
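23. a) For all $n \in \mathbf{N}$, $(g \circ f)(n) = (h \circ f)(n) = (k \circ f)(n) = n$.
b) The results in part (a) do not contradict Theorem 5.7. For although
$g \circ f = h \circ f = k \circ f = 1_{\mathbf{N}}$, we note that
(i) $(f \circ g)(1) = f(\lfloor 1/3 \rfloor) = f(0) = 3 \cdot 0 = 0 \ne 1$, so $f \circ g \ne 1_{\mathbf{N}}$;
(ii) $(f \circ h)(1) = f(\lfloor 2/3 \rfloor) = f(0) = 3 \cdot 0 = 0 \ne 1$, so $f \circ h \ne 1_{\mathbf{N}}$; and
(iii) $(f \circ k)(1) = f(\lfloor 3/3 \rfloor) = f(1) = 3 \cdot 1 = 3 \ne 1$, so $f \circ k \ne 1_{\mathbf{N}}$.
Consequently, none of $g$, $h$, and $k$ is the inverse of $f$. (After all, since $f$ is not onto, it is not
invertible.)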
Section 5.7—p. 293
-a) feO(m) b) feO) oo) fed) dd feo’)
e) feO(n’) £f) fed’) g) fe Om)
3. a) For all $n \in \mathbf{Z}^+$, $0 \leq \log_2 n < n$. So let $k = 1$ and $m = 200$ in Definition 5.23. Then $|f(n)| = 100 \log_2 n = 200\left(\tfrac{1}{2} \log_2 n\right) < 200\left(\tfrac{1}{2} n\right) = 200|g(n)|$, so $f \in O(g)$.
b) For $n = 6$, $2^n = 64 < 3096 = 4096 - 1000 = 2^{12} - 1000 = 2^{2n} - 1000$. Assuming that $2^k < 2^{2k} - 1000$ for $n = k \geq 6$, we find that $2 < 2^2 \Rightarrow 2(2^k) < 2^2(2^{2k} - 1000) < 2^2 2^{2k} - 1000$, or $2^{k+1} < 2^{2(k+1)} - 1000$, so $f(n) < g(n)$ for all $n \geq 6$. Therefore, with $k = 6$ and $m = 1$ in Definition 5.23, we find that for $n \geq k$, $|f(n)| \leq m|g(n)|$ and $f \in O(g)$.
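5. To show that $f \in O(g)$, let $k = 1$ and $m = 4$ in Definition 5.23. Then for all $n \geq k$, $|f(n)| = n^2 + n \leq n^2 + n^2 = 2n^2 \leq 2n^3 = 4\left((1/2)n^3\right) = 4|g(n)|$, and $f$ is dominated by $g$. To show that $g \notin O(f)$, we follow the idea given in Example 5.66, namely, that
$$\forall m \in \mathbf{R}^+\ \forall k \in \mathbf{Z}^+\ \exists n \in \mathbf{Z}^+\ [(n \geq k) \wedge (|g(n)| > m|f(n)|)].$$
So no matter what the values of $m$ and $k$ are, choose $n > \max\{4m, k\}$. Then $|g(n)| = \left(\tfrac{1}{2}\right)n^3 > \left(\tfrac{1}{2}\right)(4m)n^2 = m(2n^2) \geq m(n^2 + n) = m|f(n)|$, so $g \notin O(f)$. Alternatively, if $g \in O(f)$, then $\exists m \in \mathbf{R}^+\ \exists k \in \mathbf{Z}^+\ \forall n \in \mathbf{Z}^+$ with $n \geq k$ we have $\left|\left(\tfrac{1}{2}\right)n^3\right| \leq m|n^2 + n|$, or $\left(\tfrac{1}{2}\right)n^2 \leq m(n + 1)$. Then $\frac{n^2}{2(n+1)} \leq m$, and since $n + 1 \leq 2n$ it follows that $\frac{n}{4} \leq \frac{n^2}{2(n+1)} \leq m$, a contradiction since $n$ is variable and $m$ constant.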
7. For all $n \geq 1$, $\log_2 n \leq n$, so with $k = 1$ and $m = 1$ in Definition 5.23, we have $|g(n)| = \log_2 n \leq n = m \cdot n = m|f(n)|$. Hence $g \in O(f)$. To show that $f \notin O(g)$, we first observe that $\lim_{n \to \infty} \frac{n}{\log_2 n} = +\infty$. (This can be established by using L'Hospital's Rule from the calculus.) Since $\lim_{n \to \infty} \frac{n}{\log_2 n} = +\infty$, we find that for every $m \in \mathbf{R}^+$ and $k \in \mathbf{Z}^+$, there is an $n \in \mathbf{Z}^+$ such that $\frac{n}{\log_2 n} > m$, or $|f(n)| = n > m \log_2 n = m|g(n)|$. Hence $f \notin O(g)$.
9. Since $f \in O(g)$, there exist $m \in \mathbf{R}^+$, $k \in \mathbf{Z}^+$ such that $|f(n)| \leq m|g(n)|$ for all $n \geq k$. But then $|f(n)| \leq [m/|c|]\,|cg(n)|$ for all $n \geq k$, so $f \in O(cg)$.
11. a) For all $n \geq 1$, $f(n) = 5n^2 + 3n > n^2 = g(n)$. So with $M = 1$ and $k = 1$, we have $|f(n)| \geq M|g(n)|$ for all $n \geq k$ and it follows that $f \in \Omega(g)$.
c) For all $n \geq 1$, $f(n) = 5n^2 + 3n > n = h(n)$. With $M = 1$ and $k = 1$, we have $|f(n)| \geq M|h(n)|$ for all $n \geq k$ and so $f \in \Omega(h)$.
d) Suppose that $h \in \Omega(f)$. If so, there exist $M \in \mathbf{R}^+$ and $k \in \mathbf{Z}^+$ with $n = |h(n)| \geq M|f(n)| = M(5n^2 + 3n)$ for all $n \geq k$. Then $0 < M \leq n/(5n^2 + 3n) = 1/(5n + 3)$. But how can $M$ be a positive constant while $1/(5n + 3)$ approaches 0 as $n$ (a variable) gets larger? From this contradiction it follows that $h \notin \Omega(f)$.
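13. a) For $n \geq 1$, $f(n) = \sum_{i=1}^{n} i = n(n + 1)/2 = (n^2/2) + (n/2) > (n^2/2)$. With $k = 1$ and $M = 1/2$, we have $|f(n)| \geq M|n^2|$ for all $n \geq k$. Hence $f \in \Omega(n^2)$.
b) $\sum_{i=1}^{n} i^2 = 1^2 + 2^2 + \cdots + n^2 \geq \lceil n/2 \rceil^2 + \cdots + n^2 \geq \lceil n/2 \rceil^2 + \cdots + \lceil n/2 \rceil^2 = \lceil (n + 1)/2 \rceil \lceil n/2 \rceil^2 > n^3/8$. With $k = 1$ and $M = 1/8$, we have $|g(n)| \geq M|n^3|$ for all $n \geq k$. Hence $g \in \Omega(n^3)$.
Alternatively, for $n \geq 1$, $g(n) = \sum_{i=1}^{n} i^2 = n(n + 1)(2n + 1)/6 = (2n^3 + 3n^2 + n)/6 > n^3/6$. With $k = 1$ and $M = 1/6$, we find that $|g(n)| \geq M|n^3|$ for all $n \geq k$, so $g \in \Omega(n^3)$.
c) $\sum_{i=1}^{n} i^t = 1^t + 2^t + \cdots + n^t \geq \lceil n/2 \rceil^t + \cdots + n^t \geq \lceil n/2 \rceil^t + \cdots + \lceil n/2 \rceil^t = \lceil (n + 1)/2 \rceil \lceil n/2 \rceil^t > (n/2)^{t+1}$. With $k = 1$ and $M = (1/2)^{t+1}$, we have $|h(n)| \geq M|n^{t+1}|$ for all $n \geq k$. Hence $h \in \Omega(n^{t+1})$.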
15. Proof: $f \in \Theta(g) \Rightarrow f \in \Omega(g)$ and $f \in O(g)$ (from Exercise 14 of this section) $\Rightarrow g \in O(f)$ and $g \in \Omega(f)$ (from Exercise 12 of this section) $\Rightarrow g \in \Theta(f)$.
Section 5.8—p. 300
-a) feO(n’) b) feOn’) c) fe Om) d) f € O(log, n)
e) $f \in O(n \log_2 n)$
. a) Here there are five additions and 10 multiplications.
b) For the general case there are n additions and 2n multiplications.
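. For $n = 1$, we find that $a_1 = 0 = \lfloor 0 \rfloor = \lfloor \log_2 1 \rfloor$, so the result is true in this first case. Now assume the result true for all $n = 1, 2, 3, \ldots, k$, where $k \geq 1$, and consider the cases for $n = k + 1$.
(i) $n = k + 1 = 2^m$, where $m \in \mathbf{Z}^+$: Here $a_n = 1 + a_{\lfloor n/2 \rfloor} = 1 + a_{2^{m-1}} = 1 + \lfloor \log_2 2^{m-1} \rfloor = 1 + (m - 1) = m = \lfloor \log_2 2^m \rfloor = \lfloor \log_2 n \rfloor$; and
(ii) $n = k + 1 = 2^m + r$, where $m \in \mathbf{Z}^+$ and $0 < r < 2^m$: Here $2^m < n < 2^{m+1}$, so we have
(1) $2^{m-1} < (n/2) < 2^m$;
(2) $2^{m-1} = \lfloor 2^{m-1} \rfloor \leq \lfloor n/2 \rfloor < \lfloor 2^m \rfloor = 2^m$; and
(3) $m - 1 = \log_2 2^{m-1} \leq \log_2 \lfloor n/2 \rfloor < \log_2 2^m = m$.
Consequently, $\lfloor \log_2 \lfloor n/2 \rfloor \rfloor = m - 1$ and $a_n = 1 + a_{\lfloor n/2 \rfloor} = 1 + \lfloor \log_2 \lfloor n/2 \rfloor \rfloor = 1 + (m - 1) = m = \lfloor \log_2 n \rfloor$. Therefore it follows from the alternative form of the Principle of Mathematical Induction that $a_n = \lfloor \log_2 n \rfloor$ for all $n \in \mathbf{Z}^+$.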
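The induction above can be sanity-checked by evaluating the recurrence directly. This is a minimal sketch, assuming the recurrence used in the proof is $a_1 = 0$ and $a_n = 1 + a_{\lfloor n/2 \rfloor}$ for $n \geq 2$; the function names and the tested range are illustrative.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def a(n: int) -> int:
    """Assumed recurrence: a_1 = 0, a_n = 1 + a_{floor(n/2)} for n >= 2."""
    if n == 1:
        return 0
    return 1 + a(n // 2)


def floor_log2(n: int) -> int:
    """floor(log2(n)) for a positive integer, computed without floating point."""
    return n.bit_length() - 1


# Compare the recurrence with the closed form on an initial range of n.
matches = all(a(n) == floor_log2(n) for n in range(1, 5000))
```

Using `bit_length` avoids the rounding issues that `math.log2` can show near exact powers of 2.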
» (5/8)n + (3/8)
11. a) procedure LocateRepeat (n: positive integer;
                              a_1, a_2, a_3, ..., a_n: integers)
       begin
         location := 0
         i := 2
         while i <= n and location = 0 do
           begin
             j := 1
             while j < i and location = 0 do
               if a_j = a_i then location := i
               else j := j + 1
             i := i + 1
           end
       end {location is the subscript of the first array entry that
            repeats a previous array entry; location is 0 if the array
            contains n distinct integers.}
b) $O(n^2)$
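For readers who want to run the procedure, here is one possible Python rendering of the pseudocode. It is a sketch rather than the book's code: the name `locate_repeat` is ours, the input is a Python list, and the returned subscript is kept 1-based to match the pseudocode.

```python
def locate_repeat(a: list[int]) -> int:
    """Return the 1-based subscript of the first entry that repeats an
    earlier entry, or 0 if all entries are distinct."""
    n = len(a)
    for i in range(2, n + 1):        # i plays the role of the outer index
        for j in range(1, i):        # compare a_i with a_1, ..., a_{i-1}
            if a[j - 1] == a[i - 1]:
                return i
    return 0
```

In the worst case (all entries distinct) the inner comparison runs $1 + 2 + \cdots + (n - 1) = n(n - 1)/2$ times, which is where the $O(n^2)$ bound in part (b) comes from.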
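Supplementary Exercises—p. 305
1. a) If either $A$ or $B$ is $\emptyset$, then $A \times B = \emptyset = A \cap B$ and the result is true. For $A$, $B$ nonempty we find that:
$(x, y) \in (A \times B) \cap (B \times A) \Rightarrow (x, y) \in A \times B$ and $(x, y) \in B \times A \Rightarrow (x \in A$ and $y \in B)$ and $(x \in B$ and $y \in A) \Rightarrow x \in A \cap B$ and $y \in A \cap B \Rightarrow (x, y) \in (A \cap B) \times (A \cap B)$; and
$(x, y) \in (A \cap B) \times (A \cap B) \Rightarrow (x \in A$ and $x \in B)$ and $(y \in A$ and $y \in B) \Rightarrow (x, y) \in A \times B$ and $(x, y) \in B \times A \Rightarrow (x, y) \in (A \times B) \cap (B \times A)$.
Consequently, $(A \times B) \cap (B \times A) = (A \cap B) \times (A \cap B)$.
b) If either $A$ or $B$ is $\emptyset$, then $A \times B = \emptyset = B \times A$ and the result follows. If not, let $(x, y) \in (A \times B) \cup (B \times A)$. Then
$(x, y) \in (A \times B) \cup (B \times A) \Rightarrow (x, y) \in A \times B$ or $(x, y) \in B \times A \Rightarrow (x \in A$ and $y \in B)$ or $(x \in B$ and $y \in A) \Rightarrow (x \in A$ or $x \in B)$ and $(y \in A$ or $y \in B) \Rightarrow x, y \in A \cup B \Rightarrow (x, y) \in (A \cup B) \times (A \cup B)$.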
3. a) $f(1) = f(1 \cdot 1) = 1 \cdot f(1) + 1 \cdot f(1)$, so $f(1) = 0$. b) $f(0) = 0$
c) Proof (by Mathematical Induction): When $a = 0$ the result is true, so consider $a \neq 0$. For $n = 1$, $f(a^n) = f(a) = 1 \cdot a^0 \cdot f(a) = na^{n-1} f(a)$, so the result follows in this first case, and this establishes our basis step. Assume the result true for $n = k$ $(\geq 1)$, that is, $f(a^k) = ka^{k-1} f(a)$. For $n = k + 1$ we have $f(a^{k+1}) = f(a \cdot a^k) = af(a^k) + a^k f(a) = aka^{k-1} f(a) + a^k f(a) = ka^k f(a) + a^k f(a) = (k + 1)a^k f(a)$. Consequently, the truth of the result for $n = k + 1$ follows from the truth of the result for $n = k$. So by the Principle of Mathematical Induction the result is true for all $n \in \mathbf{Z}^+$.
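5. $(x, y) \in (A \cap B) \times (C \cap D) \Leftrightarrow x \in A \cap B,\ y \in C \cap D \Leftrightarrow (x \in A,\ y \in C)$ and $(x \in B,\ y \in D) \Leftrightarrow (x, y) \in A \times C$ and $(x, y) \in B \times D \Leftrightarrow (x, y) \in (A \times C) \cap (B \times D)$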
.x=1//2andx = /3/2
9. b) Conjecture: For $n \in \mathbf{Z}^+$, $f^n(x) = a^n(x + b) - b$. Proof (by Mathematical Induction): The formula is true for $n = 1$, by the definition of $f(x)$. Hence we have our basis step. Assume the formula true for $n = k$ $(\geq 1)$, that is, $f^k(x) = a^k(x + b) - b$. Now consider $n = k + 1$. We find that $f^{k+1}(x) = f(f^k(x)) = f(a^k(x + b) - b) = a[(a^k(x + b) - b) + b] - b = a^{k+1}(x + b) - b$. Since the truth of the formula at $n = k$ implies the truth of the formula at $n = k + 1$, it follows that the formula is valid for all $n \in \mathbf{Z}^+$, by the Principle of Mathematical Induction.
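The conjectured closed form can be compared with direct iteration. The sketch below assumes, as the induction step indicates, that $f(x) = a(x + b) - b$; the integer test ranges are arbitrary and chosen only so that all arithmetic is exact.

```python
def f(x: int, a: int, b: int) -> int:
    # Assumed definition, read off from the induction step: f(x) = a(x + b) - b.
    return a * (x + b) - b


def iterate(x: int, a: int, b: int, n: int) -> int:
    """Apply f to x a total of n times."""
    for _ in range(n):
        x = f(x, a, b)
    return x


def closed_form(x: int, a: int, b: int, n: int) -> int:
    """Conjectured formula: f^n(x) = a^n (x + b) - b."""
    return a**n * (x + b) - b


agree = all(
    iterate(x, a, b, n) == closed_form(x, a, b, n)
    for a in range(-3, 4)
    for b in range(-3, 4)
    for x in range(-3, 4)
    for n in range(1, 7)
)
```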
11. a) (7!)/[2(7°)]
13. For $1 \leq i \leq 10$, let $x_i$ be the number of letters typed on day $i$. Then $x_1 + x_2 + x_3 + \cdots + x_8 + x_9 + x_{10} = 84$, or $x_3 + \cdots + x_8 = 54$. Suppose that $x_1 + x_2 + x_3 < 25$, $x_2 + x_3 + x_4 < 25$, $\ldots$, $x_8 + x_9 + x_{10} < 25$. Then $x_1 + 2x_2 + 3(x_3 + \cdots + x_8) + 2x_9 + x_{10} < 8(25) = 200$, or $3(x_3 + \cdots + x_8) < 160$. Consequently, we obtain the contradiction $54 = x_3 + \cdots + x_8 < \frac{160}{3} = 53\tfrac{1}{3}$.
15. For $\prod_{k=1}^{n} (k - i_k)$ to be odd, $(k - i_k)$ must be odd for all $1 \leq k \leq n$; that is, one of $k$, $i_k$ must be even and the other odd. Since $n$ is odd, $n = 2m + 1$ and in the list $1, 2, \ldots, n$ there are $m$ even integers and $m + 1$ odd integers. Let $1, 3, 5, \ldots, n$ be the pigeons and $i_1, i_3, i_5, \ldots, i_n$ the pigeonholes. At most $m$ of the pigeonholes can be even integers, so $(k - i_k)$ must be even for at least one $k = 1, 3, 5, \ldots, n$. Consequently, $\prod_{k=1}^{n} (k - i_k)$ is even.
17. Let the $n$ distinct objects be $x_1, x_2, \ldots, x_n$. Place $x_n$ in a container. Now there are two distinct containers. For each of $x_1, x_2, \ldots, x_{n-1}$ there are two choices, and this gives $2^{n-1}$ distributions. Among these there is one where $x_1, x_2, \ldots, x_{n-1}$ are in the container with $x_n$, so we remove this distribution and find $S(n, 2) = 2^{n-1} - 1$.
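The formula $S(n, 2) = 2^{n-1} - 1$ can be checked against the standard recurrence for Stirling numbers of the second kind, $S(n, k) = S(n-1, k-1) + k\,S(n-1, k)$. The sketch below is an independent check, not the argument used in the solution; the function name `stirling2` is ours.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n: int, k: int) -> int:
    """Stirling number of the second kind via the standard recurrence."""
    if n == k:
        return 1          # includes S(0, 0) = 1
    if k == 0 or k > n:
        return 0
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)


formula_holds = all(stirling2(n, 2) == 2 ** (n - 1) - 1 for n in range(2, 30))
```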
19. a) and b) m!S(n, m)
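21. Fix $m = 1$. For $n = 1$ the result is true. Assume $f \circ f^n = f^n \circ f$ and consider $f \circ f^{n+1}$. We find that $f \circ f^{n+1} = f \circ (f \circ f^n) = f \circ (f^n \circ f) = (f \circ f^n) \circ f = f^{n+1} \circ f$. Hence $f \circ f^n = f^n \circ f$ for all $n \in \mathbf{Z}^+$. Now assume that for some $t \geq 1$, $f^t \circ f^n = f^n \circ f^t$. Then $f^{t+1} \circ f^n = (f \circ f^t) \circ f^n = f \circ (f^t \circ f^n) = f \circ (f^n \circ f^t) = (f \circ f^n) \circ f^t = (f^n \circ f) \circ f^t = f^n \circ (f \circ f^t) = f^n \circ f^{t+1}$, so $f^m \circ f^n = f^n \circ f^m$ for all $m, n \in \mathbf{Z}^+$.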
23. Proof: Leta € A. Then f(a) = g(f(f(a@))) = f(g f(fF@)))) = f(g 0 f°@). From
f(a) = g(f(f @)) we have f?(a) = (f o f(a) = f(g(f(f(a)))). So fla) =
f(go Pla) = fief ff@)) = 2?E@ = PeP?@) = fF FF@))) =
f(g(f(a))) = g(a). Consequently, f = g.
25. a) Note that $2 = 2^1$, $16 = 2^4$, $128 = 2^7$, $1024 = 2^{10}$, $8192 = 2^{13}$, and $65536 = 2^{16}$. Consider the exponents on 2. If four numbers are selected from $\{1, 4, 7, 10, 13, 16\}$, there is at least one pair whose sum is 17. Hence if four numbers are selected from $S$, there are two numbers whose product is $2^{17} = 131072$.
b) Leta, b,c,d,n€Z*. Let S = {b*, b*4, bo". be) TE [$] + 1 numbers are
selected from S then there are at least two of them whose product is b?"+"4,
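The pigeonhole claim in part (a) is small enough to verify exhaustively. The sketch below checks every four-element selection from the exponents $\{1, 4, 7, 10, 13, 16\}$ and also lists the three-element selections that avoid a pair summing to 17, which shows that four is the smallest selection size that forces such a pair. The helper name is ours.

```python
from itertools import combinations

exponents = [1, 4, 7, 10, 13, 16]   # exponents of 2 for the elements of S


def has_pair_with_sum(selection, target: int) -> bool:
    """True if some two distinct members of the selection add up to target."""
    return any(x + y == target for x, y in combinations(selection, 2))


# Every choice of four exponents contains a pair summing to 17.
every_four_ok = all(has_pair_with_sum(s, 17) for s in combinations(exponents, 4))

# Three exponents are not enough: pick one from each of {1,16}, {4,13}, {7,10}.
three_without_pair = [s for s in combinations(exponents, 3)
                      if not has_pair_with_sum(s, 17)]
```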
27. fog ={a,2). 0,9). @ Oh go f ={@.4).0.0, @ Wh £1 = {@, 2), 0%), (YD:
g!={,y), (0). DE GOA! ={e.4), 0.2). & DES flog:
g lof l= {x z). (yy), @, x).
29. $2^3 \cdot 2^2 \cdot 3^5 = 7776$ functions
31. a) (Toa)(x)=(aon)\(x)=x Db) a"(xX) =x -—ni ao" (x) =x4+n (n>2)
ec) wm "(x)H=xtnio "(x)=x—n (n>2)
33. a) S(8,4) ~~ —b) S(n, m)
35. a) Let $m = 1$ and $k = 1$. Then for all $n \geq k$, $|f(n)| \leq 2 < 3 \leq |g(n)| = m|g(n)|$, so $f \in O(g)$.
37. First note that if $\log_a n = r$, then $n = a^r$ and $\log_b n = \log_b(a^r) = r \log_b a = (\log_b a)(\log_a n)$. Now let $m = (\log_b a)$ and $k = 1$. Then for all $n \geq k$, $|g(n)| = \log_b n = (\log_b a)(\log_a n) = m|f(n)|$, so $g \in O(f)$. Finally, with $m = (\log_b a)^{-1} = \log_a b$ and $k = 1$, we find that for all $n \geq k$, $|f(n)| = \log_a n = (\log_a b)(\log_b n) = m|g(n)|$. Hence $f \in O(g)$.
Chapter 6
Languages: Finite State Machines
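Section 6.1—p. 317
1. a) 25; 125 b) 3906
3. 12
5. 780
7. a) $\{00, 11, 000, 111, 0000, 1111\}$ b) $\{0, 1\}$
c) $\Sigma^* - \{\lambda, 00, 11, 000, 111, 0000, 1111\}$ d) $\{0, 1, 00, 11\}$
e) $\Sigma^*$ f) $\Sigma^* - \{0, 1, 00, 11\} = \{\lambda, 01, 10\} \cup \{w \mid \|w\| \geq 3\}$
9. a) $x \in AC \Rightarrow x = ac$, for some $a \in A$, $c \in C \Rightarrow x \in BD$, since $A \subseteq B$, $C \subseteq D$.
b) If $A\emptyset \neq \emptyset$, let $x \in A\emptyset$. $x \in A\emptyset \Rightarrow x = yz$, for some $y \in A$, $z \in \emptyset$. But $z \in \emptyset$ is impossible. Hence $A\emptyset = \emptyset$. [In like manner, $\emptyset A = \emptyset$.]
11. For any alphabet $\Sigma$, let $B \subseteq \Sigma$. Then, if $A = B^*$, it follows from part (f) of Theorem 6.2 that $A^* = (B^*)^* = B^* = A$.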
13. a) Here $A^*$ consists of all strings $x$ of even length where if $x \neq \lambda$, then $x$ starts with 0 and ends with 1, and the symbols (0 and 1) alternate.
b) In this case $A^*$ contains precisely those strings made up of $3n$ 0's, for $n \in \mathbf{N}$.
c) Here a string $x \in A^*$ if (and only if)
(i) $x$ is a string of $n$ 0's, for $n \in \mathbf{N}$; or
(ii) $x$ is a string that starts and ends with 0, and has at least one 1 and at least two 0's between any two 1's.
15. Let $\Sigma$ be an alphabet with $\emptyset \neq A \subseteq \Sigma^*$. If $|A| = 1$ and $x \in A$, then $xx = x$ since $A^2 = A$. But $\|xx\| = 2\|x\| = \|x\| \Rightarrow \|x\| = 0 \Rightarrow x = \lambda$. If $|A| > 1$, let $x \in A$ where $\|x\| > 0$ but $\|x\|$ is minimal. Then $x \in A^2 \Rightarrow x = yz$, for $y, z \in A$. Since $\|x\| = \|y\| + \|z\|$, if $\|y\|, \|z\| > 0$, then one of $y$, $z$ is in $A$ with length smaller than $\|x\|$. Consequently, one of $\|y\|$ or $\|z\|$ is 0, so $\lambda \in A$.
17. If $A = A^2$, then it follows by the Principle of Mathematical Induction that $A = A^n$ for all $n \in \mathbf{Z}^+$. Hence $A = A^+$. By Exercise 15, $A = A^2 \Rightarrow \lambda \in A$. Hence $A = A^*$.
19. By Definition 6.11, $AB = \{ab \mid a \in A, b \in B\}$, and since it is possible to have $a_1 b_1 = a_2 b_2$ with $a_1, a_2 \in A$, $a_1 \neq a_2$, and $b_1, b_2 \in B$, $b_1 \neq b_2$, it follows that $|AB| \leq |A \times B| = |A||B|$.
21. a) The words 001 and 011 have length 3 and are in $A$. The words 00011 and 00111 have length 5 and they are also in $A$.
b) From step (1) we know that $1 \in A$. Then by applying step (2) three times we get
(i) $1 \in A \Rightarrow 011 \in A$;
(ii) $011 \in A \Rightarrow 00111 \in A$; and
(iii) $00111 \in A \Rightarrow 0001111 \in A$.
c) If 00001111 were in $A$, then from step (2) we see that this word would have to be generated from 000111 (in $A$). Likewise, 000111 in $A \Rightarrow 0011$ is in $A \Rightarrow 01$ is in $A$. However, there are no words in $A$ of length 2; in fact, there are no words of even length in $A$.
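The chain of derivations in part (b) and the parity argument in part (c) can be mimicked with a few lines of code. This sketch covers only the derivations that part (b) uses, namely starting from the word 1 and repeatedly wrapping a word $x$ as $0x1$; the full recursive definition in the exercise may allow other steps, which are not modeled here. The function name is ours.

```python
def wrapped_words(max_len: int) -> list[str]:
    """Words obtained from "1" by repeatedly applying x -> 0x1,
    up to the given maximum length."""
    words = []
    word = "1"
    while len(word) <= max_len:
        words.append(word)
        word = "0" + word + "1"   # each step adds two symbols
    return words


chain = wrapped_words(8)
# Each wrapping step adds two symbols, so lengths stay odd.
all_odd_length = all(len(w) % 2 == 1 for w in chain)
```

Because every step of this kind adds exactly two symbols to a word of odd length, no word of even length, such as 00001111, can arise this way.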
23. a) Steps                      Reasons
    1) () is in A.             Part (1) of the recursive definition
    2) (()) is in A.           Step (1) and part (2-ii) of the definition
    3) (())() is in A.         Steps (1) and (2) and part (2-i) of the definition
b) Steps                      Reasons
    1) () is in A.             Part (1) of the recursive definition
    2) (()) is in A.           Step (1) and part (2-ii) of the definition
    3) (())() is in A.         Steps (1) and (2) and part (2-i) of the definition
    4) (())()() is in A.       Steps (1) and (3) and part (2-i) of the definition
25. Length 3: $\binom{3}{0} + \binom{2}{1} = 3$    Length 4: $\binom{4}{0} + \binom{3}{1} + \binom{2}{2} = 5$
Length 5: $\binom{5}{0} + \binom{4}{1} + \binom{3}{2} = 8$    Length 6: $\binom{6}{0} + \binom{5}{1} + \binom{4}{2} + \binom{3}{3} = 13$ [Here the summand $\binom{6}{0}$ counts the strings where there are no 0's; the summand $\binom{5}{1}$ counts the strings where we arrange the symbols 1, 1, 1, 1, 00; the summand $\binom{4}{2}$ is for the arrangements of 1, 1, 00, 00; and the summand $\binom{3}{3}$ counts the arrangements of 00, 00, 00.]
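The binomial sums above can be cross-checked by listing the strings. The sketch assumes, as the bracketed explanation indicates, that the strings being counted are exactly those built by concatenating the blocks 1 and 00; the function names are illustrative.

```python
from math import comb


def count_by_sum(n: int) -> int:
    """Sum over j of C(n - j, j): choose positions for j blocks '00'
    among n - j blocks in total (j blocks '00' and n - 2j blocks '1')."""
    return sum(comb(n - j, j) for j in range(n // 2 + 1))


def strings_from_blocks(n: int) -> set[str]:
    """All strings of length n obtained by concatenating blocks '1' and '00'."""
    if n < 0:
        return set()
    if n == 0:
        return {""}
    return ({"1" + s for s in strings_from_blocks(n - 1)}
            | {"00" + s for s in strings_from_blocks(n - 2)})


counts = [count_by_sum(n) for n in range(3, 7)]
enumerated = [len(strings_from_blocks(n)) for n in range(3, 7)]
```

The recursive enumeration also makes the Fibonacci-style pattern visible: a string of length $n$ starts either with 1 (leaving length $n - 1$) or with 00 (leaving length $n - 2$).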
27. $A$: (1) $\lambda \in A$
(2) If $a \in A$, then $0a0, 0a1, 1a0, 1a1 \in A$.
$B$: (1) $0, 1 \in B$.
(2) If $a \in B$, then $0a0, 0a1, 1a0, 1a1 \in B$.
Section 6.2—p. 324
- a) OO1O10OL; 5; ~b) 9000000; 5; ~—c)_- 001000000; so
- a) 010110 ib)
5. a) 010000; s, b) (s;) 100000; s2 c)
(s>) 000000; s » °
(s3) 110010; s2 0 1/0 1
SO SO S| 0 0
Ss, | Ss) Sz} 11
$2 | So S210 O
S3 SO S3 0 |
$4 | 82 83 10 1
d) s; e) x = 101 (unique)
7, a) (i) 15 (ii) 349 (iii) 2!> bb) 6
9. a) y o
0 1/0 #1
So | Sg Ss; | O O
Ss, | 83 Sp | 0 O
S2 53 S52 0 ]
8 | 8 8 | 90 O
$4 | 85 s3|0 O
SS S583 1 0
b) There are only two possibilities: x = 1111 or x = 0000.
c) A= {111}{1}* U {000} {0}*
d) Here A = {11111}{1}* U {00000}{0}*.
Section 6.3-—p. 332
[State diagrams not reproduced.]
5. b) (i) 011 (ii) 0101 (iii) 00001
c) The machine outputs a 0 followed by the first $n - 1$ symbols of the $n$-symbol input string $x$. Hence the machine is a unit delay.
d) This machine performs the same tasks as the one in Fig. 6.13 (but has only two states).
7. a) The transient states are $s_0$, $s_1$. State $s_4$ is a sink state. $\{s_1, s_2, s_3, s_4, s_5\}$, $\{s_4\}$, and $\{s_2, s_3, s_5\}$ (with the corresponding restrictions on the given function $\nu$) constitute submachines. The strongly connected submachines are $\{s_4\}$ and $\{s_2, s_3, s_5\}$.
b) States $s_2$, $s_3$ are transient. The only sink state is $s_4$. The set $\{s_0, s_1, s_3, s_4\}$ provides the states for a submachine; $\{s_0, s_1\}$ and $\{s_4\}$ provide strongly connected submachines.
Supplementary
Exercises—p. 334
1. a) True b) False c) True d) True e) True f) True
3. Let $x \in \Sigma$ and $A = \{x\}$. Then $A^2 = \{xx\}$ and $(A^2)^* = \{\lambda, x^2, x^4, \ldots\}$. $A^* = \{\lambda, x, x^2, x^3, \ldots\}$ and $(A^*)^2 = A^*$, so $(A^2)^* \neq (A^*)^2$.
- Oo2 = {1, OO}{O} — On2 = {O}{1, OO}"{O} On = B
Coo = {1, OO}* — {A} Cio = {1}{1, OO}* U {10}{1, 0O}*
. a) By the pigeonhole principle there is a first state s that is encountered twice. Let y be the
output string that resulted since s was first encountered, until we reach this state a second time.
Then from that point on the output is yyy....
b) [State diagram not reproduced.]
11. a)
yp @
0 | QO |
(so, 53) | (80, $4) (81, 53) | 1 1
(So, $4) | (So, 83) (Sty 54) | OI
(51, 53) | (51, 53) (82, 53) | 11
(s}, $4) | (8), 84) (So, S4) | 11
(s2, 93) | (82, 53) (So, 54) | 11
(s2, $4) | (52, 54) (So, 53) | 1 O
b) $\omega((s_0, s_3), 1101) = 1111$; $M_1$ is in state $s_0$, and $M_2$ is in state $s_4$.
Chapter 7
Relations: The Second Time Around
Section 7.1 —p. 343
» a) {U, 1), @, 2), GB. 3), 4.4), CL, 2), 2. 1), @, 3), GB. 2)}
b) {, 1), 2. 2). G, 3), 44.0, 2)} oo) 1, 1, @, 2), 1, 2), 2. D)
. a) Let fi. f. Re F with fi(n) =n +1, fo(n) = Sn, and f3(n) = 4n 4 1/n.
b) Let 21, 22, g3 © & with g,(n) = 3, g2(n) = 1/n, and g3(n) = sina.
. a) Reflexive, antisymmetric, transitive _b) Transitive
c) Reflexive, symmetric, transitive d) Symmetric e) Symmetric
f) Reflexive, symmetric, transitive g) Reflexive, symmetric _h) Reflexive, transitive
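7. a) For all $x \in A$, $(x, x) \in \mathcal{R}_1, \mathcal{R}_2$, so $(x, x) \in \mathcal{R}_1 \cap \mathcal{R}_2$ and $\mathcal{R}_1 \cap \mathcal{R}_2$ is reflexive.
b) (i) $(x, y) \in \mathcal{R}_1 \cap \mathcal{R}_2 \Rightarrow (x, y) \in \mathcal{R}_1, \mathcal{R}_2 \Rightarrow (y, x) \in \mathcal{R}_1, \mathcal{R}_2 \Rightarrow (y, x) \in \mathcal{R}_1 \cap \mathcal{R}_2$ and $\mathcal{R}_1 \cap \mathcal{R}_2$ is symmetric.
(ii) $(x, y), (y, x) \in \mathcal{R}_1 \cap \mathcal{R}_2 \Rightarrow (x, y), (y, x) \in \mathcal{R}_1, \mathcal{R}_2$. By the antisymmetry of $\mathcal{R}_1$ (or $\mathcal{R}_2$), $x = y$ and $\mathcal{R}_1 \cap \mathcal{R}_2$ is antisymmetric.
(iii) $(x, y), (y, z) \in \mathcal{R}_1 \cap \mathcal{R}_2 \Rightarrow (x, y), (y, z) \in \mathcal{R}_1, \mathcal{R}_2 \Rightarrow (x, z) \in \mathcal{R}_1, \mathcal{R}_2$ (transitive property) $\Rightarrow (x, z) \in \mathcal{R}_1 \cap \mathcal{R}_2$, so $\mathcal{R}_1 \cap \mathcal{R}_2$ is transitive.
9. a) False: let $A = \{1, 2\}$ and $\mathcal{R} = \{(1, 2), (2, 1)\}$.
b) (i) Reflexive: true
(ii) Symmetric: false. Let $A = \{1, 2\}$, $\mathcal{R}_1 = \{(1, 1)\}$, and $\mathcal{R}_2 = \{(1, 1), (1, 2)\}$.
(iii) Antisymmetric and transitive: false. Let $A = \{1, 2\}$, $\mathcal{R}_1 = \{(1, 2)\}$, and $\mathcal{R}_2 = \{(1, 2), (2, 1)\}$.
d) True.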
11. ad) 60CNET
ee) 81
\=QQ=9
=f) 972
vw Is 9 (PE)
= OQ =30
13. There may exist an element $a \in A$ such that for all $b \in A$, neither $(a, b)$ nor $(b, a)$ is in $\mathcal{R}$.
15. $r - n$ counts the elements in $\mathcal{R}$ of the form $(a, b)$, $a \neq b$. Since $\mathcal{R}$ is symmetric, $r - n$ is even.
17. a OO)+QG)+ 0G) » OG)+ 0G) + 0G)
d) (7) + OG) + OG) + OE)
Section 7.2—p. 354
~-RoFf ={0, 3), 1, H)}: SoR = {C1, 2), C1, 3), C1, 4), (2, 9}:
R2 = RK = (1, 4), (2,4), (4.9): 7 = F = {C, 1), C1, 2), C1, 3), C1, 9}
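3. $(a, d) \in (\mathcal{R}_1 \circ \mathcal{R}_2) \circ \mathcal{R}_3 \Rightarrow (a, c) \in \mathcal{R}_1 \circ \mathcal{R}_2$, $(c, d) \in \mathcal{R}_3$ for some $c \in C \Rightarrow (a, b) \in \mathcal{R}_1$, $(b, c) \in \mathcal{R}_2$, $(c, d) \in \mathcal{R}_3$ for some $b \in B$, $c \in C \Rightarrow (a, b) \in \mathcal{R}_1$, $(b, d) \in \mathcal{R}_2 \circ \mathcal{R}_3 \Rightarrow (a, d) \in \mathcal{R}_1 \circ (\mathcal{R}_2 \circ \mathcal{R}_3)$, and $(\mathcal{R}_1 \circ \mathcal{R}_2) \circ \mathcal{R}_3 \subseteq \mathcal{R}_1 \circ (\mathcal{R}_2 \circ \mathcal{R}_3)$.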
~ KR, o (Ry2NR3) = Ry o {(m, 3), (m, H} = (1, 3), A, H}
(Ay o Rr) 1 Hy o Ra) = {1, 3), . HELL, 3), C1, 4} = tC, 3), C1, 49}
7. This follows by the pigeonhole principle. Here the pigeons are the $2^{n^2} + 1$ integers between 0 and $2^{n^2}$, inclusive, and the pigeonholes are the $2^{n^2}$ relations on $A$.
221
11. Consider the entry in the $i$th row and $j$th column of $M(\mathcal{R}_1 \circ \mathcal{R}_2)$. If this entry is a 1, then there exists $b_k \in B$ where $1 \leq k \leq n$ and $(a_i, b_k) \in \mathcal{R}_1$, $(b_k, c_j) \in \mathcal{R}_2$. Consequently, the entry in the $i$th row and $k$th column of $M(\mathcal{R}_1)$ is 1, and the entry in the $k$th row and $j$th column of $M(\mathcal{R}_2)$ is 1. This results in a 1 in the $i$th row and $j$th column in the product $M(\mathcal{R}_1) \cdot M(\mathcal{R}_2)$.
Should the entry in row $i$ and column $j$ of $M(\mathcal{R}_1 \circ \mathcal{R}_2)$ be 0, then for each $b_k$, where $1 \leq k \leq n$, either $(a_i, b_k) \notin \mathcal{R}_1$ or $(b_k, c_j) \notin \mathcal{R}_2$. This means that in the matrices $M(\mathcal{R}_1)$, $M(\mathcal{R}_2)$, if the entry in the $i$th row and $k$th column of $M(\mathcal{R}_1)$ is 1, then the entry in the $k$th row and $j$th column of $M(\mathcal{R}_2)$ is 0. Hence the entry in the $i$th row and $j$th column of $M(\mathcal{R}_1) \cdot M(\mathcal{R}_2)$ is 0.
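The identity $M(\mathcal{R}_1 \circ \mathcal{R}_2) = M(\mathcal{R}_1) \cdot M(\mathcal{R}_2)$, with the product taken using Boolean addition and multiplication, can be illustrated on a small example. The sets and relations below are hypothetical and chosen only for illustration; composition follows the convention used in the argument above, where $(a, c) \in \mathcal{R}_1 \circ \mathcal{R}_2$ when $(a, b) \in \mathcal{R}_1$ and $(b, c) \in \mathcal{R}_2$ for some $b$.

```python
def compose(r1: set, r2: set) -> set:
    """Pairs (a, c) with (a, b) in r1 and (b, c) in r2 for some b."""
    return {(a, c) for (a, b) in r1 for (b2, c) in r2 if b == b2}


def relation_matrix(rel: set, rows: list, cols: list) -> list[list[int]]:
    """0-1 matrix with a 1 in position (r, c) exactly when (r, c) is in rel."""
    return [[1 if (r, c) in rel else 0 for c in cols] for r in rows]


def boolean_product(m1: list[list[int]], m2: list[list[int]]) -> list[list[int]]:
    """Matrix product where + is 'or' and * is 'and'."""
    inner = len(m2)
    return [[1 if any(m1[i][k] and m2[k][j] for k in range(inner)) else 0
             for j in range(len(m2[0]))]
            for i in range(len(m1))]


# Hypothetical example data.
A, B, C = [1, 2, 3], ["x", "y"], ["p", "q", "r"]
R1 = {(1, "x"), (2, "x"), (2, "y")}
R2 = {("x", "p"), ("y", "r")}

lhs = relation_matrix(compose(R1, R2), A, C)
rhs = boolean_product(relation_matrix(R1, A, B), relation_matrix(R2, B, C))
```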
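13. d) Let $s_{xy}$ be the entry in row $(x)$ and column $(y)$ of $M$. Then $s_{yx}$ appears in row $(x)$ and column $(y)$ of $M^{\mathrm{tr}}$. $\mathcal{R}$ is antisymmetric $\Leftrightarrow (s_{xy} = s_{yx} = 1 \Rightarrow x = y) \Leftrightarrow M \cap M^{\mathrm{tr}} \leq I_n$.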
a > d <_.2 f 5 x
¥ y
b . . ¢ Vv y
Ys ;
cf (a) | w 2 {b)
17, (i) R= {(a, b), (b, a), (a, &), (@, a), (b,c), (c, b), (b, a), (d, b). (b, e). (e, »), (d, e),
(e. 4), d, f), (f, ad}:
(a) (6) (ec) (@) fe) Cf)
@foOo 1 0 0 1 0
(| 1 oO 1 1 1 0
MA=()|0 1 0 6 0 0
d)|/ Oo 1 OO 68 1 1
ey} 1 1 0 1 +0 0
(f)| 0 0 O 1 60 90
For part (ii) the rows and columns of the relation matrix are indexed as in part (i).
(il) R= {(a, b), (b, e), (d, b), (d.c), (e, AD):
OoOroeoTneoe
°
2°
ooooc oe
or
ore
- co Oo
coo
oo
M(R) =
-oCo
ooo
ooo
Qo
19, R3 and Ki
4
21. a) 2° b) 2"
1 1 0 0 0 1 1 1 0 0
1 1 0 0 0 1 1 | 0 0
23. a) Ai:|/ 0 0 1 1 0 A: | 1 1 1 0 0
0 0 1 1 +0 0 0 0 1 1
0 0 0 0 1 0 00 1 =1
b) Given an equivalence relation $\mathcal{R}$ on a finite set $A$, list the elements of $A$ so that elements in the same cell of the partition (see Section 7.4) are adjacent. The resulting relation matrix will then have square blocks of 1's along the diagonal (from upper left to lower right).
25. (s;) a:
(82)
aOoWwveraooege
(53)
(4)
(ss)
(56)
(s7)
(sg)
27. n = 38
Section 7.3—p. 364
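1. For all $a \in A$, $b \in B$, we have $a \mathcal{R}_1 a$ and $b \mathcal{R}_2 b$, so $(a, b) \mathcal{R} (a, b)$ and $\mathcal{R}$ is reflexive. $(a, b) \mathcal{R} (c, d)$, $(c, d) \mathcal{R} (a, b) \Rightarrow a \mathcal{R}_1 c$, $c \mathcal{R}_1 a$ and $b \mathcal{R}_2 d$, $d \mathcal{R}_2 b \Rightarrow a = c$, $b = d \Rightarrow (a, b) = (c, d)$, so $\mathcal{R}$ is antisymmetric. $(a, b) \mathcal{R} (c, d)$, $(c, d) \mathcal{R} (e, f) \Rightarrow a \mathcal{R}_1 c$, $c \mathcal{R}_1 e$ and $b \mathcal{R}_2 d$, $d \mathcal{R}_2 f \Rightarrow a \mathcal{R}_1 e$, $b \mathcal{R}_2 f \Rightarrow (a, b) \mathcal{R} (e, f)$, and this implies that $\mathcal{R}$ is transitive.
3. $\emptyset < \{1\} < \{2\} < \{3\} < \{1, 2\} < \{1, 3\} < \{2, 3\} < \{1, 2, 3\}$ (There are other possibilities.)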
. a) 4 b) 3<2<1<4 or 3<1<2<4 e¢) 2
1
/ 2
NZ
. Let $x$, $y$ both be least upper bounds. Then $x \mathcal{R} y$, since $y$ is an upper bound and $x$ is a least upper bound. Likewise, $y \mathcal{R} x$. $\mathcal{R}$ antisymmetric $\Rightarrow x = y$. (The proof for the glb is similar.)
11. Let $\mathcal{U} = \{1, 2\}$, $A = \mathcal{P}(\mathcal{U})$, and let $\mathcal{R}$ be the inclusion relation. Then $(A, \mathcal{R})$ is a poset but not a total order. Let $B = \{\emptyset, \{1\}\}$. Then $(B \times B) \cap \mathcal{R}$ is a total order.
13. n+ (5)
15. a) The $n$ elements of $A$ are arranged along a vertical line. For if $A = \{a_1, a_2, \ldots, a_n\}$ where $a_1 \mathcal{R} a_2 \mathcal{R} a_3 \mathcal{R} \cdots \mathcal{R} a_n$, then the diagram can be drawn as follows: [Diagram not reproduced.]
b) $n!$
17.      lub          glb
a) {1, 2}       ∅
b) {1, 2, 3}    ∅
c) {1, 2}       ∅
d) {1, 2, 3}    {1}
e) {1, 2, 3}    ∅
19. For each $a \in \mathbf{Z}$ it follows that $a \mathcal{R} a$ because $a - a = 0$, an even nonnegative integer. Hence $\mathcal{R}$ is reflexive. If $a, b, c \in \mathbf{Z}$ with $a \mathcal{R} b$ and $b \mathcal{R} c$, then
$a - b = 2m$, for some $m \in \mathbf{N}$
$b - c = 2n$, for some $n \in \mathbf{N}$,
and $a - c = (a - b) + (b - c) = 2(m + n)$, where $m + n \in \mathbf{N}$. Therefore, $a \mathcal{R} c$ and $\mathcal{R}$ is transitive. Finally, suppose that $a \mathcal{R} b$ and $b \mathcal{R} a$ for some $a, b \in \mathbf{Z}$. Then $a - b$ and $b - a$ are both nonnegative integers. Since this can only occur for $a - b = b - a = 0$, we find that $[a \mathcal{R} b \wedge b \mathcal{R} a] \Rightarrow a = b$, so $\mathcal{R}$ is antisymmetric.
Consequently, the relation $\mathcal{R}$ is a partial order for $\mathbf{Z}$. But it is not a total order. For example, $2, 3 \in \mathbf{Z}$ and we have neither $2 \mathcal{R} 3$ nor $3 \mathcal{R} 2$, because neither $-1$ nor $1$, respectively, is a nonnegative even integer.
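The three defining properties, and the failure of totality, can be tested on a finite window of integers. The sketch below uses the relation as described in the solution ($a$ is related to $b$ when $a - b$ is a nonnegative even integer); the window $-6, \ldots, 6$ is an arbitrary choice, and a finite check of this kind illustrates the proof rather than replacing it.

```python
from itertools import product


def related(a: int, b: int) -> bool:
    """a R b exactly when a - b is a nonnegative even integer."""
    d = a - b
    return d >= 0 and d % 2 == 0


sample = range(-6, 7)

reflexive = all(related(a, a) for a in sample)
antisymmetric = all(a == b
                    for a, b in product(sample, repeat=2)
                    if related(a, b) and related(b, a))
transitive = all(related(a, c)
                 for a, b, c in product(sample, repeat=3)
                 if related(a, b) and related(b, c))
# Pairs that are not comparable in either direction show R is not a total order.
incomparable = [(a, b) for a, b in product(sample, repeat=2)
                if not related(a, b) and not related(b, a)]
```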
21. b) & c) Here the least element (and only minimal element) is (0, 0). The element (2, 2) is the
greatest element (and the only maximal element).
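d) $(0, 0) \mathcal{R} (0, 1) \mathcal{R} (0, 2) \mathcal{R} (1, 0) \mathcal{R} (1, 1) \mathcal{R} (1, 2) \mathcal{R} (2, 0) \mathcal{R} (2, 1) \mathcal{R} (2, 2)$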
23. a) False. Let $\mathcal{U} = \{1, 2\}$, $A = \mathcal{P}(\mathcal{U})$, and let $\mathcal{R}$ be the inclusion relation. Then $(A, \mathcal{R})$ is a lattice where for all $S, T \in A$, $\mathrm{lub}\{S, T\} = S \cup T$ and $\mathrm{glb}\{S, T\} = S \cap T$. However, $\{1\}$ and $\{2\}$ are not related, so $(A, \mathcal{R})$ is not a total order.
(A, &) is a lattice with z the greatest (and only maximal) element and a the least (and only
minimal) element.
27. a) 3 b) $m$ c) 17 d) $m + n + 2mn$ e) 133
f) $m + n + k + 2(mn + mk + nk) + 3mnk$ g) 1484
h) $m + n + k + \ell + 2(mn + mk + m\ell + nk + n\ell + k\ell) + 3(mnk + mn\ell + mk\ell + nk\ell) + 4mnk\ell$
29. 429 = (4) (') sok = 6, and there are 2 - 7 = 14 positive integer divisors of p°q.
Section 7.4—p. 370
1. a) Here the collection A;, A>, A3 provides a partition of A.
b) Although $A = A_1 \cup A_2 \cup A_3 \cup A_4$, we have $A_1 \cap A_2 \neq \emptyset$, so the collection $A_1, A_2, A_3, A_4$
does not provide a partition for A.
3. $\mathcal{R} = \{(1, 1), (1, 2), (2, 1), (2, 2), (3, 3), (3, 4), (4, 3), (4, 4), (5, 5)\}$
5. $\mathcal{R}$ is not transitive since $1 \mathcal{R} 2$ and $2 \mathcal{R} 3$ but 1 is not related to 3.
7. a) For all $(x, y) \in A$, $x + y = x + y \Rightarrow (x, y) \mathcal{R} (x, y)$.
$(x_1, y_1) \mathcal{R} (x_2, y_2) \Rightarrow x_1 + y_1 = x_2 + y_2 \Rightarrow x_2 + y_2 = x_1 + y_1 \Rightarrow (x_2, y_2) \mathcal{R} (x_1, y_1)$.
$(x_1, y_1) \mathcal{R} (x_2, y_2)$, $(x_2, y_2) \mathcal{R} (x_3, y_3) \Rightarrow x_1 + y_1 = x_2 + y_2$, $x_2 + y_2 = x_3 + y_3$, so $x_1 + y_1 = x_3 + y_3$ and $(x_1, y_1) \mathcal{R} (x_3, y_3)$. Since $\mathcal{R}$ is reflexive, symmetric, and transitive, it is an equivalence relation.
b) $[(1, 3)] = \{(1, 3), (2, 2), (3, 1)\}$; $[(2, 4)] = \{(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)\}$; $[(1, 1)] = \{(1, 1)\}$
c) $A = \{(1, 1)\} \cup \{(1, 2), (2, 1)\} \cup \{(1, 3), (2, 2), (3, 1)\} \cup \{(1, 4), (2, 3), (3, 2), (4, 1)\} \cup \{(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)\} \cup \{(2, 5), (3, 4), (4, 3), (5, 2)\} \cup \{(3, 5), (4, 4), (5, 3)\} \cup \{(4, 5), (5, 4)\} \cup \{(5, 5)\}$
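The partition in part (c) can be generated mechanically by grouping pairs according to the value of $x + y$. The sketch assumes, as the listed cells indicate, that $A = \{1, 2, 3, 4, 5\} \times \{1, 2, 3, 4, 5\}$ and that two pairs are related when their coordinate sums agree; the variable names are ours.

```python
from collections import defaultdict
from itertools import product

# Assumed underlying set: all ordered pairs with coordinates in 1..5.
A = list(product(range(1, 6), repeat=2))

# Group the pairs by the invariant x + y; each group is one equivalence class.
classes: dict[int, list[tuple[int, int]]] = defaultdict(list)
for (x, y) in A:
    classes[x + y].append((x, y))

cell_sizes = [len(classes[s]) for s in sorted(classes)]
```

The cell sizes rise from 1 to 5 and fall back to 1, matching the nine cells listed in part (c).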
9. a) For all $(a, b) \in A$ we have $ab = ab$, so $(a, b) \mathcal{R} (a, b)$ and $\mathcal{R}$ is reflexive. To see that $\mathcal{R}$ is symmetric suppose that $(a, b), (c, d) \in A$ and that $(a, b) \mathcal{R} (c, d)$. Then $(a, b) \mathcal{R} (c, d) \Rightarrow ad = bc \Rightarrow cb = da \Rightarrow (c, d) \mathcal{R} (a, b)$, so $\mathcal{R}$ is symmetric. Finally, let $(a, b), (c, d), (e, f) \in A$ with $(a, b) \mathcal{R} (c, d)$ and $(c, d) \mathcal{R} (e, f)$. Then $(a, b) \mathcal{R} (c, d) \Rightarrow ad = bc$ and $(c, d) \mathcal{R} (e, f) \Rightarrow cf = de$, so $adf = bcf = bde$ and since $d \neq 0$, we have $af = be$. But $af = be \Rightarrow (a, b) \mathcal{R} (e, f)$, and consequently $\mathcal{R}$ is transitive. It follows from the above that $\mathcal{R}$ is an equivalence relation on $A$.
b) $[(2, 14)] = \{(2, 14)\}$; $[(-3, -9)] = \{(-3, -9), (-1, -3), (4, 12)\}$; $[(4, 8)] = \{(-2, -4), (1, 2), (3, 6), (4, 8)\}$
c) There are five cells in the partition; in fact, $A = [(-4, -20)] \cup [(-3, -9)] \cup [(-2, -4)] \cup [(-1, -11)] \cup [(2, 14)]$.
11. 9) 48 026 @ ()+49+29+O+O
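13. 300
15. Let $\{A_i\}_{i \in I}$ be a partition of a set $A$. Define $\mathcal{R}$ on $A$ by $x \mathcal{R} y$ if for some $i \in I$, we have $x, y \in A_i$. For each $x \in A$, $x, x \in A_i$ for some $i \in I$, so $x \mathcal{R} x$ and $\mathcal{R}$ is reflexive. $x \mathcal{R} y \Rightarrow x, y \in A_i$, for some $i \in I \Rightarrow y, x \in A_i$, for some $i \in I \Rightarrow y \mathcal{R} x$, so $\mathcal{R}$ is symmetric. If $x \mathcal{R} y$ and $y \mathcal{R} z$, then $x, y \in A_i$ and $y, z \in A_j$ for some $i, j \in I$. Since $A_i \cap A_j$ contains $y$ and $\{A_i\}_{i \in I}$ is a partition, from $A_i \cap A_j \neq \emptyset$ it follows that $A_i = A_j$, so $i = j$. Hence $x, z \in A_i$, so $x \mathcal{R} z$ and $\mathcal{R}$ is transitive.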
17. Proof: Since $\{B_1, B_2, B_3, \ldots, B_n\}$ is a partition of $B$, we have $B = B_1 \cup B_2 \cup B_3 \cup \cdots \cup B_n$. Therefore $A = f^{-1}(B) = f^{-1}(B_1 \cup \cdots \cup B_n) = f^{-1}(B_1) \cup \cdots \cup f^{-1}(B_n)$ [by generalizing part (b) of Theorem 5.10]. For $1 \leq i < j \leq n$, $f^{-1}(B_i) \cap f^{-1}(B_j) = f^{-1}(B_i \cap B_j) = f^{-1}(\emptyset) = \emptyset$. Consequently, $\{f^{-1}(B_i) \mid 1 \leq i \leq n,\ f^{-1}(B_i) \neq \emptyset\}$ is a partition of $A$.
Note: Part (b) of Example 7.56 is a special case of this result.
Section 7.5—p. 376
. a) sy and ss are equivalent. b) s2 and ss are equivalent.
€) sz and s7 are equivalent; s3 and s4 are equivalent.
- a) s; and s; are equivalent; s4 and ss; are equivalent.
b) @ 0000 p @
(ii) 0 M::'O 1:0 4
(ili) OO
S] S34 S] I 0
52 S| AY] 1 0
53 |S Ss; |} 1 O
S54 53 S4 0 0
S56 52 S] I 0
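Supplementary Exercises—p. 378
1. a) False. Let $A = \{1, 2\}$, $I = \{1, 2\}$, $\mathcal{R}_1 = \{(1, 1)\}$, and $\mathcal{R}_2 = \{(2, 2)\}$. Then $\bigcup_{i \in I} \mathcal{R}_i$ is reflexive, but neither $\mathcal{R}_1$ nor $\mathcal{R}_2$ is reflexive. Conversely, however, if $\mathcal{R}_i$ is reflexive for all (actually, at least one) $i \in I$, then $\bigcup_{i \in I} \mathcal{R}_i$ is reflexive.
3. $(a, c) \in \mathcal{R}_2 \circ \mathcal{R}_1 \Rightarrow$ for some $b \in A$, $(a, b) \in \mathcal{R}_2$, $(b, c) \in \mathcal{R}_1$. With $\mathcal{R}_1$, $\mathcal{R}_2$ symmetric, $(b, a) \in \mathcal{R}_2$, $(c, b) \in \mathcal{R}_1$, so $(c, a) \in \mathcal{R}_1 \circ \mathcal{R}_2 \subseteq \mathcal{R}_2 \circ \mathcal{R}_1$. $(c, a) \in \mathcal{R}_2 \circ \mathcal{R}_1 \Rightarrow (c, d) \in \mathcal{R}_2$, $(d, a) \in \mathcal{R}_1$, for some $d \in A$. Then $(d, c) \in \mathcal{R}_2$, $(a, d) \in \mathcal{R}_1$ by symmetry, and $(a, c) \in \mathcal{R}_1 \circ \mathcal{R}_2$, so $\mathcal{R}_2 \circ \mathcal{R}_1 \subseteq \mathcal{R}_1 \circ \mathcal{R}_2$ and the result follows.
5. $(c, a) \in (\mathcal{R}_1 \circ \mathcal{R}_2)^c \Leftrightarrow (a, c) \in \mathcal{R}_1 \circ \mathcal{R}_2 \Leftrightarrow (a, b) \in \mathcal{R}_1$, $(b, c) \in \mathcal{R}_2$, for some $b \in B \Leftrightarrow (b, a) \in \mathcal{R}_1^c$, $(c, b) \in \mathcal{R}_2^c$, for some $b \in B \Leftrightarrow (c, a) \in \mathcal{R}_2^c \circ \mathcal{R}_1^c$.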
7. Let $\mathcal{U} = \{1, 2, 3, 4, 5\}$, $A = \mathcal{P}(\mathcal{U}) - \{\mathcal{U}, \emptyset\}$. Under the inclusion relation, $A$ is a poset with the five minimal elements $\{x\}$, $1 \leq x \leq 5$, but no least element. Also, $A$ has five maximal elements (the five subsets of $\mathcal{U}$ of size 4) but no greatest element.
9. n=10
11. a) Adjacency | Index b) Adjacency | Index °) Adjacency | Index
List List List List List List
1 2 1 |] | 2 1 1 | 2 1 1
2 3 2 2 2 3 2 2 2 3 2 2
3 1 3 3 3 | 3 3 3 1 3 3
4 4 4 $5 4 5 4 4 4 4 4 6
) 5 6 5 4 5 5 a) 5 7
6 3 6 8 6 6 6 I 6 8
7 5 7 #4
13. b) The cells of the partition are the connected components of G.
15. One possible order is 10, 3, 8, 6, 7,9, 1,4, 5, 2, where program 10 is run first and program 2 last.
17. b) $[(0.3, 0.7)] = \{(0.3, 0.7)\}$; $[(0.5, 0)] = \{(0.5, 0)\}$; $[(0.4, 1)] = \{(0.4, 1)\}$; $[(0, 0.6)] = \{(0, 0.6), (1, 0.6)\}$; $[(1, 0.2)] = \{(0, 0.2), (1, 0.2)\}$
In general, if $0 < a < 1$, then $[(a, b)] = \{(a, b)\}$; otherwise, $[(0, b)] = \{(0, b), (1, b)\} = [(1, b)]$.
c) The lateral surface of a cylinder of height 1 and base radius $1/(2\pi)$.
19. $4^n - 2(3^n) + 2^n$
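21. a) (i) $B \mathcal{R} A \mathcal{R} C$ (ii) $B \mathcal{R} C \mathcal{R} F$
$B \mathcal{R} A \mathcal{R} C \mathcal{R} F$ is a maximal chain. There are six such maximal chains.
b) Here $11 \mathcal{R} 385$ is a maximal chain of length 2, while $2 \mathcal{R} 6 \mathcal{R} 12$ is one of length 3. The length of a longest chain for this poset is 3.
c) (i) $\emptyset \subseteq \{1\} \subseteq \{1, 2\} \subseteq \{1, 2, 3\} \subseteq \mathcal{U}$; (ii) $\emptyset \subseteq \{2\} \subseteq \{2, 3\} \subseteq \{1, 2, 3\} \subseteq \mathcal{U}$
There are $4! = 24$ such maximal chains.
d) $n!$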
23. Let $a_1 \mathcal{R} a_2 \mathcal{R} \cdots \mathcal{R} a_{n-1} \mathcal{R} a_n$ be a longest (maximal) chain in $(A, \mathcal{R})$. Then $a_n$ is a maximal element in $(A, \mathcal{R})$ and $a_1 \mathcal{R} a_2 \mathcal{R} \cdots \mathcal{R} a_{n-1}$ is a maximal chain in $(B, \mathcal{R}')$. Hence the length of a longest chain in $(B, \mathcal{R}')$ is at least $n - 1$. If there is a chain $b_1 \mathcal{R}' b_2 \mathcal{R}' \cdots \mathcal{R}' b_n$ in $(B, \mathcal{R}')$ of length $n$, then this is also a chain of length $n$ in $(A, \mathcal{R})$. But then $b_n$ must be a maximal element of $(A, \mathcal{R})$, and this contradicts $b_n \in B$.
25. If $n = 1$, then for all $x, y \in A$, if $x \neq y$ then $x$ is not related to $y$ and $y$ is not related to $x$. Hence $(A, \mathcal{R})$ is an antichain, and the result follows. Now assume the result true for $n = k \geq 1$, and let $(A, \mathcal{R})$ be a poset where the length of a longest chain is $k + 1$. If $M$ is the set of all maximal elements in $(A, \mathcal{R})$, then $M \neq \emptyset$ and $M$ is an antichain in $(A, \mathcal{R})$. Also, by virtue of Exercise 23, $(A - M, \mathcal{R}')$, for $\mathcal{R}' = ((A - M) \times (A - M)) \cap \mathcal{R}$, is a poset with $k$ the length of a longest chain. So by the induction hypothesis, $A - M = C_1 \cup C_2 \cup \cdots \cup C_k$, a partition into $k$ antichains. Consequently, $A = C_1 \cup C_2 \cup \cdots \cup C_k \cup M$, a partition into $k + 1$ antichains.
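The induction in Exercise 25 is constructive: strip off the set of maximal elements, then repeat on what is left. The sketch below applies that idea to a hypothetical example, the positive divisors of 60 ordered by divisibility, and compares the number of antichains produced with the number of elements in a longest chain. All function names are ours.

```python
def maximal_elements(elems: set, leq) -> set:
    """Elements x with no different y in elems such that x <= y."""
    return {x for x in elems if not any(x != y and leq(x, y) for y in elems)}


def antichain_partition(elems: set, leq) -> list[set]:
    """Repeatedly remove the maximal elements, as in the induction step."""
    remaining, layers = set(elems), []
    while remaining:
        top = maximal_elements(remaining, leq)
        layers.append(top)
        remaining -= top
    return layers


def longest_chain_length(elems: set, leq) -> int:
    """Number of elements in a longest chain of the finite poset."""
    best: dict = {}

    def height(x):
        if x not in best:
            below = [y for y in elems if y != x and leq(y, x)]
            best[x] = 1 + max((height(y) for y in below), default=0)
        return best[x]

    return max(height(x) for x in elems)


# Hypothetical example: divisors of 60 under "x divides y".
divisors = {d for d in range(1, 61) if 60 % d == 0}
divides = lambda x, y: y % x == 0

layers = antichain_partition(divisors, divides)
chain_len = longest_chain_length(divisors, divides)
```

For this example a longest chain such as 1, 2, 4, 12, 60 has five elements, and the peeling procedure likewise yields five antichains.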
27. a) $n$ b) $2^{n-1}$ c) 64
Chapter 8
The Principle of Inclusion and Exclusion
Section 8.1—p. 396
1. Let $x \in S$ and let $n$ be the number of conditions (from among $c_1, c_2, c_3, c_4$) satisfied by $x$.
($n = 0$): Here $x$ is counted once in $N(\bar{c}_2 \bar{c}_3 \bar{c}_4)$ and once in $N(\bar{c}_1 \bar{c}_2 \bar{c}_3 \bar{c}_4)$.
($n = 1$): If $x$ satisfies $c_1$ (and not $c_2, c_3, c_4$), then $x$ is counted once in $N(\bar{c}_2 \bar{c}_3 \bar{c}_4)$ and once in $N(c_1 \bar{c}_2 \bar{c}_3 \bar{c}_4)$.
If $x$ satisfies $c_i$, for $i \neq 1$, then $x$ is not counted in any of the three terms in the equation.
($n = 2, 3, 4$): If $x$ satisfies at least two of the four conditions, then $x$ is not counted in any of the three terms in the equation.
The preceding observations show that the two sides of the given equation count the same elements from $S$, and this provides a combinatorial proof for the formula $N(\bar{c}_2 \bar{c}_3 \bar{c}_4) = N(c_1 \bar{c}_2 \bar{c}_3 \bar{c}_4) + N(\bar{c}_1 \bar{c}_2 \bar{c}_3 \bar{c}_4)$.
.a) 12 b)3 5. a) 534) 458) 16
Ge
. 4,460,400 9, (27 )- (G1) + GG) — OG)
11. a5) (3) - (2) + AO] 13. 26! — [3(23!) + 24!] + (20! + 21)
15. e- (7)5® + (348 = (3)3° + (5)2* — QJ /6°
17.9 '/ (BD? ] -— 3 [7Y/1BY7]] + 361/39 — 3! 19. 651/7776 = 0.08372
21. a) 32, ~b) 96 e) 3200 23. a) 27! ~~ b) 2" '(p- 1)
25. a) 1600 _—b) 4399 27. (17) = (32) = (48) = 16
29. If 4 divides $\phi(n)$, then one of the following must hold:
(1) $n$ is divisible by 8;
(2) n is divisible by two (or more) distinct odd primes;
(3) n is divisible by an odd prime p (such as 5, 13, and 17) where 4 divides p — 1; or
(4) n is divisible by 4 (and not 8) and at least one odd prime.
Section 8.2-p. 401
. Ey = 768; EF, = 205; Ey = 40, Ex = 10; Ey = 0; Es = 1. a E, = 1024=N
3. a) [14!/(2)°] — (7) [13!/(24]+ G) [121/23]
— @) [tnty@2p?]+ () 101/21) — (2)[9!]
b) £2 = (9 [12729]- QE) [Nyen'] + HOU0/29
- QOL
& Ls= () [1y@p']
~ Ly= 6132; Lz = 6136
- (3 QAU/20+ QOL
a D3 -yaen’)1/@) bb (ee.cpr ag? a] G3)
c) (G9) - 3(3) (13) 1/3 )
Section 8.3—p. 403
. 10!— (7)9! + G8! — G)7!+ Ger-()s! 3. 44
Sou
~a) T—d, (d= Me!) Db) dog = (26!)e7!
n= 11 9. (0Ndiy = (10!)?(e7!)
. a) (dio)?
= (10)2e? db) YO 84(— (9) [10 — LF
13. For all $n \in \mathbf{Z}^+$, $n!$ counts the total number of permutations of $1, 2, 3, \ldots, n$. Each such permutation will have $k$ elements that are deranged (that is, there are $k$ elements $x_1, x_2, \ldots, x_k$ in $\{1, 2, 3, \ldots, n\}$ where $x_1$ is not in position $x_1$, $x_2$ is not in position $x_2$, $\ldots$, and $x_k$ is not in position $x_k$) and $n - k$ elements that are fixed (that is, the $n - k$ elements $y_1, y_2, \ldots, y_{n-k}$ in $\{1, 2, 3, \ldots, n\} - \{x_1, x_2, \ldots, x_k\}$ are such that $y_1$ is in position $y_1$, $y_2$ is in position $y_2$, $\ldots$, and $y_{n-k}$ is in position $y_{n-k}$).
The $n - k$ fixed elements can be chosen in $\binom{n}{n-k}$ ways, and the remaining $k$ elements can then be permuted (that is, deranged) in $d_k$ ways. Hence there are $\binom{n}{n-k} d_k = \binom{n}{k} d_k$ permutations of $1, 2, 3, \ldots, n$ with $n - k$ fixed elements (and $k$ deranged elements). As $k$ varies from 0 to $n$ we count all of the $n!$ permutations of $1, 2, 3, \ldots, n$ according to the number $k$ of deranged elements.
Consequently,
$$n! = \binom{n}{0} d_0 + \binom{n}{1} d_1 + \binom{n}{2} d_2 + \cdots + \binom{n}{n} d_n = \sum_{k=0}^{n} \binom{n}{k} d_k.$$
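The identity $n! = \sum_{k=0}^{n} \binom{n}{k} d_k$ can be verified numerically for small $n$. The sketch below computes $d_k$ with the standard recurrence $d_k = (k - 1)(d_{k-1} + d_{k-2})$, $d_0 = 1$, $d_1 = 0$, and also counts derangements by brute force for one small case; both are independent checks rather than part of the combinatorial argument.

```python
from itertools import permutations
from math import comb, factorial


def derangements(k: int) -> int:
    """d_k via the recurrence d_k = (k - 1)(d_{k-1} + d_{k-2})."""
    d = [1, 0]                      # d_0 = 1, d_1 = 0
    for i in range(2, k + 1):
        d.append((i - 1) * (d[i - 1] + d[i - 2]))
    return d[k]


def brute_force_derangements(k: int) -> int:
    """Count permutations of 0..k-1 with no fixed point."""
    return sum(1 for p in permutations(range(k))
               if all(p[i] != i for i in range(k)))


identity_holds = all(
    factorial(n) == sum(comb(n, k) * derangements(k) for k in range(n + 1))
    for n in range(0, 13)
)
```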
15. (Gm — Dt — (@ —2!+ G)@—-3!—--- +)" 16,2 )OD + (12)
Sections 8.4
and 8.5—p. 410 a) (8) + G)8x + (8B Tx? + B)(8-7- Ox? + AB-7-6-S)xt te + G)BY28 =
5g (Q) PO, tx!
b) Sieg (YP(a. dx
~a) + 42°
(i) (1+2x)3) Gi) 14+ 8x + 14x?
(iii) 1+ 9x +25x24+21x* (iv) 14 8x 4+ 16x? 4+ 7x3
b) If the board $C$ consists of $n$ steps, and each step has $k$ blocks, then $r(C, x) = (1 + kx)^n$.
. $5! - 8(4!) + 21(3!) - 20(2!) + 6(1!) = 20$
9. a) 20 b) 3/10
. $(6!/2!) - 9(5!/2!) + 27(4!/2!) - 31(3!/2!) + 12 = 63$
Supplementary
Exercises—p. 413 1343 [ays] (2) -OG)+ OQ] — & Lon Qe -o!
1 o(—D* (2) (62) 0 — &)! = 1,764,651,461
9, Let T = (13!)/(2))°.
a) ([()09/29"] — [()G)@9/25] + [6)(3) 8d) /7
b) [T — (Es + Es)] /T, where Ey = [(7)(9/(2)] — [EQ] and Zs = QB
11. a) ("~") 13. 84
15. a) $S_1 = \{1, 5, 7, 11, 13, 17\}$    $S_2 = \{2, 4, 8, 10, 14, 16\}$
$S_3 = \{3, 15\}$    $S_6 = \{6, 12\}$
$S_9 = \{9\}$    $S_{18} = \{18\}$
b) $|S_1| = 6 = \phi(18)$    $|S_3| = 2 = \phi(6)$    $|S_9| = 1 = \phi(2)$
$|S_2| = 6 = \phi(9)$    $|S_6| = 2 = \phi(3)$    $|S_{18}| = 1 = \phi(1)$
17. a) If $n$ is even, then by the Fundamental Theorem of Arithmetic (Theorem 4.11) we may write
$n = 2^k m$, where $k \ge 1$ and $m$ is odd. Then $2n = 2^{k+1}m$ and $\phi(2n) = (2^{k+1})(1 - \frac{1}{2})\phi(m) =
2^k\phi(m) = 2(2^k)(\frac{1}{2})\phi(m) = 2[2^k(1 - \frac{1}{2})\phi(m)] = 2[\phi(2^k m)] = 2\phi(n)$.
b) When $n$ is odd, we find that $\phi(2n) = (2n)(1 - \frac{1}{2})\prod_{p \mid n}(1 - \frac{1}{p})$, where the product is
taken over all (odd) primes dividing $n$. (If $n = 1$, then $\prod_{p \mid n}(1 - \frac{1}{p})$ is 1.) But
$(2n)(1 - \frac{1}{2})\prod_{p \mid n}(1 - \frac{1}{p}) = n\prod_{p \mid n}(1 - \frac{1}{p}) = \phi(n)$.
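Both parts of Exercise 17 can be spot-checked by brute force. This is a minimal sketch, assuming the usual definition of Euler's phi function as a count of integers coprime to n; the helper names are made up for the illustration.

```python
from math import gcd


def phi(n: int) -> int:
    """Euler's phi function: how many k in 1..n satisfy gcd(k, n) = 1."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def phi_of_double(n: int) -> int:
    """Value of phi(2n) predicted by parts (a) and (b)."""
    return 2 * phi(n) if n % 2 == 0 else phi(n)


# Compare the prediction with a direct computation.
for n in range(1, 200):
    assert phi(2 * n) == phi_of_double(n)
```

The direct count is slow but has no dependence on the product formula, so it is an independent check of the algebra.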
19, a) dy(12!)* bb) ({)a3(12)* ee) da(din)*
Chapter 9
Generating Functions
Section 9.1—p. 417
. a) The coefficient of x” in (1 +x +x?7+---+4+x’)4
b) The coefficient
of x7° in (1 tx+tx74- ++ x70)? (1 txPtxt pee tx)? or
(ltxtxrt---)? (lta? tat+---)?
c) The coefficient of x* in (x? + x3 +. x4) (xe txt t--.+x3)4
d) The coefficient of x°° in (1 +.x 42° +---+2°°)? (1 txr txt pe. 4x).
(xt x8 tad tee tx”) or(lL¢ x tx t+---)PP (4x? 4+274+--)) (x +x8 +x 4+)
. a) The coefficient of x!° in(1+x+x?+x°+---)®
b) The coefficient of x” in (1 +x +x? +29 4---)"
. The answer is the coefficient of x7! in the generating function
(lta tax? toh +e. Jl tx4x74---4x"),
Section 9.2—p. 431
.a) (+x) b) 8114+)’ ec) (14 x)7!
d) 63/1 +x) e) (L—x*) f) x?/(1—ax)
3. a) $g(x) = f(x) - a_3x^3 + 3x^3 = f(x) + (3 - a_3)x^3$
b) $g(x) = f(x) + (3 - a_3)x^3 + (7 - a_7)x^7$
c) $g(x) = 2f(x) + (1 - 2a_1)x + (3 - 2a_3)x^3$
d) $g(x) = 2f(x) + [5/(1 - x)] + (1 - 2a_1 - 5)x + (3 - 2a_3 - 5)x^3 + (7 - 2a_7 - 5)x^7$
-a (G) bd G*) 7 (19) — 5G) +)
-a) 0 bY (73) - 55) © (18) +414) + 6(73) +403) + (i)
11. (5) — 4 (si) + 6(3) 13. [(is) — (17) G2) + EC) — (2) / (6)
15. (1/8) [1 + (—1"] + 1/4) ("F') + 1/2)" *?)
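17. $(1 - x - x^2 - x^3 - x^4 - x^5 - x^6)^{-1} = [1 - (x + x^2 + \cdots + x^6)]^{-1}$
$= 1 + \underbrace{(x + x^2 + \cdots + x^6)}_{\text{one roll}} + \underbrace{(x + x^2 + \cdots + x^6)^2}_{\text{two rolls}} + \underbrace{(x + x^2 + \cdots + x^6)^3}_{\text{three rolls}} + \cdots,$
where the 1 takes care of the case where the die is not rolled.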
a) 24/27 = 1/8 b) Qla/2l 7Qn-l = 2!-[n/2)
19,
21. qlin—22)/2] 23. Qin/2)-1e Z(n/2)-1
25. a) Pr(Y¥ = y) = (5/6) '!(1/6), y = 1, 2,3,....
b) E(¥Y)=6— e¢) oy = V30 = 5.477226
27. 3/5
29. a) The differences are 2, 3, 2, 7, and 0, and these sum to 14.
b) $\{3, 5, 8, 15\}$   c) $\{1 + a, 1 + a + b, 1 + a + b + c, 1 + a + b + c + d\}$.
31. $c_k = \sum_{i=0}^{k} i(k - i)^2 = k^2\sum_{i=0}^{k} i - 2k\sum_{i=0}^{k} i^2 + \sum_{i=0}^{k} i^3$
$= (k^2)[k(k + 1)/2] - 2k[k(k + 1)(2k + 1)/6] + [k^2(k + 1)^2/4]$
$= (1/12)(k^2)(k^2 - 1)$
33. a) $(1 + x + x^2 + x^3 + x^4)(0 + x + 2x^2 + 3x^3 + \cdots) = \sum_{i=0}^{\infty} c_i x^i$ where $c_0 = 0$, $c_1 = 1$,
$c_2 = 1 + 2 = 3$, $c_3 = 1 + 2 + 3 = 6$, $c_4 = 1 + 2 + 3 + 4 = 10$, and
$c_n = n + (n - 1) + (n - 2) + (n - 3) + (n - 4) = 5n - 10$ for all $n \ge 5$.
b) $(1 - x + x^2 - x^3 + \cdots)(1 - x + x^2 - x^3 + \cdots) = 1/(1 + x)^2 = (1 + x)^{-2}$, the
generating function for the sequence $\binom{-2}{0}, \binom{-2}{1}, \binom{-2}{2}, \binom{-2}{3}, \ldots$. Hence the convolution of the
given pair of sequences is $c_0, c_1, c_2, \ldots$, where $c_n = \binom{-2}{n} = (-1)^n\binom{2+n-1}{n} = (-1)^n\binom{n+1}{n} =
(-1)^n(n + 1)$, $n \in \mathbf{N}$. [This is the alternating sequence $1, -2, 3, -4, 5, -6, 7, \ldots$.]
Section 9.3—p. 435
1. 7; 6 + 1; 5 + 2; 5 + 1 + 1; 4 + 3; 4 + 2 + 1; 4 + 1 + 1 + 1; 3 + 3 + 1; 3 + 2 + 2;
3 + 2 + 1 + 1; 3 + 1 + 1 + 1 + 1; 2 + 2 + 2 + 1; 2 + 2 + 1 + 1 + 1; 2 + 1 + 1 + 1 + 1 + 1;
1 + 1 + 1 + 1 + 1 + 1 + 1
3. The number of partitions of 6 into 1's, 2's, and 3's is 7.
5. a) and b)
$$(1 + x^2 + x^4 + x^6 + \cdots)(1 + x^4 + x^8 + \cdots)(1 + x^6 + x^{12} + \cdots)\cdots = \prod_{i=1}^{\infty}\frac{1}{1 - x^{2i}}$$
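7. Let $f(x)$ be the generating function for the number of partitions of $n \in \mathbf{Z}^+$ where no summand
appears more than twice. Then
$$f(x) = \prod_{i=1}^{\infty}(1 + x^i + x^{2i}).$$
Let $g(x)$ be the generating function for the number of partitions of $n$ where no summand is
divisible by 3. Here
$$g(x) = \frac{1}{1 - x}\cdot\frac{1}{1 - x^2}\cdot\frac{1}{1 - x^4}\cdot\frac{1}{1 - x^5}\cdot\frac{1}{1 - x^7}\cdots.$$
But
$$f(x) = (1 + x + x^2)(1 + x^2 + x^4)(1 + x^3 + x^6)(1 + x^4 + x^8)\cdots$$
$$= \frac{1 - x^3}{1 - x}\cdot\frac{1 - x^6}{1 - x^2}\cdot\frac{1 - x^9}{1 - x^3}\cdot\frac{1 - x^{12}}{1 - x^4}\cdots$$
$$= \frac{1}{1 - x}\cdot\frac{1}{1 - x^2}\cdot\frac{1}{1 - x^4}\cdot\frac{1}{1 - x^5}\cdots = g(x).$$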
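The equality f(x) = g(x) in Exercise 7 can be checked coefficient by coefficient by multiplying out truncated versions of the two products. The sketch below is only a numerical illustration; the function names and the truncation degree are choices made for this example.

```python
def poly_mul(p, q, limit):
    """Multiply two coefficient lists, discarding terms above x^limit."""
    out = [0] * (limit + 1)
    for i, a in enumerate(p):
        if a == 0:
            continue
        for j, b in enumerate(q):
            if i + j > limit:
                break
            out[i + j] += a * b
    return out


def at_most_twice(limit):
    """Coefficients of prod_i (1 + x^i + x^(2i)) up to x^limit."""
    f = [1] + [0] * limit
    for i in range(1, limit + 1):
        factor = [0] * (limit + 1)
        for e in (0, i, 2 * i):
            if e <= limit:
                factor[e] = 1
        f = poly_mul(f, factor, limit)
    return f


def no_part_divisible_by_3(limit):
    """Coefficients of prod over i not divisible by 3 of 1/(1 - x^i), up to x^limit."""
    g = [1] + [0] * limit
    for i in range(1, limit + 1):
        if i % 3 == 0:
            continue
        # Truncated geometric series 1 + x^i + x^(2i) + ...
        factor = [1 if e % i == 0 else 0 for e in range(limit + 1)]
        g = poly_mul(g, factor, limit)
    return g


assert at_most_twice(30) == no_part_divisible_by_3(30)
```

Factors with i larger than the truncation degree cannot affect the retained coefficients, so stopping the products at `limit` is safe.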
9. This result follows from the one-to-one correspondence between the Ferrers graphs with
summands (rows) not exceeding m and the transpose graphs (also Ferrers graphs) that have m
summands (rows).
Section 9.4—p. 439
1. a) e* b) e** ce) e * d) ers e) ae’* f) xe**
3. a) g(x) = f(x) + [3 — a3) /311x°
b) g(x) = f(x) + [(-1 — ay) /3!) 03 = e* — [126x3/3))]
c) g(x) = 2 f(x) + [2 — 2a] x + [(4 — 2a) /2!] x?
: , 3 _ nx 1) ie Xe
25 “3 te 10\4
7. The answer is the coefficient of aI in aI + a peered To!
9. a) (1/2) [3° +1] / (3%) — by) (1/4) [37°43] / (3) ee) (1/2) [3° - 1] /(3°)
d) (1/2) [3-1]
/ (3°) — e) (1/2) [3 +1]
/ (3°)
Section 9.5—p. 442
I. a) (1+x4+x7)/(1—x) b) (I t+x4x°4x yan) ce) (1+2x)/(1
— x)?
5. do, d| — dp, 42 —@),03—@,... 7. f(x)= [e*/d —x)]
Supplementary
Exercises—p. 445 1. a) 6/1 —x)4+1/U-—x) b) 1/—-ax) o) 1/[1-C +a)x]
d) 1/(1—x) + 1/0 — ax)
3.5. [(3) —O@)+OF
Let $f(x)$ be the generating function for the number of partitions of $n \in \mathbf{Z}^+$ in which no even
summand is repeated (an odd summand may or may not be repeated). Then
$$f(x) = (1 + x + x^2 + \cdots)(1 + x^2)(1 + x^3 + x^6 + \cdots)(1 + x^4)\cdots = \frac{1}{1 - x}\cdot(1 + x^2)\cdot\frac{1}{1 - x^3}\cdot(1 + x^4)\cdots.$$
Let $g(x)$ be the generating function for the number of partitions of $n \in \mathbf{Z}^+$ where no summand
occurs more than three times. Then
$$g(x) = (1 + x + x^2 + x^3)(1 + x^2 + x^4 + x^6)(1 + x^3 + x^6 + x^9)\cdots$$
$$= [(1 + x)(1 + x^2)][(1 + x^2)(1 + x^4)][(1 + x^3)(1 + x^6)]\cdots$$
$$= [(1 - x^2)/(1 - x)](1 + x^2)[(1 - x^4)/(1 - x^2)](1 + x^4)[(1 - x^6)/(1 - x^3)](1 + x^6)\cdots$$
$$= (1/(1 - x))(1 + x^2)(1/(1 - x^3))(1 + x^4)(1/(1 - x^5))(1 + x^6)\cdots = f(x).$$
7. a) 1,5, 5)(7), (M9). HMOAA,... bi a=4,b=—-3
9. n (2-1) li. a) (2) ~~ —b) (°)° /(2)
13. a) [a+ (d—a)x]/1—x)? b) na+(1/2)(n)(n — 1d
1S. a) x" f(x) b) [ f(x) — (ao Fax tax? +++ + ay—px"!)] /x" 17. (1—p)"™
Chapter 10
Recurrence Relations
Section 10.1—p. 455
1. a) $a_n = 5a_{n-1}$, $n \ge 1$, $a_0 = 2$   b) $a_n = -3a_{n-1}$, $n \ge 1$, $a_0 = 6$
c) $a_n = (2/5)a_{n-1}$, $n \ge 1$, $a_0 = 7$
3. d = +(3/7) 5. 141 months 7. a) 145 b) 45
9. a) 21345   b) 52143, 52134   c) 21534, 21354, 21345
Section 10.2—p. 468
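1. a) $a_n = (3/7)(-1)^n + (4/7)(6)^n$, $n \ge 0$   b) $a_n = 4(1/2)^n - 2(5)^n$, $n \ge 0$
c) $a_n = 3\sin(n\pi/2)$, $n \ge 0$   d) $a_n = (5 - n)3^n$, $n \ge 0$
e) $a_n = (\sqrt{2})^n[\cos(3\pi n/4) + 4\sin(3\pi n/4)]$, $n \ge 0$
3. $a_n = (1/10)[7^n - (-3)^n]$, $n \ge 0$
5. a) $a_n = 2a_{n-1} + a_{n-2}$, $n \ge 2$, $a_0 = 1$, $a_1 = 2$
$a_n = (1/(2\sqrt{2}))[(1 + \sqrt{2})^{n+1} - (1 - \sqrt{2})^{n+1}]$, $n \ge 0$
b) $a_n = a_{n-1} + 3a_{n-2}$, $n \ge 2$, $a_0 = 1$, $a_1 = 1$
$a_n = (1/\sqrt{13})[((1 + \sqrt{13})/2)^{n+1} - ((1 - \sqrt{13})/2)^{n+1}]$, $n \ge 0$
c) $a_n = 2a_{n-1} + 3a_{n-2}$, $n \ge 2$, $a_0 = 1$, $a_1 = 2$
$a_n = (3/4)(3^n) + (1/4)(-1)^n$, $n \ge 0$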
7. a)
$F_1 = F_2 - F_0$
$F_3 = F_4 - F_2$
$F_5 = F_6 - F_4$
$\vdots$
$F_{2n-1} = F_{2n} - F_{2n-2}$
Conjecture: For all $n \in \mathbf{Z}^+$, $F_1 + F_3 + F_5 + \cdots + F_{2n-1} = F_{2n} - F_0 = F_{2n}$.
Proof (By Mathematical Induction): For $n = 1$ we have $F_1 = F_2$, and this is true since $F_1 = 1 = F_2$.
Consequently, the result is true in this first case (and this establishes the basis step for the proof).
Next we assume the result true for $n = k$ ($\ge 1$), that is, we assume
$$F_1 + F_3 + F_5 + \cdots + F_{2k-1} = F_{2k}.$$
When $n = k + 1$, we then find that
$$F_1 + F_3 + F_5 + \cdots + F_{2k-1} + F_{2(k+1)-1}$$
$$= (F_1 + F_3 + F_5 + \cdots + F_{2k-1}) + F_{2k+1} = F_{2k} + F_{2k+1} = F_{2k+2} = F_{2(k+1)}.$$
Therefore, the truth for $n = k$ implies the truth at $n = k + 1$, so by the Principle of Mathematical
Induction it follows that for all $n \in \mathbf{Z}^+$
$$F_1 + F_3 + F_5 + \cdots + F_{2n-1} = F_{2n}.$$
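The identity proved in Exercise 7 is easy to confirm for small n. A minimal sketch, assuming the indexing F_0 = 0, F_1 = 1 used throughout this chapter; the function names are hypothetical.

```python
def fib(n: int) -> int:
    """n-th Fibonacci number with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def odd_index_sum(n: int) -> int:
    """F_1 + F_3 + ... + F_(2n-1)."""
    return sum(fib(2 * i - 1) for i in range(1, n + 1))


# The sum of the first n odd-indexed Fibonacci numbers equals F_(2n).
for n in range(1, 25):
    assert odd_index_sum(n) == fib(2 * n)
```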
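9. $a_n = (1/\sqrt{5})[((1 + \sqrt{5})/2)^{n+1} - ((1 - \sqrt{5})/2)^{n+1}]$, $n \ge 0$
11. a) $a_n = a_{n-1} + a_{n-2}$, $n \ge 3$, $a_1 = 2$, $a_2 = 3$; $a_n = F_{n+2}$, $n \ge 1$.
b) $b_n = b_{n-1} + b_{n-2}$, $n \ge 3$, $b_1 = 1$, $b_2 = 3$; $b_n = L_n$, $n \ge 1$.
13. $a_n = [(8 + 9\sqrt{2})/16][2 + 4\sqrt{2}]^n + [(8 - 9\sqrt{2})/16][2 - 4\sqrt{2}]^n$, $n \ge 0$
15. $a_n = 2^{F_n}$, where $F_n$ is the $n$th Fibonacci number for $n \ge 0$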
17. a) Far b) @) F, Git) Fy-1 Gili) Fyn c) 2+2:0, 24+3:1
d) These results provide a combinatorial proof that F,42 = (F, + Rip te - ++ A) +1.
19. (a, a), (B, B)
21. a) Proof (By the alternative form of the Principle of Mathematical Induction):
$$F_3 = 2 = (1 + \sqrt{9})/2 > (1 + \sqrt{5})/2 = \alpha = \alpha^{3-2},$$
$$F_4 = 3 = (3 + \sqrt{9})/2 > (3 + \sqrt{5})/2 = \alpha^2 = \alpha^{4-2},$$
so the result is true for these first two cases (where $n = 3, 4$). This establishes the basis step.
Assuming the truth of the statement for $n = 3, 4, 5, \ldots, k$ ($\ge 4$), where $k$ is a fixed (but
arbitrary) integer, we continue now with $n = k + 1$:
$$F_{k+1} = F_k + F_{k-1} > \alpha^{k-2} + \alpha^{(k-1)-2} = \alpha^{k-2} + \alpha^{k-3} = \alpha^{k-3}(\alpha + 1) = \alpha^{k-3}\cdot\alpha^2 = \alpha^{k-1} = \alpha^{(k+1)-2}.$$
Consequently, $F_n > \alpha^{n-2}$ for all $n \ge 3$, by the alternative form of the Principle of
Mathematical Induction.
23. $a_n = 2a_{n-1} + a_{n-2}$, $n \ge 2$, $a_0 = 1$, $a_1 = 3$;
$a_n = (1/2)[(1 + \sqrt{2})^{n+1} + (1 - \sqrt{2})^{n+1}]$, $n \ge 0$
25. $(7/10)(7^{10}) + (3/10)(-3)^{10} = 197{,}750{,}389$
27. $a_n = a_{n-1} + a_{n-2} + 2a_{n-3}$, $n \ge 4$, $a_1 = 1$, $a_2 = 2$, $a_3 = 5$;
$a_n = (4/7)(2)^n + (3/7)\cos(2n\pi/3) + (\sqrt{3}/21)\sin(2n\pi/3)$, $n \ge 1$
29. $x_n = 4(2^n) - 3$, $n \ge 0$   31. $a_n = \sqrt{51(4^n) - 35}$, $n \ge 0$
33. Since $\gcd(F_1, F_0) = 1 = \gcd(F_2, F_1)$, consider $n \ge 2$. Then
$F_3 = F_2 + F_1$ $(= 1)$
$F_4 = F_3 + F_2$
$F_5 = F_4 + F_3$
$\vdots$
$F_{n+1} = F_n + F_{n-1}$.
Reversing the order of these equations, we have the steps in the Euclidean algorithm for
computing the gcd of $F_{n+1}$ and $F_n$, $n \ge 2$. Since the last nonzero remainder is $F_1 = 1$, it follows
that $\gcd(F_{n+1}, F_n) = 1$ for all $n \ge 2$.
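To see the Euclidean algorithm walk back down the Fibonacci sequence, as described in Exercise 33, one can record the remainders it produces. This is an illustrative sketch; `euclid_trace` is a hypothetical helper written for this example, and it uses ordinary division with remainder, so its final steps may be grouped slightly differently from the equations listed above.

```python
def fib(n: int) -> int:
    """n-th Fibonacci number with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def euclid_trace(a: int, b: int):
    """Run the Euclidean algorithm on (a, b); return the gcd and the
    successive remainders, ending with the final remainder 0."""
    remainders = []
    while b != 0:
        a, b = b, a % b
        remainders.append(b)
    return a, remainders


# Consecutive Fibonacci numbers are relatively prime.
for n in range(1, 40):
    g, _ = euclid_trace(fib(n + 1), fib(n))
    assert g == 1
```

For example, starting from F_7 = 13 and F_6 = 8 the remainders are 5, 3, 2, 1, 0, which are themselves Fibonacci numbers in decreasing order.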
Section 10.3—p. 481
.a)a,=—(n+1)2,n>0 b) a, =3+n(n—1)?,n=0
©) ad, = 6(2")—5,n>0 dd) a, =2"4+nQ2""'),n>0
. a) a, =a,-; +n,n> lay =1 dy, = 1+ [n(n+1)]/2,n>0
b) by, = bp-1 +2, > 2, db) = 2, b, = 2n,n>1,b) = 1
a) ay = (3/4)(—1)" — (4/5)(—2)" + 1/20)3)", 2 > 0
b) a, = (2/9)(—2)" — (5/6)(1)(—2)" + (7/9), n = 0
. dy = A+ Bn+ Cr? — (3/4)n? + (5/24)n4 9, P = $117.68
11. a) a, = [(3/4)(3)" — 5(2)” + (7n/2) + (21/4)]'",n 20 b) a, =2,n >0
13. a) t, = 2t,-) +2" '.n>2,t =2:
th = (n+1)(22""'),n>1
b) t, = 44,1434 '), n> 2.% =4:
t = (14+3n)4"!,n>1
ec) ty =(l4+—Dalr’i n> lr =|) 21.
Section 10.4—-p. 487
~ a) a, = (1/21 +3"1,,n2=0 db) a, = 14+ [n(n — 1)(n — 1)]/6,n > 0
C) a, =5(Q2")-4,n>0 da, =2",n>0
3, a) ad, = 2"(1 —2n), by = n(2"*'),n =O
b) a, = (-3/4) + 1/2) 4:1) + 1/4"),
= (3/4) + (1/2)(n + 1) — 0/4)G"), n = 0
Section 10.5—p. 493
= (8) /[S6)4)]= 14
Lpe seers
NON DN DP 8
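3. $$\binom{2n-1}{n} - \binom{2n-1}{n-2} = \left[\frac{(2n-1)!}{n!\,(n-1)!}\right] - \left[\frac{(2n-1)!}{(n-2)!\,(n+1)!}\right]$$
$$= \frac{(2n-1)!\,(n+1)}{(n+1)!\,(n-1)!} - \frac{(2n-1)!\,(n-1)}{(n-1)!\,(n+1)!}$$
$$= \frac{(2n-1)!}{(n+1)!\,(n-1)!}\,[(n+1) - (n-1)]$$
$$= \frac{(2n-1)!\,(2)}{(n+1)!\,(n-1)!} = \frac{(2n-1)!\,(2n)}{(n+1)!\,n!} = \frac{(2n)!}{(n+1)(n!)(n!)}$$
$$= \frac{1}{n+1}\binom{2n}{n}$$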
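The algebra in Exercise 3 ends at the closed form (1/(n+1)) C(2n, n) for the Catalan numbers. The sketch below checks that a difference of two binomial coefficients agrees with that closed form; the specific binomials C(2n-1, n) and C(2n-1, n-2) are read off from the factorials in the derivation and should be treated as an assumption about the exercise statement.

```python
from math import comb


def catalan(n: int) -> int:
    """n-th Catalan number, (1/(n+1)) * C(2n, n)."""
    return comb(2 * n, n) // (n + 1)


def binomial_difference(n: int) -> int:
    """C(2n-1, n) - C(2n-1, n-2), defined here for n >= 2."""
    return comb(2 * n - 1, n) - comb(2 * n - 1, n - 2)


for n in range(2, 30):
    assert binomial_difference(n) == catalan(n)
```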
5. a) (1/9)(18) by LU/MEQ)P -o) LU/6)(8)ILA/9G)]_— dd) /6)('9)
7. a) |
|
|
| 2 (b(cd))
Cl
1 | oa a ((c) (lOc, a)
b) (iii) ((ab)c)d)e (iv) (ab) (c(de))
9. $a_n = a_0a_{n-1} + a_1a_{n-2} + a_2a_{n-3} + \cdots + a_{n-2}a_1 + a_{n-1}a_0$
Since $a_0 = 1$, $a_1 = 1$, $a_2 = 2$, and $a_3 = 5$, we find that $a_n$ = the $n$th Catalan number.
11. a)
  x    f_1(x)   f_2(x)   f_3(x)   f_4(x)   f_5(x)
  1      1        3        2        2        1
  2      2        3        2        3        3
  3      3        3        3        3        3
b) The functions in part (a) correspond with the following paths from (0, 0) to (3, 3).
c) The mountain ranges in Fig. 10.24 of the text.
d) For $n \in \mathbf{Z}^+$, the number of monotone increasing functions $f: \{1, 2, 3, \ldots, n\} \to
\{1, 2, 3, \ldots, n\}$, where $f(i) \ge i$ for all $1 \le i \le n$, is $b_n = (1/(n + 1))\binom{2n}{n}$, the $n$th Catalan
number. This follows from Exercise 3 in Section 1.5. There is a one-to-one correspondence
between the paths described in that exercise and the functions being dealt with here.
13. $(1/(n + 1))\binom{2n}{n}$, the $n$th Catalan number
15. a) $E_3 = 2$   b) $E_5 = 16$
c) For each rise/fall permutation, $n$ cannot be in the first position (unless $n = 1$); $n$ is the
second component of a rise in such a permutation. Consequently, $n$ must be at position 2 or
4 $\ldots$ or $2\lfloor n/2\rfloor$.
d) Consider the location of $n$ in a rise/fall permutation $x_1x_2x_3\cdots x_{n-1}x_n$ of $1, 2, 3, \ldots, n$. The
number $n$ is in position $2i$ for some $1 \le i \le \lfloor n/2\rfloor$. Here there are $2i - 1$ numbers that precede
$n$. These can be selected in $\binom{n-1}{2i-1}$ ways and give rise to $E_{2i-1}$ rise/fall permutations. The
$(n - 1) - (2i - 1) = n - 2i$ numbers that follow $n$ give rise to $E_{n-2i}$ rise/fall permutations.
Consequently, $E_n = \sum_{i=1}^{\lfloor n/2\rfloor}\binom{n-1}{2i-1}E_{2i-1}E_{n-2i}$, $n \ge 2$.
g) From parts (d) and (f)
$$E_n = \binom{n-1}{1}E_1E_{n-2} + \binom{n-1}{3}E_3E_{n-4} + \cdots + \binom{n-1}{2\lfloor n/2\rfloor - 1}E_{2\lfloor n/2\rfloor - 1}E_{n-2\lfloor n/2\rfloor}$$
$$E_n = \binom{n-1}{0}E_0E_{n-1} + \binom{n-1}{2}E_2E_{n-3} + \cdots + \binom{n-1}{2\lfloor (n-1)/2\rfloor}E_{2\lfloor (n-1)/2\rfloor}E_{n-2\lfloor (n-1)/2\rfloor - 1}$$
Adding these equations we have
$$2E_n = \sum_{i=0}^{n-1}\binom{n-1}{i}E_iE_{n-i-1} \quad\text{or}\quad E_n = (1/2)\sum_{i=0}^{n-1}\binom{n-1}{i}E_iE_{n-i-1}.$$
h) $E_6 = 61$, $E_7 = 272$
i) Consider the Maclaurin series expansions $\sec x = 1 + x^2/2! + 5x^4/4! + 61x^6/6! + \cdots$ and
$\tan x = x + 2x^3/3! + 16x^5/5! + 272x^7/7! + \cdots$. One finds that $\sec x + \tan x$ is the exponential
generating function of the sequence 1, 1, 1, 2, 5, 16, 61, 272, $\ldots$, namely, the sequence of
Euler numbers.
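The two recurrences for the Euler numbers in Exercise 15 can be run side by side. This is a sketch under the assumption E_0 = E_1 = 1, which matches the sequence quoted in part (i); the function names are invented for the example.

```python
from math import comb


def euler_numbers(count: int):
    """First `count` Euler numbers from the part (g) recurrence
    E_n = (1/2) * sum_{i=0}^{n-1} C(n-1, i) * E_i * E_(n-i-1), n >= 2."""
    e = [1, 1]
    for n in range(2, count):
        total = sum(comb(n - 1, i) * e[i] * e[n - 1 - i] for i in range(n))
        e.append(total // 2)
    return e[:count]


def euler_by_position_of_n(n: int, e) -> int:
    """Part (d): condition on the even position 2i occupied by n."""
    return sum(comb(n - 1, 2 * i - 1) * e[2 * i - 1] * e[n - 2 * i]
               for i in range(1, n // 2 + 1))


e = euler_numbers(12)
# Both recurrences should agree wherever part (d) applies (n >= 2).
for n in range(2, 12):
    assert euler_by_position_of_n(n, e) == e[n]
```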
Section 10.6—p. 504
1. a) $f(n) = (5/3)(4n^{\log_3 4} - 1)$ and $f \in O(n^{\log_3 4})$ for $n \in \{3^i \mid i \in \mathbf{N}\}$
b) $f(n) = 7(\log_5 n + 1)$ and $f \in O(\log_5 n)$ for $n \in \{5^i \mid i \in \mathbf{N}\}$
3. a) $f \in O(\log_b n)$ on $\{b^k \mid k \in \mathbf{N}\}$   b) $f \in O(n^{\log_b a})$ on $\{b^k \mid k \in \mathbf{N}\}$
5. a) $f(1) = 0$   $f(n) = 2f(n/2) + 1$
From Exercise 2(b), $f(n) = n - 1$.
b) The equation $f(n) = f(n/2) + (n/2)$ arises as follows: There are $n/2$ matches played in
the first round. Then there are $n/2$ players remaining, so we need $f(n/2)$ additional matches to
determine the winner.
7. $O(1)$
9. a)
$f(n) \le af(n/b) + cn$
$af(n/b) \le a^2f(n/b^2) + ac(n/b)$
$a^2f(n/b^2) \le a^3f(n/b^3) + a^2c(n/b^2)$
$\vdots$
$a^{k-1}f(n/b^{k-1}) \le a^kf(n/b^k) + a^{k-1}c(n/b^{k-1})$
Hence $f(n) \le a^kf(n/b^k) + cn[1 + (a/b) + (a/b)^2 + \cdots + (a/b)^{k-1}] = a^kf(1) +
cn[1 + (a/b) + (a/b)^2 + \cdots + (a/b)^{k-1}]$, because $n = b^k$. Since $f(1) \le c$ and $(n/b^k) = 1$, we
have $f(n) \le cn[1 + (a/b) + (a/b)^2 + \cdots + (a/b)^{k-1} + (a/b)^k] = (cn)\sum_{i=0}^{k}(a/b)^i$.
c) For $a \ne b$,
$$cn\sum_{i=0}^{k}(a/b)^i = cn\left[\frac{1 - (a/b)^{k+1}}{1 - (a/b)}\right] = c(b^k)\left[\frac{1 - (a/b)^{k+1}}{1 - (a/b)}\right]$$
$$= c\left[\frac{b^k - (a^{k+1}/b)}{1 - (a/b)}\right] = c\left[\frac{b^{k+1} - a^{k+1}}{b - a}\right] = c\left[\frac{a^{k+1} - b^{k+1}}{a - b}\right].$$
d) From part (c), $f(n) \le (c/(a - b))[a^{k+1} - b^{k+1}] = (ca/(a - b))a^k - (cb/(a - b))b^k$. But
$a^k = a^{\log_b n} = n^{\log_b a}$ and $b^k = n$, so $f(n) \le (ca/(a - b))n^{\log_b a} - (cb/(a - b))n$.
(i) When $a < b$, then $\log_b a < 1$, and $f \in O(n)$ on $\mathbf{Z}^+$.
(ii) When $a > b$, then $\log_b a > 1$, and $f \in O(n^{\log_b a})$ on $\mathbf{Z}^+$.
Supplementary
Exercises—p. 508 1. $\binom{n}{k+1} = \frac{n!}{(k+1)!\,(n-k-1)!} = \frac{(n-k)}{(k+1)}\cdot\frac{n!}{k!\,(n-k)!} = \left(\frac{n-k}{k+1}\right)\binom{n}{k}$
3. There are two cases to consider. Case 1 (1 is a summand): Here there are $p(n - 1, k - 1)$ ways
to partition $n - 1$ into exactly $k - 1$ summands. Case 2 (1 is not a summand): Here each
summand $s_1, s_2, \ldots, s_k > 1$. For $1 \le i \le k$, let $t_i = s_i - 1 \ge 1$. Then $t_1, t_2, \ldots, t_k$ provide a
partition of $n - k$ into exactly $k$ summands. These cases are exhaustive and disjoint, so by the
rule of sum, $p(n, k) = p(n - 1, k - 1) + p(n - k, k)$.
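The recurrence in Exercise 3 translates directly into a memoized function. The boundary conditions below (p(0, 0) = 1 and p(n, k) = 0 when n or k is nonpositive or k exceeds n) are assumptions added so that the recursion terminates; they are not stated in the answer itself.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def p(n: int, k: int) -> int:
    """Number of partitions of n into exactly k positive summands."""
    if n == 0 and k == 0:
        return 1          # the empty partition (assumed base case)
    if n <= 0 or k <= 0 or k > n:
        return 0          # no such partitions (assumed base case)
    # Case 1: 1 is a summand.  Case 2: every summand exceeds 1.
    return p(n - 1, k - 1) + p(n - k, k)


# Summing over k gives the total number of partitions of n.
total_partitions_of_7 = sum(p(7, k) for k in range(1, 8))
```

With these base cases, `total_partitions_of_7` comes out to 15, the number of partitions of 7.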
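5. b) Conjecture: For $n \in \mathbf{Z}^+$, $A^n = \begin{bmatrix} F_{n+1} & F_n \\ F_n & F_{n-1} \end{bmatrix}$, where $F_n$ denotes the $n$th Fibonacci number.
Proof: For $n = 1$, $A = A^1 = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} F_2 & F_1 \\ F_1 & F_0 \end{bmatrix}$, so the result is true in this case.
Assume the result true for $n = k \ge 1$. That is, $A^k = \begin{bmatrix} F_{k+1} & F_k \\ F_k & F_{k-1} \end{bmatrix}$. For $n = k + 1$,
$$A^n = A^{k+1} = A^k \cdot A = \begin{bmatrix} F_{k+1} & F_k \\ F_k & F_{k-1} \end{bmatrix}\begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} F_{k+1} + F_k & F_{k+1} \\ F_k + F_{k-1} & F_k \end{bmatrix} = \begin{bmatrix} F_{k+2} & F_{k+1} \\ F_{k+1} & F_k \end{bmatrix}.$$
Consequently, the result is true for all $n \in \mathbf{Z}^+$, by the Principle of Mathematical Induction.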
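The conjecture in Exercise 5(b) can be confirmed by repeated multiplication. A small sketch, assuming A is the 2-by-2 matrix with rows (1, 1) and (1, 0) as in the proof; the helper names are hypothetical, and plain nested lists are used so that no external library is needed.

```python
def fib(n: int) -> int:
    """n-th Fibonacci number with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def mat_mul(x, y):
    """Product of two 2x2 integer matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_pow(a, n: int):
    """a raised to the n-th power by repeated multiplication (n >= 0)."""
    result = [[1, 0], [0, 1]]
    for _ in range(n):
        result = mat_mul(result, a)
    return result


A = [[1, 1], [1, 0]]
for n in range(1, 30):
    assert mat_pow(A, n) == [[fib(n + 1), fib(n)], [fib(n), fib(n - 1)]]
```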
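7. $(-1, 0)$, $(\alpha, \alpha)$, $(\beta, \beta)$
9. a) Since $\alpha^2 = \alpha + 1$, it follows that $\alpha^2 + 1 = 2 + \alpha$ and $(2 + \alpha)^2 = 4 + 4\alpha + \alpha^2 =
4(1 + \alpha) + \alpha^2 = 5\alpha^2$.
c) $$\sum_{k=0}^{2n}\binom{2n}{k}F_{2k+m} = \sum_{k=0}^{2n}\binom{2n}{k}\left[\frac{\alpha^{2k+m} - \beta^{2k+m}}{\alpha - \beta}\right]$$
$$= (1/(\alpha - \beta))\left[\sum_{k=0}^{2n}\binom{2n}{k}\alpha^{2k}\alpha^m - \sum_{k=0}^{2n}\binom{2n}{k}\beta^{2k}\beta^m\right]$$
$$= (1/(\alpha - \beta))[\alpha^m(1 + \alpha^2)^{2n} - \beta^m(1 + \beta^2)^{2n}]$$
$$= (1/(\alpha - \beta))[\alpha^m(2 + \alpha)^{2n} - \beta^m(2 + \beta)^{2n}]$$
$$= (1/(\alpha - \beta))[\alpha^m((2 + \alpha)^2)^n - \beta^m((2 + \beta)^2)^n]$$
$$= (1/(\alpha - \beta))[\alpha^m(5\alpha^2)^n - \beta^m(5\beta^2)^n]$$
$$= 5^n(1/(\alpha - \beta))[\alpha^{2n+m} - \beta^{2n+m}] = 5^nF_{2n+m}$$
11. $c_n = F_{n+2}$, the $(n + 2)$-nd Fibonacci number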
13. a) Fri b) (i) l= (1"-3°0) (ii) (n79'1) (iil) (." 979) (iv) (,"3°3) (v) faust
©) Frat = io (".°) ~ as (me)
15. a) For each derangement, 1 is placed in position $i$, where $2 \le i \le n$. Two things then occur.
Case 1 ($i$ is in position 1): Here the other $n - 2$ integers are deranged in $d_{n-2}$ ways. With $n - 1$
choices for $i$, this results in $(n - 1)d_{n-2}$ such derangements. Case 2 [$i$ is not in position 1 (or
position $i$)]: Here we consider 1 as the new natural position for $i$, so there are $n - 1$ elements to
derange. With $n - 1$ choices for $i$, we have $(n - 1)d_{n-1}$ derangements. Since the two cases are
exhaustive and disjoint, the result follows from the rule of sum.
b) $d_0 = 1$   c) $d_n - nd_{n-1} = d_{n-2} - (n - 2)d_{n-3}$
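The two cases in Exercise 15(a) add up to the recurrence d_n = (n - 1)(d_{n-1} + d_{n-2}). The sketch below compares that recurrence with a brute-force count over all permutations; the starting values d_0 = 1 and d_1 = 0 are assumed, and the function names are illustrative.

```python
from itertools import permutations


def derangements_by_recurrence(n: int) -> int:
    """d_n from d_m = (m - 1) * (d_(m-1) + d_(m-2)), with d_0 = 1, d_1 = 0."""
    d = [1, 0]
    for m in range(2, n + 1):
        d.append((m - 1) * (d[m - 1] + d[m - 2]))
    return d[n]


def derangements_by_enumeration(n: int) -> int:
    """Count permutations of 0..n-1 that leave no element in its own position."""
    return sum(all(perm[i] != i for i in range(n))
               for perm in permutations(range(n)))


for n in range(0, 8):
    assert derangements_by_recurrence(n) == derangements_by_enumeration(n)
```

The enumeration grows as n!, so it is only practical for small n, but that is enough to catch an indexing slip in the recurrence.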
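17. a) $a_n = \binom{2n}{n}$, $n \ge 0$   b) $r = 1$, $s = -4$, $t = -1/2$
d) $b_n = (1/(2n - 1))\binom{2n}{n}$, $n \ge 1$; $b_0 = 0$
19. $c = \alpha$ or $c = \beta$   21. $p = -8$
23. $a_n = a_{n-1} + a_{n-2}$, $n \ge 3$, $a_1 = 1$, $a_2 = 2$; $a_n = F_{n+1}$, $n \ge 1$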
25. a) ($n = 0$) $F_1^2 - F_0F_1 - F_0^2 = 1^2 - 0\cdot 1 - 0^2 = 1$
($n = 1$) $F_2^2 - F_1F_2 - F_1^2 = 1^2 - 1\cdot 1 - 1^2 = -1$
($n = 2$) $F_3^2 - F_2F_3 - F_2^2 = 2^2 - 1\cdot 2 - 1^2 = 1$
($n = 3$) $F_4^2 - F_3F_4 - F_3^2 = 3^2 - 2\cdot 3 - 2^2 = -1$
b) Conjecture: For $n \ge 0$,
$$F_{n+1}^2 - F_nF_{n+1} - F_n^2 = \begin{cases} 1, & n \text{ even} \\ -1, & n \text{ odd.} \end{cases}$$
c) Proof: The result is true for $n = 0, 1, 2, 3$, by the calculations in part (a). Assume the result true
for $n = k$ ($\ge 3$). There are two cases to consider, namely, $k$ even and $k$ odd. We shall establish the
result for $k$ even, the proof for $k$ odd being similar. Our induction hypothesis tells us that
$F_{k+1}^2 - F_kF_{k+1} - F_k^2 = 1$. When $n = k + 1$ ($\ge 4$) we find that
$F_{k+2}^2 - F_{k+1}F_{k+2} - F_{k+1}^2 = (F_{k+1} + F_k)^2 - F_{k+1}(F_{k+1} + F_k) - F_{k+1}^2 = F_{k+1}^2 + 2F_{k+1}F_k + F_k^2 -
F_{k+1}^2 - F_{k+1}F_k - F_{k+1}^2 = F_{k+1}F_k + F_k^2 - F_{k+1}^2 = -[F_{k+1}^2 - F_kF_{k+1} - F_k^2] = -1$. The result
follows for all $n \in \mathbf{N}$, by the Principle of Mathematical Induction.
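The conjecture in Exercise 25 can be tested over many values of n at once. This is a minimal sketch with hypothetical helper names, assuming F_0 = 0 and F_1 = 1.

```python
def fib_pairs(count: int):
    """Yield (n, F_n, F_(n+1)) for n = 0, 1, ..., count - 1."""
    a, b = 0, 1
    for n in range(count):
        yield n, a, b
        a, b = b, a + b


def identity_value(f_n: int, f_next: int) -> int:
    """F_(n+1)^2 - F_n * F_(n+1) - F_n^2."""
    return f_next ** 2 - f_n * f_next - f_n ** 2


# The value should alternate: +1 for even n, -1 for odd n.
for n, f_n, f_next in fib_pairs(60):
    assert identity_value(f_n, f_next) == (-1) ** n
```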
27. a) $r(C_1, x) = 1 + x$   $r(C_4, x) = 1 + 4x + 3x^2$
$r(C_2, x) = 1 + 2x$   $r(C_5, x) = 1 + 5x + 6x^2 + x^3$
$r(C_3, x) = 1 + 3x + x^2$   $r(C_6, x) = 1 + 6x + 10x^2 + 4x^3$
In general, for $n \ge 3$, $r(C_n, x) = r(C_{n-1}, x) + x\,r(C_{n-2}, x)$.
b) $r(C_1, 1) = 2$   $r(C_3, 1) = 5$   $r(C_5, 1) = 13$
$r(C_2, 1) = 3$   $r(C_4, 1) = 8$   $r(C_6, 1) = 21$
[Note: For $1 \le i \le n$, if one "straightens out" the chessboard $C_i$ in Fig. 10.28, the result is a
$1 \times i$ chessboard, like those studied in Exercise 26.]
29. a) The partitions counted in f(n, m) fall into two categories:
(1) Partitions where m is a summand. These are counted in f(n — m, m), for m may occur
more than once.
(2) Partitions where $m$ is not a summand, so that $m - 1$ is the largest possible summand.
These partitions are counted in $f(n, m - 1)$.
Since these two categories are exhaustive and mutually disjoint, it follows that f(n, m) =
f(n—m,m)+ f(n,m — 1).
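The recurrence f(n, m) = f(n - m, m) + f(n, m - 1) from Exercise 29 can be turned into a short memoized function. The base cases used here (one partition of 0, none of a negative number, and none of a positive number when no summands are allowed) are assumptions added for the sketch.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def f(n: int, m: int) -> int:
    """Number of partitions of n in which no summand exceeds m."""
    if n == 0:
        return 1      # the empty partition (assumed base case)
    if n < 0 or m == 0:
        return 0      # nothing to count (assumed base case)
    # m is a summand (and may occur again), or m is not a summand.
    return f(n - m, m) + f(n, m - 1)
```

For instance, `f(6, 3)` counts the partitions of 6 into 1's, 2's, and 3's, and `f(7, 7)` counts all partitions of 7.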
Chapter 11
An Introduction to Graph Theory
Section 11.1—p. 518
1. a) To represent the air routes traveled among a certain set of cities by a particular airline.
b) To represent an electrical network. Here the vertices can represent switches, transistors, and
so on, and an edge (x, y) indicates the existence of a wire connecting x to y.
c) Let the vertices represent a set of job applicants and a set of open positions in a corporation.
Draw an edge $(A, b)$ to denote that applicant $A$ is qualified for position $b$. Then all open
positions can be filled if the resulting graph provides a matching between a subset of the
applicants and the open positions.
3. 6 5. 953
r44
b) $\{(g, d), (d, e), (e, a)\}$; $\{(g, b), (b, c), (c, d), (d, e), (e, a)\}$
c) Two: one of $\{(b, c), (c, d)\}$ and one of $\{(b, f), (f, g), (g, d)\}$
d) No
e) Yes. Travel the path $\{(c, d), (d, e), (e, a), (a, b), (b, f), (f, g)\}$
f) Yes. Travel the trail $\{(g, b), (b, f), (f, g), (g, d), (d, b), (b, c), (c, d), (d, e), (e, a),
(a, b)\}$.
9. If $\{a, b\}$ is not part of a cycle, then its removal disconnects $a$ and $b$ (and $G$). If not, there is a
path P from a to b, and P together with {a, b} provides a cycle containing {a, b}. Conversely,
if the removal of {a, b} from G disconnects G, then there exist x, y, € V such that the only path
P from x to y contains e = {a, b}. If e were part of a cycle C, then the edges in
(P — {e}) U(C — {e}) would contain a second path connecting x to y.
11. a) Yes   b) No   c) $n - 1$
13. The partition of $V$ induced by $\mathcal{R}$ yields the (connected) components of $G$.
15. The number of closed $v$-$v$ walks of length $n \ge 1$ is $F_{n+1}$, the $(n + 1)$-st Fibonacci number.
Section 11.2—p. 528
.a) 3 b) G; = (U), where U = {a, b, d, f, gh,i, 7};G) = G — {c)
c) G2 = (W), where W = {b, c,d, f, 8, i, j}; Go = G — fa, h}
d) be e) :
; Cc d be
— f c q
3. a) 2?=512 b)3~ oe) 2°
5. $G$ is (or is isomorphic to) $K_n$, where $n = |V|$.
7. (i) R Y Ww BE (11) No solution
B} 1 |y eR} 2 |B y¥/ 3 |R Ww «| Ww
W B Y R
(iii) Ww B Y R
R 1 Ww W 2 B Y 3 R B 4 Y
Y R B Ww
9. a) No   b) Yes. Correspond $a$ with $u$, $b$ with $w$, $c$ with $x$, $d$ with $y$, $e$ with $v$, and $f$ with $z$.
11. a) If $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$ are isomorphic, then there is a function $f: V_1 \to V_2$
that is one-to-one and onto and preserves adjacencies. If $x, y \in V_1$ and $\{x, y\} \notin E_1$, then
$\{f(x), f(y)\} \notin E_2$. Hence the same function $f$ preserves adjacencies for $\overline{G_1}$, $\overline{G_2}$ and can be
used to define an isomorphism for $\overline{G_1}$, $\overline{G_2}$. The converse follows in a similar way.
b) They are not isomorphic. The complement of the graph containing vertex $a$ is a cycle of
length 8. The complement of the other graph is the disjoint union of two cycles of length 4.
13. If $G$ is the cycle with edges $\{a, b\}$, $\{b, c\}$, $\{c, d\}$, $\{d, e\}$, and $\{e, a\}$, then $\overline{G}$ is the cycle with
edges $\{a, c\}$, $\{c, e\}$, $\{e, b\}$, $\{b, d\}$, and $\{d, a\}$. Hence $G$ and $\overline{G}$ are isomorphic. Conversely, if $G$
is a cycle on $n$ vertices and $G$, $\overline{G}$ are isomorphic, then $n = \frac{1}{2}\binom{n}{2}$, or $n = \frac{1}{4}(n)(n - 1)$, and $n = 5$.
15. a) Here $f$ must also maintain directions. So $(a, b) \in E_1$ if and only if $(f(a), f(b)) \in E_2$.
b) They are not isomorphic. Consider vertex a in the first graph. It is incident to one vertex and
incident from two other vertices. No vertex in the other graph has this property.
17. nv —3n4+3
Section 11.3—p. 537
- a) |Vl|=6 ~~ Db) |[V| =1 or2 or3 or 5 or6 or 10 or 15 or 30
(In the first four cases, G must be a multigraph; when |V| = 30, G is disconnected.)
c) |V|=6
a) $|V_1| = 8 = |V_2|$; $|E_1| = 14 = |E_2|$
b) For $V_1$ we find that $\deg(a) = 3$, $\deg(b) = 4$, $\deg(c) = 4$, $\deg(d) = 3$, $\deg(e) = 3$,
$\deg(f) = 4$, $\deg(g) = 4$, and $\deg(h) = 3$. For $V_2$ we have $\deg(s) = 3$, $\deg(t) = 4$, $\deg(u) = 4$,
$\deg(v) = 3$, $\deg(w) = 4$, $\deg(x) = 3$, $\deg(y) = 3$, $\deg(z) = 4$. Hence each of the two graphs
has four vertices of degree 3 and four of degree 4.
c) Despite the results in parts (a) and (b), the graphs $G_1$ and $G_2$ are not isomorphic.
In the graph $G_2$ the four vertices of degree 4, namely, $t$, $u$, $w$, and $z$, are on a cycle of
length 4. For the graph $G_1$ the vertices $b$, $c$, $f$, and $g$, each of degree 4, do not lie on a cycle
of length 4.
A second way to observe that $G_1$ and $G_2$ are not isomorphic is to consider once again the
vertices of degree 4 in each graph. In $G_1$ these vertices induce a disconnected subgraph
consisting of the two edges $\{b, c\}$ and $\{f, g\}$. The four vertices of degree 4 in graph $G_2$ induce a
connected subgraph that has five edges: every possible edge except $\{u, z\}$.
7a) 19 by) Or, (4) (Note: No assumption about connectedness is made here.)
9. a) 16 b) 2'° = 524,288
11. The number of edges in $K_n$ is $\binom{n}{2} = n(n - 1)/2$. If the edges of $K_n$ can be partitioned into such
cycles of length 4, then 4 divides $\binom{n}{2}$ and $\binom{n}{2} = 4t$, for some $t \in \mathbf{Z}^+$. For each vertex $v$ that
appears in a cycle, there are two edges (of $K_n$) incident to $v$. Consequently, each vertex $v$ of $K_n$
has even degree, so $n$ is odd. Therefore, $n - 1$ is even and as $4t = \binom{n}{2} = n(n - 1)/2$, it follows
that $8t = n(n - 1)$. So 8 divides $n(n - 1)$, and since $n$ is odd, it follows (from the Fundamental
Theorem of Arithmetic) that 8 divides $n - 1$. Hence $n - 1 = 8k$, or $n = 8k + 1$, for some
$k \in \mathbf{Z}^+$.
13. $\delta|V| \le \sum_{v \in V}\deg(v) \le \Delta|V|$. Since $2|E| = \sum_{v \in V}\deg(v)$, it follows that $\delta|V| \le 2|E| \le \Delta|V|$,
so $\delta \le 2(e/n) \le \Delta$.
15. Start with a cycle $v_1 \to v_2 \to v_3 \to \cdots \to v_{2k-1} \to v_{2k} \to v_1$. Then draw the $k$ edges $\{v_1, v_{k+1}\}$,
$\{v_2, v_{k+2}\}$, $\ldots$, $\{v_i, v_{i+k}\}$, $\ldots$, $\{v_k, v_{2k}\}$. The resulting graph has $2k$ vertices each of degree 3.
17. (Corollary 11.1). Let $V = V_1 \cup V_2$, where $V_1$ ($V_2$) contains all vertices of odd (even) degree.
Then $2|E| - \sum_{v \in V_2}\deg(v) = \sum_{v \in V_1}\deg(v)$ is an even integer. For $|V_1|$ odd, $\sum_{v \in V_1}\deg(v)$ is
odd.
(Corollary 11.2). For the converse let $G = (V, E)$ have an Euler trail with $a$, $b$ as the
starting and terminating vertices. Add the edge $\{a, b\}$ to $G$ to form the larger graph
$G_1 = (V, E_1)$ where $G_1$ has an Euler circuit. Hence $G_1$ is connected and each vertex in $G_1$ has
even degree. When we remove edge $\{a, b\}$ from $G_1$, the vertices in $G$ will have the same even
degree except for $a$, $b$; $\deg_G(a) = \deg_{G_1}(a) - 1$, $\deg_G(b) = \deg_{G_1}(b) - 1$, so the vertices $a$, $b$
have odd degree in $G$. Also, since the edges in $G$ form an Euler trail, $G$ is connected.
19. a) Let $a, b, c, x, y \in V$ with $\deg(a) = \deg(b) = \deg(c) = 1$, $\deg(x) = 5$, and $\deg(y) = 7$.
Since $\deg(y) = 7$, $y$ is adjacent to all of the other (seven) vertices in $V$. Therefore vertex $x$ is
not adjacent to any of the vertices $a$, $b$, and $c$. Since $x$ cannot be adjacent to itself, unless we
have loops, it follows that $\deg(x) \le 4$, and we cannot draw a graph for the given conditions.
21. n odd; n = 2 23. Yes
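25. a) (i) 13   (ii) 25   (iii) 41   (iv) $2n^2 - 2n + 1$
b) (i) 12   (ii) 24   (iii) 40   (iv) $2n^2 - 2n$
27. In any directed graph (or multigraph), $\sum_{v \in V}\mathrm{od}(v) = |E| = \sum_{v \in V}\mathrm{id}(v)$, so
$\sum_{v \in V}[\mathrm{od}(v) - \mathrm{id}(v)] = 0$. For each $v \in V$, $\mathrm{od}(v) + \mathrm{id}(v) = n - 1$, so
$$0 = (n - 1)\cdot 0 = \sum_{v \in V}(n - 1)[\mathrm{od}(v) - \mathrm{id}(v)]$$
$$= \sum_{v \in V}[\mathrm{od}(v) + \mathrm{id}(v)][\mathrm{od}(v) - \mathrm{id}(v)]$$
$$= \sum_{v \in V}[(\mathrm{od}(v))^2 - (\mathrm{id}(v))^2],$$
and the result follows.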
29. a) and b)
31. Let $|V| = n \ge 2$. Since $G$ is loop-free and connected, for all $x \in V$ we have $1 \le \deg(x) \le
n - 1$. Apply the pigeonhole principle with the vertices as the pigeons and the $n - 1$ possible
degrees as the pigeonholes.
33. a) Yes   b) Yes   c) No
35. No. Let each person represent a vertex for a graph. If v, w represent two of these people, draw
the edge {v, w} if the two shake hands. If the situation were possible, then we would have a
graph with 15 vertices, each of degree 3. So the sum of the degrees of the vertices would be 45,
an odd integer. This contradicts Theorem 11.2.
37. Assign the Gray code {00, 01, 11, 10} to the four horizontal levels: top — 00; second (from the
top) —01; second (from the bottom) — 11; bottom — 10. Likewise, assign the same code to the
four vertical levels: left (or, first) — 00; second— 01; third — 11; right (or, fourth) — 10. This
provides the labels for $p_1, p_2, \ldots, p_{16}$, where, for instance, $p_1$ has the label (00, 00), $p_2$ has
the label (01, 00), $\ldots$, $p_7$ has the label (11, 01), $\ldots$, $p_{11}$ has the label (11, 11), $\ldots$, $p_{15}$ has
the label (11, 10), and $p_{16}$ has the label (10, 10).
Define the function $f$ from the set of 16 vertices of this grid to the vertices of $Q_4$ by
$f((ab, cd)) = abcd$. Here $f((ab, cd)) = f((a_1b_1, c_1d_1)) \Rightarrow abcd = a_1b_1c_1d_1 \Rightarrow a = a_1$,
$b = b_1$, $c = c_1$, $d = d_1 \Rightarrow (ab, cd) = (a_1b_1, c_1d_1) \Rightarrow f$ is one-to-one. Since the domain and
codomain of $f$ both contain 16 vertices, it follows from Theorem 5.11 that $f$ is also onto.
Finally, let $\{(ab, cd), (wx, yz)\}$ be an edge in the grid. Then either $ab = wx$ and $cd$, $yz$ differ
in one component or $cd = yz$ and $ab$, $wx$ differ in one component. Suppose that $ab = wx$ and
$c = y$, but $d \ne z$. Then $\{abcd, wxyz\}$ is an edge in $Q_4$. The other cases follow in a similar way.
Conversely, suppose that $\{f((a_1b_1, c_1d_1)), f((w_1x_1, y_1z_1))\}$ is an edge in $Q_4$. Then $a_1b_1c_1d_1$,
$w_1x_1y_1z_1$ differ in exactly one component, say the first. Then in the grid, there is an edge for
the vertices $(0b_1, c_1d_1)$, $(1b_1, c_1d_1)$. The arguments are similar for the other three components.
Consequently, $f$ establishes an isomorphism between the three-by-three grid and a subgraph of
$Q_4$. (Note: The three-by-three grid has 24 edges while $Q_4$ has 32 edges.)
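The Gray code labeling in Exercise 37 can be checked mechanically: every grid edge should join two labels that differ in exactly one bit, which is the adjacency rule of the hypercube. The sketch below models the grid as 4 columns by 4 rows of vertices; the coordinate convention and the function names are assumptions made for this illustration.

```python
GRAY = ["00", "01", "11", "10"]  # Gray code assigned to the four levels


def grid_edges():
    """Edges of the grid whose vertices are (col, row) with 0 <= col, row <= 3."""
    edges = []
    for col in range(4):
        for row in range(4):
            if col + 1 < 4:
                edges.append(((col, row), (col + 1, row)))
            if row + 1 < 4:
                edges.append(((col, row), (col, row + 1)))
    return edges


def label(vertex) -> str:
    """f((ab, cd)) = abcd: concatenate the Gray codes of the two levels."""
    col, row = vertex
    return GRAY[col] + GRAY[row]


def hamming(u: str, v: str) -> int:
    """Number of positions in which two bit strings differ."""
    return sum(a != b for a, b in zip(u, v))


edges = grid_edges()
labels = {label((c, r)) for c in range(4) for r in range(4)}
# 16 distinct labels, and each of the 24 grid edges is a hypercube edge.
assert len(labels) == 16
assert all(hamming(label(u), label(v)) == 1 for u, v in edges)
```

The grid has 24 edges while a 4-dimensional hypercube has 32, so the image of the grid is a proper subgraph, as the note in the answer says.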
Section 11.4—p. 553
1. In this situation vertex $b$ is in the region formed by the edges $\{a, d\}$, $\{d, c\}$, $\{c, a\}$, and vertex $e$
is outside of this region. Hence the edge $\{b, e\}$ will cross one of the edges $\{a, d\}$, $\{d, c\}$, or
$\{a, c\}$ (as shown).
3. a)
  Graph       Number of Vertices   Number of Edges
  K_{4,7}            11                  28
  K_{7,11}           18                  77
  K_{m,n}          m + n                 mn
b) $m = 6$
5. a) Bipartite   b) Bipartite   c) Not bipartite
- a) (3)(3) b) m(3) +n(3) = (1/2)Gnn)[m +n — 2]
c) (m)(n)(m — In — 1) = 4(%)(3)
- a) 6 — b) (1/2)(7)(3)(6)(2)(5)
C1) (4) = 2520 c) 50,295,168,000
d) (1/2)(n)Qm)(n — I) (m — 1)(@ — 2)--- 2) — (m +: 1)))(n — m)
11. Partition $V$ as $V_1 \cup V_2$ with $|V_1| = m$, $|V_2| = v - m$. If $G$ is bipartite, then the maximum number
of edges that $G$ can have is $m(v - m) = -[m - (v/2)]^2 + (v/2)^2$, a function of $m$. For a given
value of $v$, when $v$ is even, $m = v/2$ maximizes $m(v - m) = (v/2)[v - (v/2)] = (v/2)^2$. For $v$
odd, $m = (v - 1)/2$ or $m = (v + 1)/2$ maximizes $m(v - m) = [(v - 1)/2][v - ((v - 1)/2)] =
[(v - 1)/2][(v + 1)/2] = [(v + 1)/2][v - ((v + 1)/2)] = (v^2 - 1)/4 = \lfloor (v/2)^2\rfloor < (v/2)^2$.
Hence if $|E| > (v/2)^2$, then $G$ cannot be bipartite.
13. a) [Figure: a graph on the vertices $a$ through $j$, labeled as follows.]
a: {1, 2}   f: {4, 5}
b: {3, 4}   g: {2, 5}
c: {1, 5}   h: {2, 3}
d: {2, 4}   i: {1, 3}
e: {3, 5}   j: {1, 4}
b) $G$ is (isomorphic to) the Petersen graph. [See Fig. 11.52(a).]
15. mn must be even
17. a) There are 17 vertices, 34 edges, and 19 regions, and $v - e + r = 17 - 34 + 19 = 2$.
b) Here we find 10 vertices, 24 edges, and 16 regions, and $v - e + r = 10 - 24 + 16 = 2$.
19. 10
21. If not, $\deg(v) \ge 6$ for all $v \in V$. Then $2e = \sum_{v \in V}\deg(v) \ge 6|V|$ so $e \ge 3|V|$, contradicting
$e \le 3|V| - 6$ (Corollary 11.3).
23. a) $2e \ge kr = k(2 + e - v) \Rightarrow (2 - k)e \ge k(2 - v) \Rightarrow e \le [k/(k - 2)](v - 2)$   b) 4
c) In $K_{3,3}$, we have $e = 9$ and $v = 6$. $[k/(k - 2)](v - 2) = (4/2)(4) = 8 < 9 = e$. Since $K_{3,3}$
is connected, it must be nonplanar.
d) Here $k = 5$, $v = 10$, $e = 15$, and $[k/(k - 2)](v - 2) = (5/3)(8) = (40/3) < 15 = e$. The
Petersen graph is connected, so it must be nonplanar.
25. a) The dual for the tetrahedron [Fig. 11.59(b)] is the graph itself. For the graph (cube) in
Fig. 11.59(d) the dual is the octahedron, and vice versa. Likewise, the dual of the dodecahedron
is the icosahedron, and vice versa.
b) For $n \in \mathbf{Z}^+$, $n \ge 3$, the dual of the wheel graph $W_n$ is $W_n$ itself.
27. [Figure: graph with vertices including $e$ and $f$.]
29. a) As we mentioned in the remark following Example 11.18, when $G_1$, $G_2$ are homeomorphic
graphs, then they may be regarded as isomorphic except, possibly, for vertices of degree 2.
Consequently, two such graphs will have the same number of vertices of odd degree.
b) Now if $G_1$ has an Euler trail, then $G_1$ (is connected and) has all vertices of even
degree except two, those being the vertices at the beginning and end of the Euler trail. From
part (a) $G_2$ is likewise connected with all vertices of even degree, except for two of odd degree.
Consequently, $G_2$ has an Euler trail. (The converse follows in a similar way.)
c) If $G_1$ has an Euler circuit, then $G_1$ (is connected and) has all vertices of even degree. From
part (a) $G_2$ is likewise connected with all vertices of even degree, so $G_2$ has an Euler circuit.
(The converse follows in a similar manner.)
Section 11.5-—p. 562
. a) WA b) A c) d) V7
. a) Hamiltoncycle:a—> g>k->irhob+c+d>j-fr-e>a
b) Hamilton cycle:a>d—-+>boe>+g>j-rir foh-+»coa
c) Hamilton cycle:a—>h>e-> f>g->ir-d>cob-a
d) Hamilton path:a—-c73d->b->e- fog
e) Hamilton path:a> b+c+>d->e-> jrivch>go» frkolisem>n->o
f) Hamilton cycle:a +» bocod>e>jrirnh>gol>-mono>o7>tH-
soroqgopp-kofoa
§. d) If we remove any one of the vertices a, b, or g, the resulting subgraph has a Hamilton cycle.
For example, upon removing vertex a we find the Hamilton cycle b-> d+>c-— f+ ge
—> b.
e) The following Hamilton cycle exists if we remove vertex g:ad > b> c—>d—>e-> jroo
senoichom+>l+k— f >a. Asymmetric situation results upon removing vertex i.
7. a) $(1/2)(n - 1)!$   b) 10   c) 9
9. Let $G = (V, E)$ be a loop-free undirected graph with no odd cycles. We assume that $G$ is
connected; otherwise, we work with the components of $G$. Select any vertex $x$ in $V$, and let
$V_1 = \{v \in V \mid d(x, v)$, the length of a shortest path between $x$ and $v$, is odd$\}$ and
$V_2 = \{w \in V \mid d(x, w)$, the length of a shortest path between $x$ and $w$, is even$\}$. Note that
(i) $x \in V_2$, (ii) $V = V_1 \cup V_2$, and (iii) $V_1 \cap V_2 = \emptyset$. We claim that each edge $\{a, b\}$ in $E$ has one
vertex in $V_1$ and the other vertex in $V_2$. For suppose that $e = \{a, b\} \in E$ with $a, b \in V_1$. (The
proof for $a, b \in V_2$ is similar.) Let $E_a = \{\{a, v_1\}, \{v_1, v_2\}, \ldots, \{v_{m-1}, x\}\}$ be the $m$ edges in a
shortest path from $a$ to $x$, and let $E_b = \{\{b, v_1'\}, \{v_1', v_2'\}, \ldots, \{v_{n-1}', x\}\}$ be the $n$ edges in a
shortest path from $b$ to $x$. Note that $m$ and $n$ are both odd. If $\{v_1, v_2, \ldots, v_{m-1}\} \cap \{v_1', v_2', \ldots,
v_{n-1}'\} = \emptyset$, then the set of edges $E' = \{\{a, b\}\} \cup E_a \cup E_b$ provides an odd cycle in $G$.
Otherwise, let $w$ ($\ne x$) be the first vertex where the paths come together, and let $E'' =
\{\{a, b\}\} \cup \{\{a, v_1\}, \{v_1, v_2\}, \ldots, \{v_i, w\}\} \cup \{\{b, v_1'\}, \{v_1', v_2'\}, \ldots, \{v_j', w\}\}$,
for some $1 \le i \le m - 1$ and $1 \le j \le n - 1$. Then either $E''$ provides an odd cycle for $G$ or
$E' - E''$ contains an odd cycle for $G$.
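The partition used in Exercise 9, splitting vertices by the parity of their distance from a chosen vertex x, is what a breadth-first search computes. The sketch below is one way to carry that out; the adjacency-dictionary representation and the name `bipartition` are assumptions for the example, and the function handles each component in turn.

```python
from collections import deque


def bipartition(adj):
    """Split vertices by parity of BFS distance within each component.

    Returns (odd_vertices, even_vertices), or None if some edge joins two
    vertices of the same parity (which signals an odd cycle).
    `adj` maps each vertex to a list of its neighbors.
    """
    dist = {}
    for start in adj:
        if start in dist:
            continue
        dist[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                elif (dist[w] - dist[u]) % 2 == 0:
                    return None  # edge inside one parity class
    odd = {v for v, d in dist.items() if d % 2 == 1}
    even = set(dist) - odd
    return odd, even


# A 4-cycle splits into two classes; a 5-cycle does not.
cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
cycle5 = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
```

Calling `bipartition(cycle4)` gives the two classes {1, 3} and {0, 2}, while `bipartition(cycle5)` returns `None`.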
11. a) a a
c b c b
b) a a . b
d . c
id(a) = 90 od(a)=3 id(a)=0
id(b) = 1 od(b) = 1 id(b) = 2
id(c) = 3 od(c)= 1 id(c) =2
id(d) =2 od(d) =1 id(d) =2
a b a . b
' ~t
> ‘
d : c d - c
od(a)=1 id(a) =2 od(a) =0 id(a) =3
od(b) = 1 id(b) =2 od(b) =2 id(b) = 1
od(c) =2 id(c) = 1 od(c) = 2 id(c)=1
od(d)=2 id(d) = 1 od(d@) =2 id(d)=1
13. Proof: If not, there exists a vertex $x$ such that $(v, x) \notin E$ and, for all $y \in V$, $y \ne v, x$, if
$(v, y) \in E$, then $(y, x) \notin E$. Since $(v, x) \notin E$, we have $(x, v) \in E$, as $T$ is a tournament. Also,
for each $y$ mentioned earlier, we also have $(x, y) \in E$. Consequently, $\mathrm{od}(x) \ge \mathrm{od}(v) + 1$,
contradicting $\mathrm{od}(v)$ being a maximum!
15. For the multigraph in the given figure, $|V| = 4$ and $\deg(a) = \deg(c) = \deg(d) = 2$ and
$\deg(b) = 6$. Hence $\deg(x) + \deg(y) \ge 4 > 3 = 4 - 1$ for any nonadjacent $x, y \in V$, but the
multigraph has no Hamilton path.
17. For $n \ge 5$, let $C_n = (V, E)$ denote the cycle on $n$ vertices. Then $C_n$ has (actually is) a Hamilton
cycle, but for all $v \in V$, $\deg(v) = 2 < n/2$.
19. This follows from Theorem 11.9, since for all (nonadjacent) $x, y \in V$,
$\deg(x) + \deg(y) = 12 > 11 = |V|$.
21. When $n = 5$, the graphs $C_5$ and $\overline{C_5}$ are isomorphic, and both are Hamilton cycles on five
vertices.
For $n \ge 6$, let $u$, $v$ denote nonadjacent vertices in $\overline{C_n}$. Since $\deg(u) = \deg(v) = n - 3$, we
find that $\deg(u) + \deg(v) = 2n - 6$. Also, $2n - 6 \ge n \iff n \ge 6$, so it follows from Theorem
11.9 that the cocycle $\overline{C_n}$ contains a Hamilton cycle when $n \ge 6$.
23. a) The path $v \to v_1 \to v_2 \to v_3 \to \cdots \to v_{n-1}$ provides a Hamilton path for $H_n$. Since
$\deg(v) = 1$, the graph cannot have a Hamilton cycle.
b) Here $|E| = \binom{n-1}{2} + 1$. (So the number of edges required in Corollary 11.6 cannot be
decreased.)
25. a) (i) $\{a, c, f, h\}$, $\{a, g\}$ (ii) $\{z\}$, $\{u, w, y\}$ b) (i) $\beta(G) = 4$ (ii) $\beta(G) = 3$
c) (i) 3 (ii) 3 (iii) 3 (iv) 4 (v) 6 (vi) The maximum of $m$ and $n$
d) The complete graph on $|I|$ vertices
Section 11.6—p. 571
1. Draw a vertex for each species of fish. If two species $x, y$ must be kept in separate aquaria,
draw the edge $\{x, y\}$. The smallest number of aquaria needed is then the chromatic number of
the resulting graph.
3. a) 3 b) 5
5. a) $P(G, \lambda) = \lambda(\lambda - 1)^3$
b) For $G = K_{1,n}$, we find that $P(G, \lambda) = \lambda(\lambda - 1)^n$. $\chi(K_{1,n}) = 2$
7. a) 2 b) 2 ($n$ even); 3 ($n$ odd)
c) Figure 11.59(d): 2; Fig. 11.62(a): 3; Fig. 11.85(i): 2; Fig. 11.85(ii): 3 d) 2
9. a) (1) $\lambda(\lambda - 1)^2(\lambda - 2)^2$ (2) $\lambda(\lambda - 1)(\lambda - 2)(\lambda^2 - 2\lambda + 2)$
(3) $\lambda(\lambda - 1)(\lambda - 2)(\lambda^2 - 5\lambda + 7)$
b) (1) 3 (2) 3 (3) 3 c) (1) 720 (2) 1020 (3) 420
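11. Let $e = \{v, w\}$ be the deleted edge. There are $\lambda(1)(\lambda - 1)(\lambda - 2) \cdots (\lambda - (n - 2))$ proper
colorings of $G_n$ where $v, w$ share the same color and $\lambda(\lambda - 1)(\lambda - 2) \cdots (\lambda - (n - 1))$ proper
colorings where $v, w$ are colored with different colors. Therefore,
$P(G_n, \lambda) = \lambda(\lambda - 1) \cdots (\lambda - n + 2) + \lambda(\lambda - 1) \cdots (\lambda - n + 1) = \lambda(\lambda - 1) \cdots (\lambda - n + 3)(\lambda - n + 2)^2$,
so $\chi(G_n) = n - 1$.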
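The closed form for $P(G_n, \lambda)$ can be spot-checked by brute force for small $n$ and $\lambda$. This is a sketch only; the helper names `count_colorings`, `formula`, and `complete_minus_edge` are ours, and the vertices are numbered 0 to $n-1$ with the deleted edge taken to be $\{0, 1\}$.

```python
from itertools import combinations, product

def count_colorings(n_vertices, edges, colors):
    """Count proper colorings with the given number of colors by brute force."""
    return sum(
        all(c[u] != c[v] for u, v in edges)
        for c in product(range(colors), repeat=n_vertices)
    )

def formula(n, lam):
    """lam (lam - 1) ... (lam - n + 3) (lam - n + 2)^2."""
    value = (lam - n + 2) ** 2
    for j in range(n - 2):          # factors lam, lam - 1, ..., lam - (n - 3)
        value *= lam - j
    return value

def complete_minus_edge(n):
    """Edges of K_n with the edge {0, 1} deleted."""
    edges = list(combinations(range(n), 2))
    edges.remove((0, 1))
    return edges
```

For instance, with $n = 4$ and $\lambda = 4$ both counts are 48: there are 24 colorings in which the two nonadjacent vertices agree and 24 in which they differ.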
13. a) $|V| = 2n$; $|E| = (1/2)\sum_{v \in V} \deg(v) = (1/2)[4(2) + (2n - 4)(3)] = (1/2)[8 + 6n - 12] = 3n - 2$, $n \ge 1$.
b) For $n = 1$, we find that $G = K_2$ and $P(G, \lambda) = \lambda(\lambda - 1) = \lambda(\lambda - 1)(\lambda^2 - 3\lambda + 3)^{1-1}$, so
the result is true in this first case. For $n = 2$, we have $G = C_4$, the cycle of length 4, and here
$P(G, \lambda) = \lambda(\lambda - 1)^3 - \lambda(\lambda - 1)(\lambda - 2) = \lambda(\lambda - 1)(\lambda^2 - 3\lambda + 3)^{2-1}$. So the result follows
for $n = 2$. Assuming the result true for an arbitrary (but fixed) $n \ge 1$, consider the situation for
$n + 1$. Write $G = G_1 \cup G_2$, where $G_1$ is $C_4$ and $G_2$ is the ladder graph for $n$ rungs. Then
$G_1 \cap G_2 = K_2$, so from Theorem 11.14 we have $P(G, \lambda) = P(G_1, \lambda) \cdot P(G_2, \lambda)/P(K_2, \lambda)$
$= [\lambda(\lambda - 1)(\lambda^2 - 3\lambda + 3)][\lambda(\lambda - 1)(\lambda^2 - 3\lambda + 3)^{n-1}]/[\lambda(\lambda - 1)] = \lambda(\lambda - 1)(\lambda^2 - 3\lambda + 3)^n$.
Consequently, the result is true for all $n \ge 1$, by the Principle of
Mathematical Induction.
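15. a) $\lambda(\lambda - 1)(\lambda - 2)$ b) Follows from Theorem 11.10
c) Follows by the rule of product
d) $P(C_n, \lambda) = P(P_{n-1}, \lambda) - P(C_{n-1}, \lambda) = \lambda(\lambda - 1)^{n-1} - P(C_{n-1}, \lambda)$
$= [(\lambda - 1) + 1](\lambda - 1)^{n-1} - P(C_{n-1}, \lambda)$
$= (\lambda - 1)^n + (\lambda - 1)^{n-1} - P(C_{n-1}, \lambda)$,
so $P(C_n, \lambda) - (\lambda - 1)^n = (\lambda - 1)^{n-1} - P(C_{n-1}, \lambda)$.
Replacing $n$ by $n - 1$ yields
$P(C_{n-1}, \lambda) - (\lambda - 1)^{n-1} = (\lambda - 1)^{n-2} - P(C_{n-2}, \lambda)$.
Hence
$P(C_n, \lambda) - (\lambda - 1)^n = P(C_{n-2}, \lambda) - (\lambda - 1)^{n-2}$.
e) Continuing from part (d),
$P(C_n, \lambda) = (\lambda - 1)^n + (-1)^{n-3}[P(C_3, \lambda) - (\lambda - 1)^3]$
$= (\lambda - 1)^n + (-1)^{n-1}[\lambda(\lambda - 1)(\lambda - 2) - (\lambda - 1)^3]$
$= (\lambda - 1)^n + (-1)^n(\lambda - 1)$.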
17. From Theorem 11.13, the expansion for $P(G, \lambda)$ will contain exactly one occurrence of the
chromatic polynomial of $K_n$. Since no larger graph occurs, this term determines the degree as $n$
and the leading coefficient as 1.
19. a) For $n \in \mathbf{Z}^+$, $n \ge 3$, let $C_n$ denote the cycle on $n$ vertices. If $n$ is odd then $\chi(C_n) = 3$. But for
each $v$ in $C_n$, the subgraph $C_n - v$ is a path with $n - 1$ vertices and $\chi(C_n - v) = 2$. So for $n$
odd $C_n$ is color-critical.
However, when $n$ is even we have $\chi(C_n) = 2$, and for each $v$ in $C_n$, the subgraph $C_n - v$ is
still a path with $n - 1$ vertices and $\chi(C_n - v) = 2$. Consequently, cycles with an even number
of vertices are not color-critical.
b) For every complete graph $K_n$, where $n \ge 2$, we have $\chi(K_n) = n$, and for each vertex $v$ in
$K_n$, $K_n - v$ is (isomorphic to) $K_{n-1}$, so $\chi(K_n - v) = n - 1$. Consequently, every complete
graph with at least one edge is color-critical.
c) Suppose that $G$ is not connected. Let $G_1$ be a component of $G$ where $\chi(G_1) = \chi(G)$, and
let $G_2$ be any other component of $G$. Then $\chi(G_1) \ge \chi(G_2)$ and for all $v$ in $G_2$ we find that
$\chi(G - v) = \chi(G_1) = \chi(G)$, so $G$ is not color-critical.
Supplementary
Exercises—p. 576
1. $n = 17$
3. a) Label the vertices of $K_6$ with $a, b, \ldots, f$. Of the five edges on $a$, at least three have the
same color, say red. Let these edges be $\{a, b\}$, $\{a, c\}$, $\{a, d\}$. If the edges $\{b, c\}$, $\{c, d\}$, $\{b, d\}$
are all blue, the result follows. If not, one of these edges, say $\{c, d\}$, is red. Then the edges
$\{a, c\}$, $\{a, d\}$, $\{c, d\}$ yield a red triangle.
b) Consider the six people as vertices. If two people are friends (strangers), draw a red (blue)
edge connecting their respective vertices. The result then follows from part (a).
5. a) We can redraw $G_2$ as
[Figure omitted: $G_2$ redrawn with the vertices $u, w, y$ in one row and $v, x, z$ in the other.]
b) 72
7. a) 1260 b) 756
c) (Case 1: $p$ is odd, $p = 2k + 1$ for $k \in \mathbf{N}$.) Here there are $mn$ paths of length $p = 1$ (when
$k = 0$) and $(m)(n)(m - 1)(n - 1) \cdots (m - k)(n - k)$ paths of length $p = 2k + 1 \ge 3$.
(Case 2: $p$ is even, $p = 2k$ for $k \in \mathbf{Z}^+$.) When $p < 2m$ (i.e., $k < m$) the number of paths of
length $p$ is $(1/2)(m)(n)(m - 1)(n - 1) \cdots (n - (k - 1))(m - k) + (1/2)(n)(m)(n - 1)(m - 1) \cdots (m - (k - 1))(n - k)$.
For $p = 2m$ we find $(1/2)(n)(m)(n - 1)(m - 1) \cdots (m - (m - 1))(n - m)$ paths of (longest) length $2m$.
9. a) Let $I$ be independent and $\{a, b\} \in E$. If neither $a$ nor $b$ is in $V - I$, then $a, b \in I$, and since
they are adjacent, $I$ is not independent. Conversely, if $I \subseteq V$ with $V - I$ a covering of $G$, then
if $I$ is not independent there are vertices $x, y \in I$ with $\{x, y\} \in E$. But $\{x, y\} \in E \Rightarrow$ either $x$ or
$y$ is in $V - I$.
b) Let $I$ be a largest maximal independent set in $G$ and $K$ a minimum covering. From part (a),
$|K| \le |V - I| = |V| - |I|$ and $|I| \ge |V - K| = |V| - |K|$, or $|K| + |I| \ge |V| \ge |K| + |I|$.
11. $a_n = a_{n-1} + a_{n-2}$, $a_0 = a_1 = 1$; $a_n = F_{n+1}$, the $(n + 1)$-st Fibonacci number
13. $a_n = a_{n-1} + 2a_{n-2}$, $a_1 = 3$, $a_2 = 5$; $a_n = (-1/3)(-1)^n + (4/3)(2^n)$, $n \ge 1$.
15. a) $\gamma(G) = 2$; $\beta(G) = 3$; $\chi(G) = 4$
b) $G$ has neither an Euler trail nor an Euler circuit; $G$ does have a Hamiltonian cycle.
c) $G$ is not bipartite, but it is planar.
17. a) $\chi(G) \ge \omega(G)$. b) They are equal.
19. a) The constant term is 3, not 0. This contradicts Theorem 11.11.
b) The leading coefficient is 3, not 1. This contradicts the result in Exercise 17 of Section 11.6.
c) The sum of the coefficients is $-1$, not 0. This contradicts Theorem 11.12.
21. a) $a_n = F_{n+2}$, the $(n + 2)$-nd Fibonacci number.
c) Ai: 34+ Fe Ay:34+ F; A323 + Frys d) 2)?—l+m
Chapter 12
Trees
Section 12.1—p. 585
1. a) [Figure omitted.] b) 5
3. a) 47 b) 11 5. Paths 7. [Figure omitted.]
9. If there is a unique path between each pair of vertices in $G$, then $G$ is connected. If $G$ contains a
cycle, then there is a pair of vertices $x, y$ with two distinct paths connecting $x$ and $y$. Hence, $G$
is a loop-free connected undirected graph with no cycles, so $G$ is a tree.
11. n (5)
13. In part (i) of the given figure we find the complete bipartite graph $K_{2,3}$. Parts (ii) and (iii)
provide two nonisomorphic spanning trees for $K_{2,3}$. Up to isomorphism these are the only
spanning trees for $K_{2,3}$.
[Figure omitted.]
15. (1) 6 (2) 36
17. a) $n \ge m + 1$
b) Let $k$ be the number of pendant vertices in $T$. From Theorems 11.2 and 12.3 we have
$2(n - 1) = 2|E| = \sum_{v \in V} \deg(v) \ge k + m(n - k)$. Consequently,
$[2(n - 1) \ge k + m(n - k)] \Rightarrow [2n - 2 \ge k + mn - mk]$
$\Rightarrow [k(m - 1) \ge 2 - 2n + mn = 2 + (m - 2)n \ge 2 + (m - 2)(m + 1)$
$= 2 + m^2 - m - 2 = m^2 - m = m(m - 1)]$,
so $k \ge m$.
19. a) If the complement of T contains a cut-set, then the removal of these edges disconnects G,
and there are vertices x, y with no path connecting them. Hence T is not a spanning tree for G.
b) If the complement of C contains a spanning tree, then every pair of vertices in G has a path
connecting them, and this path includes no edges of C. Hence the removal of the edges in C
from G does not disconnect G, so C is not a cut-set for G.
21. a) (i) 3, 4, 6, 3, 8, 4 (ii) 3, 4, 6, 6, 8, 4
b) No pendant vertex of the given tree appears in the sequence, so the result is true for these
vertices. When an edge {x, y} is removed and y is a pendant vertex (of the tree or one of the
resulting subtrees), the deg(x) is decreased by 1 and x is placed in the sequence. As the process
continues, either (i) this vertex x becomes a pendant vertex in a subtree and is removed but not
recorded again in the sequence, or (ii) the vertex x is left as one of the last two vertices of an
edge. In either case, x has been listed in the sequence [deg(x) — 1] times.
c) [Figure omitted: the tree on the vertices 1, 2, ..., 8.]
d) Input: The given Prüfer code $x_1, x_2, \ldots, x_{n-2}$
Output: The unique tree T with n vertices labeled with 1, 2, ..., n. (This tree has the Prüfer
code $x_1, x_2, \ldots, x_{n-2}$.)
C := [x_1, x_2, ..., x_{n-2}] {Initializes C as a list (ordered set)}
L := [1, 2, ..., n] {Initializes L as a list (ordered set)}
T := ∅ {T starts as the empty forest}
for i := 1 to n − 2 do
    v := smallest element in L not in C
    w := first entry in C
    T := T ∪ {{v, w}} {Add the new edge {v, w} to the present forest.}
    delete v from L
    delete the first occurrence of w from C
T := T ∪ {{y, z}} {The vertices y, z are the last two remaining entries in L.}
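The decoding procedure in part (d) translates almost line for line into Python. The sketch below follows the pseudocode's lists C and L; the function name `prufer_to_tree` and the sample code are our own choices for illustration.

```python
def prufer_to_tree(code):
    """Rebuild the labeled tree on 1..n from its Prufer code (n = len(code) + 2)."""
    n = len(code) + 2
    remaining_code = list(code)          # the list C
    labels = list(range(1, n + 1))       # the list L
    edges = []
    for _ in range(n - 2):
        # v: smallest label still in L that does not occur in C
        v = min(x for x in labels if x not in remaining_code)
        w = remaining_code.pop(0)        # first entry of C, removed after use
        edges.append((v, w))
        labels.remove(v)
    edges.append((labels[0], labels[1])) # the last two labels form the final edge
    return edges

# Hypothetical example: the code 4, 4, 4, 5 on six vertices.
sample_edges = prufer_to_tree([4, 4, 4, 5])
```

Each vertex $x$ appears in the code exactly $\deg(x) - 1$ times, as in part (b); in the sample, vertex 4 occurs three times and ends up with degree 4.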
23. a) If the tree contains $n + 1$ vertices, then it is (isomorphic to) the complete bipartite graph
$K_{1,n}$, often called the star graph.
b) If the tree contains $n$ vertices, then it is (isomorphic to) a path on $n$ vertices.
25. Let E, = {{a, b}, {b, c}, {c, d}, {d, e}, {b, h}, {d, i}, (Ff, i}, fg, }} and
Ey = {{a, h}, {b, i}, {he i}, (g. A}, Cf 8}, fe. t), fd. fh, fe, Fh.
Section 12.2—p. 603
. a) fi hk, p.g.s,t b)a od
d) e. f,j.g,s.t e) g.t f) 2 g) ky p.q,s,t
. a) /+w-xy*eartz23 b) 04
Ge
. Preorder: vr, j, 4, 2,e,d,b, a,c, f,i,k,m, p,s,n,q,t, vu, wu
Inorder: h,e,a.b,d,c,g, f. j.i,rom.s, pp, k,n, v,t, wg, u
Postorder: a,b,c, d,e, f. g.h,i, j, 8, pom, vu, wit,u, qn, kyr
. a) (i) and (iii) a (en
b) (i) re (ii) po (iii) ee
oO
4
~H
of ec , f
, l, é e
9. G is connected, Mi
Ve \ Vy
11. Theorem 12.6
a) Each internal vertex has $m$ children, so there are $mi$ vertices that are the children of some
other vertex. This accounts for all vertices in the tree except the root. Hence $n = mi + 1$.
b) $\ell + i = n = mi + 1 \Rightarrow \ell = (m - 1)i + 1$
c) $\ell = (m - 1)i + 1 \Rightarrow i = (\ell - 1)/(m - 1)$
$n = mi + 1 \Rightarrow i = (n - 1)/m$
Corollary 12.1
Since the tree is balanced, $m^{h-1} < \ell \le m^h$ by Theorem 12.7.
$m^{h-1} < \ell \le m^h \Rightarrow \log_m(m^{h-1}) < \log_m(\ell) \le \log_m(m^h)$
$\Rightarrow (h - 1) < \log_m \ell \le h \Rightarrow h = \lceil \log_m \ell \rceil$
13. a) 102; 69
15. a) b) 9:55 c) A(m—1);(h-1)4+(m-1)
55)
17. 21845; $1 + m + m^2 + \cdots + m^{h-1} = (m^h - 1)/(m - 1)$
19, {1, 2,3,4} - {9, 10, 11, 12} - {5, 6, 7, 8]
oo
11,2}- (3,4) "$9. 10} — 111,
12} — {5, 6} — {7, 8}
_15}
— {6}
{1} — {2} {3} — [4] e {1}
- (12) eT {7}- {g}
{1} @ {2} {3B} @ {4} {9} @B {10 {11} B {12} {5} B 6} {7} B {8}
21. (6) (3) (Ga) (3) = 204,204 (zz) (in) (5) G@) = 235,144
23. a) 1,2,5, 11, 12, 13, 14, 3, 6, 7, 4, 8, 9, 10, 15, 16, 17
b) The preorder traversal of the rooted tree
Section 12.3—p. 609
1. a) $L_1$: 1, 3, 5, 7, 9 $L_2$: 2, 4, 6, 8, 10
b) $L_1$: $1, 3, 5, 7, \ldots, 2m - 3, m + n$
$L_2$: $2, 4, 6, 8, \ldots, 2m - 2, 2m - 1, 2m, 2m + 1, \ldots, m + n - 1$
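The lists in Exercise 1 force a merge to use as many comparisons as possible. As an illustration only (the function `merge_count` is our own sketch of a standard two-way merge, not the text's algorithm verbatim), counting comparisons shows that both pairs of lists need $m + n - 1$ of them.

```python
def merge_count(a, b):
    """Merge two sorted lists; return (merged list, number of comparisons)."""
    merged, comparisons = [], 0
    i = j = 0
    while i < len(a) and j < len(b):
        comparisons += 1
        if a[i] <= b[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    merged.extend(a[i:])                 # at most one of these is nonempty
    merged.extend(b[j:])
    return merged, comparisons

# Part (a): m = n = 5.
part_a = merge_count([1, 3, 5, 7, 9], [2, 4, 6, 8, 10])
# Part (b) with the illustrative choice m = 3, n = 5.
part_b = merge_count([1, 3, 8], [2, 4, 5, 6, 7])
```

In both cases the largest element sits alone at the end of one list, so every other element is placed only after a comparison.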
3. a) {-1, 0,2, -2, 3, 6, -3, 5, 1, 4}
{0, 2,3, 6,5, 1,4]
Section 12.4—p. 614
1. a) tear b) tatener c) rant
3. a: 111 c: 0110 e: 10 g: 11011 i: 00
hb: 110101 d: 1100 f: Ol h: 010 J: 110100
5. 55,987
7. [Figure omitted: the tree diagrams for this exercise.]
9. Amend part (a) of step (2) for the Huffman tree algorithm as follows. If there are $n$ ($> 2$) such
trees with smallest root weights $w$ and $w'$, then
(i) if $w < w'$ and $n - 1$ of these trees have root weight $w'$, select a tree (of root weight $w'$)
with smallest height; and
(ii) if $w = w'$ (and all $n$ trees have the same smallest root weight), select two trees (of root
weight $w$) of smallest height.
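One way to realize this tie-breaking rule is to keep the candidate trees in a priority queue ordered first by root weight and then by height. The sketch below is an assumption-laden illustration (the function name `huffman_heights` and the tuple layout are ours); it tracks only weights, heights, and the total weighted path length rather than building the tree itself.

```python
import heapq
from itertools import count

def huffman_heights(weights):
    """Build a Huffman tree, breaking weight ties in favour of smaller height.

    Returns (total weighted path length, height of the final tree).
    """
    tie = count()                                   # keeps heap entries comparable
    heap = [(w, 0, next(tie), 0) for w in weights]  # (weight, height, id, cost)
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, h1, _, c1 = heapq.heappop(heap)         # smallest weight, then height
        w2, h2, _, c2 = heapq.heappop(heap)
        merged = (w1 + w2, max(h1, h2) + 1, next(tie), c1 + c2 + w1 + w2)
        heapq.heappush(heap, merged)
    _, height, _, cost = heap[0]
    return cost, height
```

Because tuples compare lexicographically, popping twice selects the two smallest root weights and, among equal weights, the trees of smallest height, which covers both cases (i) and (ii).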
Section 12.5—p. 621
1. The articulation points are $b, e, f, h, j, k$. The biconnected components are $B_1$: $\{\{a, b\}\}$;
$B_2$: $\{\{d, e\}\}$; $B_3$: $\{\{b, c\}, \{c, f\}, \{f, e\}, \{e, b\}\}$; $B_4$: $\{\{f, g\}, \{g, h\}, \{h, f\}\}$;
$B_5$: $\{\{h, i\}, \{i, j\}, \{j, h\}\}$; $B_6$: $\{\{j, k\}\}$; $B_7$: $\{\{k, p\}, \{p, n\}, \{n, m\}, \{m, k\}, \{p, m\}\}$.
3. a) $T$ can have as few as one or as many as $n - 2$ articulation points. If $T$ contains a vertex of
degree $(n - 1)$, then this vertex is the only articulation point. If $T$ is a path with $n$ vertices and
$n - 1$ edges, then the $n - 2$ vertices of degree 2 are all articulation points.
b) In all cases, a tree on $n$ vertices has $n - 1$ biconnected components. Each edge is a
biconnected component.
5. $\chi(G) = \max\{\chi(B_i) \mid 1 \le i \le k\}$.
7. Proof: Suppose that $G$ has a pendant vertex, say $x$, and that $\{w, x\}$ is the (unique) edge in $E$
incident with $x$. Since $|V| \ge 3$, we know that $\deg(w) \ge 2$ and that $\kappa(G - w) \ge 2 > 1 = \kappa(G)$.
Consequently, $w$ is an articulation point of $G$.
9. a) The first tree provides the depth-first spanning tree T for G where the order prescribed for
the vertices is reverse alphabetical and the root is c.
b) The second tree provides (low’(v), low(v)) for each vertex v of G (and T). These results
follow from step (2) of the algorithm.
For the third tree, we find (dfi(v), low(v)) for each vertex v. Applying step (3) of the
algorithm, we find the articulation points d, f, and g, and the four biconnected components.
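The role of the pairs (dfi(v), low(v)) can be seen in a short depth-first search. This is a sketch of the standard low-value technique rather than a transcription of the algorithm's steps (2) and (3); the function name, the adjacency-dictionary input, and the sample graph are our own, and parallel edges are not handled.

```python
def articulation_points(adj, root):
    """Return the articulation points of a connected, loop-free undirected graph.

    dfi[v] is the depth-first index of v; low[v] is the smallest dfi reachable
    from the subtree rooted at v using at most one back edge.
    """
    dfi, low, points = {}, {}, set()

    def visit(v, parent):
        dfi[v] = low[v] = len(dfi) + 1
        children = 0
        for w in adj[v]:
            if w not in dfi:                 # tree edge
                children += 1
                visit(w, v)
                low[v] = min(low[v], low[w])
                if parent is not None and low[w] >= dfi[v]:
                    points.add(v)            # w's subtree cannot bypass v
            elif w != parent:                # back edge
                low[v] = min(low[v], dfi[w])
        if parent is None and children > 1:  # the root is a special case
            points.add(v)

    visit(root, None)
    return points

# Hypothetical graph: a triangle a-b-c with a pendant vertex d attached at c.
sample = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
```

A non-root vertex $v$ is reported exactly when some child $w$ has low(w) ≥ dfi(v), which is the comparison carried out on the third tree.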
[Figure omitted: the three trees for this exercise, showing the depth-first spanning tree T, the pairs (low′(v), low(v)), and the pairs (dfi(v), low(v)).]
11. We always have $low(x_2) = low(x_1) = 1$. (Note: Vertices $x_2$ and $x_1$ are always in the same
biconnected component.)
13. If not, let $v \in V$ where $v$ is an articulation point of $G$. Then $\kappa(G - v) > \kappa(G) = 1$. (From
Exercise 19 of Section 11.6 we know that $G$ is connected.) Now $G - v$ is disconnected with
components $H_1, H_2, \ldots, H_t$, for $t \ge 2$. For $1 \le i \le t$, let $v_i \in H_i$. Then $H_i + v$ is a subgraph of
$G - v_{i+1}$, and $\chi(H_i + v) \le \chi(G - v_{i+1}) < \chi(G)$. (Here $v_{t+1} = v_1$.) Now let $\chi(G) = n$ and let
$\{c_1, c_2, \ldots, c_n\}$ be a set of $n$ colors. For each subgraph $H_i + v$, $1 \le i \le t$, we can properly
color the vertices of $H_i + v$ with at most $n - 1$ colors, and can use $c_1$ to color vertex $v$ for all
of these $t$ subgraphs. Then we can join these $t$ subgraphs together at vertex $v$ and obtain a
proper coloring for the vertices of $G$ where we use less than $n$ ($= \chi(G)$) colors.
Supplementary
Exercises—p. 625
1. If $G$ is a tree, consider $G$ a rooted tree. Then there are $\lambda$ choices for coloring the root of $G$ and
$(\lambda - 1)$ choices for coloring each of its descendants. The result then follows by the rule of
product.
Conversely, if $P(G, \lambda) = \lambda(\lambda - 1)^{n-1}$, then since the factor $\lambda$ occurs only once, the graph $G$
is connected. $P(G, \lambda) = \lambda(\lambda - 1)^{n-1} = \lambda^n - (n - 1)\lambda^{n-1} + \cdots + (-1)^{n-1}\lambda \Rightarrow G$ has $n$
vertices and $(n - 1)$ edges. Hence $G$ is a tree [by part (d) of Theorem 12.5].
3. a) 1011001010100
b) (i) (ii) [Figures omitted.]
c) Since the last two vertices visited in a preorder traversal are leaves, the last two symbols in
the characteristic sequence of a complete binary tree are 00.
5. We assume that $G = (V, E)$ is connected; otherwise we work with a component of $G$. Since
$G$ is connected, and $\deg(v) \ge 2$ for all $v \in V$, it follows from Theorem 12.4 that $G$ is not a tree.
But every loop-free connected undirected graph that is not a tree must contain a cycle.
7. For $1 \le i$ ($< n$), let $x_i$ = the number of vertices $v$ where $\deg(v) = i$. Then $x_1 + x_2 + \cdots + x_{n-1} = |V| = |E| + 1$,
so $2|E| = 2(-1 + x_1 + x_2 + \cdots + x_{n-1})$. But $2|E| = \sum_{v \in V} \deg(v) = (x_1 + 2x_2 + 3x_3 + \cdots + (n - 1)x_{n-1})$.
Solving $2(-1 + x_1 + x_2 + \cdots + x_{n-1}) = x_1 + 2x_2 + \cdots + (n - 1)x_{n-1}$ for $x_1$,
we find that $x_1 = 2 + x_3 + 2x_4 + 3x_5 + \cdots + (n - 3)x_{n-1} = 2 + \sum_{\deg(v_i) \ge 3} [\deg(v_i) - 2]$.
9. a) $G^2$ is isomorphic to $K_5$. b) $G^2$ is isomorphic to $K_4$.
c) $G^2$ is isomorphic to $K_{n+1}$, so the number of new edges is $\binom{n+1}{2} - n = \binom{n}{2}$.
d) If $G^2$ has an articulation point $x$, then there exists $u, v \in V$ such that every path (in $G^2$) from
$u$ to $v$ passes through $x$. (This follows from Exercise 2 of Section 12.5.) Since $G$ is connected,
there exists a path $P$ (in $G$) from $u$ to $v$. If $x$ is not on this path (which is also a path in $G^2$), then
we contradict $x$ being an articulation point in $G^2$. Hence the path $P$ (in $G$) passes through $x$,
and we can write $P$: $u \to u_1 \to \cdots \to u_{n-1} \to u_n \to x \to v_m \to v_{m-1} \to \cdots \to v_1 \to v$. But
then in $G^2$ we add the edge $\{u_n, v_m\}$, and the path $P'$ (in $G^2$) given by
$P'$: $u \to u_1 \to \cdots \to u_{n-1} \to u_n \to v_m \to v_{m-1} \to \cdots \to v_1 \to v$ does not pass through $x$. So $x$ is not an articulation
point of $G^2$, and $G^2$ has no articulation points.
11. a) $\ell_n = \ell_{n-1} + \ell_{n-2}$, for $n \ge 3$ and $\ell_1 = \ell_2 = 1$. Since this is precisely the Fibonacci
recurrence relation, we have $\ell_n = F_n$, the $n$th Fibonacci number, for $n \ge 1$.
b) $i_n = i_{n-1} + i_{n-2} + 1$, $n \ge 3$, $i_1 = i_2 = 0$
$i_n = (1/\sqrt{5})\alpha^n - (1/\sqrt{5})\beta^n - 1 = F_n - 1$, $n \ge 1$
13. a) For the spanning trees of $G$ there are two mutually exclusive and exhaustive cases: (i) The
edge $\{x_1, y_1\}$ is in the spanning tree: These spanning trees are counted in $b_n$. (ii) The edge
$\{x_1, y_1\}$ is not in the spanning tree: In this case the edges $\{x_1, x_2\}$, $\{y_1, y_2\}$ are both in the
spanning tree. Upon removing the edges $\{x_1, x_2\}$, $\{y_1, y_2\}$, and $\{x_1, y_1\}$ from the original ladder
graph, we now need a spanning tree for the resulting smaller ladder graph with $n - 1$ rungs.
There are $a_{n-1}$ spanning trees in this case.
b) $b_n = b_{n-1} + 2a_{n-1}$, $n \ge 2$
c) $a_n - 4a_{n-1} + a_{n-2} = 0$, $n \ge 2$
$a_n = (1/(2\sqrt{3}))[(2 + \sqrt{3})^n - (2 - \sqrt{3})^n]$, $n \ge 0$
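The recurrence in part (c) can be compared with a direct count of spanning trees. The sketch below uses the matrix-tree theorem (a determinant of a reduced Laplacian), which is a technique we are bringing in for the check and not one used in the solution; all function names and the vertex numbering of the ladder are our own assumptions.

```python
from fractions import Fraction

def spanning_tree_count(n_vertices, edges):
    """Matrix-tree theorem: determinant of the Laplacian with one row/column removed."""
    size = n_vertices - 1
    lap = [[Fraction(0)] * n_vertices for _ in range(n_vertices)]
    for u, v in edges:
        lap[u][u] += 1
        lap[v][v] += 1
        lap[u][v] -= 1
        lap[v][u] -= 1
    m = [row[:size] for row in lap[:size]]   # delete the last row and column
    det = Fraction(1)
    for col in range(size):                  # exact Gaussian elimination
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return 0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return int(det)

def ladder_edges(n):
    """Ladder with n rungs: vertices 0..n-1 on one rail, n..2n-1 on the other."""
    rails = [(i, i + 1) for i in range(n - 1)]
    rails += [(n + i, n + i + 1) for i in range(n - 1)]
    rungs = [(i, n + i) for i in range(n)]
    return rails + rungs

def a_by_recurrence(n):
    """a_n = 4 a_(n-1) - a_(n-2) with a_0 = 0, a_1 = 1, for n >= 1."""
    prev, cur = 0, 1
    for _ in range(n - 1):
        prev, cur = cur, 4 * cur - prev
    return cur
```

For $n = 1, 2, 3$ both computations give 1, 4, and 15 spanning trees.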
15. a) (i) 3 (ii) 5
D) a, = Gy-) + y_2, 8 > 5S, a3 = 2, ag = 3
ayn = F,4,, the (n + 1)-st Fibonacci number
17. Here the input consists of
(a) the $k$ ($\ge 3$) vertices of the spine, ordered from left to right as $v_1, v_2, \ldots, v_k$;
(b) $\deg(v_i)$, in the caterpillar, for all $1 \le i \le k$; and
(c) $n$, the number of vertices in the caterpillar, with $n \ge 3$.
If k = 3, the caterpillar is the complete bipartite graph (or star) K,,,-;, for some n > 3.
We label v, with 1 and the remaining vertices with 2, 3, ..., n. This provides the edge
labels (the absolute value of the difference of the vertex labels) 1,2, 3,...,n—l,a
graceful labeling.
For k > 3 we consider the following.
ℓ := 2 {ℓ is the largest low label}
h := n − 1 {h is the smallest high label}
label v_1 with 1
label v_2 with n
for i := 2 to k − 1 do
    if 2⌊i/2⌋ = i then {i is even}
        begin
            if v_i has unlabeled leaves that are not on the spine then
                assign the deg(v_i) − 2 labels from ℓ to ℓ + deg(v_i) − 3
                    to these leaves of v_i
            assign the label ℓ + deg(v_i) − 2 to v_{i+1}
            ℓ := ℓ + deg(v_i) − 1
        end
    else
        begin
            if v_i has unlabeled leaves that are not on the spine then
                assign the deg(v_i) − 2 labels from h − [deg(v_i) − 3] to
                    h to these leaves of v_i
            assign the label h − deg(v_i) + 2 to v_{i+1}
            h := h − deg(v_i) + 1
        end
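The labeling procedure can be exercised on small caterpillars. The following Python sketch assumes the spine is given only by its degree sequence, with $\deg(v_1) = \deg(v_k) = 1$; the function names and the sample degree sequences are ours. It returns the edges as pairs of vertex labels, and a second helper checks that the edge labels are exactly $1, 2, \ldots, n - 1$.

```python
def graceful_caterpillar(spine_degrees):
    """Label a caterpillar gracefully.

    spine_degrees[i - 1] is deg(v_i) for the spine v_1, ..., v_k, with
    deg(v_1) = deg(v_k) = 1. Returns the edges as pairs of vertex labels.
    """
    k = len(spine_degrees)
    n = k + sum(d - 2 for d in spine_degrees[1:-1])
    low, high = 2, n - 1                 # largest low label, smallest high label
    spine = [1, n]                       # labels of v_1 and v_2
    edges = [(1, n)]
    for i in range(2, k):                # i = 2, ..., k - 1 (1-based)
        d = spine_degrees[i - 1]
        here = spine[i - 1]              # label already given to v_i
        if i % 2 == 0:                   # even i: hand out low labels
            leaves = range(low, low + d - 2)
            nxt = low + d - 2
            low += d - 1
        else:                            # odd i: hand out high labels
            leaves = range(high - (d - 3), high + 1)
            nxt = high - d + 2
            high -= d - 1
        edges.extend((here, leaf) for leaf in leaves)
        edges.append((here, nxt))        # spine edge {v_i, v_(i+1)}
        spine.append(nxt)
    return edges

def is_graceful(edges):
    """Vertex labels are 1..n and edge labels |u - v| are exactly 1..n-1."""
    n = len(edges) + 1
    labels = {x for e in edges for x in e}
    differences = sorted(abs(u - v) for u, v in edges)
    return labels == set(range(1, n + 1)) and differences == list(range(1, n))
```

For the spine degrees 1, 3, 4, 1 (seven vertices) the edges come out as (1, 7), (7, 2), (7, 3), (3, 5), (3, 6), (3, 4), with edge labels 6, 5, 4, 2, 3, 1.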
19. a) 1, −1, 1, 1, −1, −1; 1, 1, −1, 1, −1, −1; 1, −1, 1, −1, 1, −1
b) [Figure omitted: the ordered rooted trees on five vertices.]
In total there are 14 ordered rooted trees on five vertices.
c) This is another example where the Catalan numbers arise. There are $\left(\frac{1}{n+1}\right)\binom{2n}{n}$ ordered
rooted trees on $n + 1$ vertices.
21. a) 8 b) 8 ce) 4.83 d) 2(4-8*) — e) 2(n8")
Chapter 13
Optimization and Matching
Section 13.1-p. 638
1. a) If not, let $v_i \in \bar{S}$, where $1 \le i \le m$ and $i$ is the smallest such subscript. Then $d(v_0, v_i) <
d(v_0, v_{m+1})$, and we contradict the choice of $v_{m+1}$ as a vertex $v$ in $\bar{S}$ for which $d(v_0, v)$ is a
minimum.
b) Suppose there is a shorter directed path (in $G$) from $v_0$ to $v_k$. If this path passes through a
vertex in $\bar{S}$, then from part (a) we have a contradiction. Otherwise, we have a shorter directed
path $P''$ from $v_0$ to $v_k$, and $P''$ only passes through vertices in $S$. But then
$P'' \cup \{(v_k, v_{k+1}), (v_{k+1}, v_{k+2}), \ldots, (v_{m-1}, v_m), (v_m, v_{m+1})\}$ is a directed path (in $G$) from $v_0$ to $v_{m+1}$, and it is
shorter than path $P$.
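The set $S$ in this argument is the set of vertices whose distance from $v_0$ has already been finalized. A compact priority-queue version of the shortest-path algorithm makes that explicit. This is a sketch under our own assumptions: the function name and the sample graph are hypothetical, with weights chosen only so that the distances agree with the values listed in Exercise 3(a).

```python
import heapq

def dijkstra(graph, source):
    """Shortest distances from source in a directed graph with nonnegative weights.

    graph maps each vertex to a dict {neighbour: weight}.
    """
    dist = {source: 0}
    done = set()                          # the set S of finished vertices
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue                      # stale queue entry
        done.add(u)                       # d(source, u) is now final
        for v, w in graph.get(u, {}).items():
            if v not in done and d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

# Hypothetical weights; not the graph from the exercise.
sample = {
    "a": {"b": 5, "c": 6},
    "b": {"h": 7},
    "c": {"f": 6},
    "h": {"g": 4},
}
distances = dijkstra(sample, "a")
```

Each vertex enters `done` in nondecreasing order of distance, which is the property that parts (a) and (b) establish.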
3. a) $d(a, b) = 5$; $d(a, c) = 6$; $d(a, f) = 12$; $d(a, g) = 16$; $d(a, h) = 12$
b) $f$: $(a, c)$, $(c, f)$ $g$: $(a, b)$, $(b, h)$, $(h, g)$ $h$: $(a, b)$, $(b, h)$
5. False. Consider the following weighted graph.
[Figure omitted.]
Section 13.2—p. 643
1. Kruskal’s algorithm generates the following sequence (of forests), which terminates in a
minimal spanning tree $T$ of weight 18.
(1) $F_1 = \{\{e, h\}\}$ (2) $F_2 = F_1 \cup \{\{a, b\}\}$ (3) $F_3 = F_2 \cup \{\{b, c\}\}$
(4) $F_4 = F_3 \cup \{\{d, e\}\}$ (5) $F_5 = F_4 \cup \{\{e, f\}\}$ (6) $F_6 = F_5 \cup \{\{a, e\}\}$
(7) $F_7 = F_6 \cup \{\{d, g\}\}$ (8) $F_8 = T = F_7 \cup \{\{f, i\}\}$
(This answer is not unique.)
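The growing sequence of forests corresponds to one pass over the edges in order of weight. The following is a minimal sketch with a union-find structure; the function name and the four-vertex sample graph are our own and do not reproduce the exercise's graph.

```python
def kruskal(vertices, weighted_edges):
    """Return (tree edges, total weight) of a minimal spanning tree.

    weighted_edges is an iterable of (weight, u, v) triples.
    """
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    tree, total = [], 0
    for weight, u, v in sorted(weighted_edges):
        ru, rv = find(u), find(v)
        if ru != rv:                        # the edge joins two different trees
            parent[ru] = rv
            tree.append((u, v))
            total += weight
    return tree, total

# Hypothetical weighted graph.
sample_edges = [(3, "a", "c"), (1, "a", "b"), (4, "c", "d"), (2, "b", "c")]
sample_tree = kruskal("abcd", sample_edges)
```

When several edges share a weight, the order in which `sorted` presents them decides which one is taken, which is why an answer such as the one above need not be unique.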
3. No! Consider the following counterexample:
[Figure omitted: a weighted triangle on the vertices $v$, $x$, $w$.]
Here $V = \{v, x, w\}$, $E = \{\{v, x\}, \{x, w\}, \{v, w\}\}$, and $E' = \{\{v, x\}, \{x, w\}\}$.
5. a) Evansville–Indianapolis (168); Bloomington–Indianapolis (51); South Bend–Gary (58);
Terre Haute–Bloomington (58); South Bend–Fort Wayne (79); Indianapolis–Fort Wayne (121).
b) Fort Wayne–Gary (132); Evansville–Indianapolis (168); Bloomington–Indianapolis (51);
Gary–South Bend (58); Terre Haute–Bloomington (58); Indianapolis–Fort Wayne (121).
7. a) To determine an optimal tree of maximal weight, replace the two occurrences of “small” in
Kruskal’s algorithm by “large.”
b) Use the edges: South Bend–Evansville (303); Fort Wayne–Evansville (290);
Gary–Evansville (277); Fort Wayne–Terre Haute (201); Gary–Bloomington (198);
Indianapolis–Evansville (168).
9. When the weights of the edges are all distinct, in each step of Kruskal’s algorithm a unique edge
is selected.
Section 13.3—p. 658
=4 hb) 18
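c) (i) $P = \{a, b, h, d, g, i\}$; $\bar{P} = \{z\}$ (ii) $P = \{a, b, h, d, g\}$; $\bar{P} = \{i, z\}$
(iii) $P = \{a, h\}$; $\bar{P} = \{b, d, g, i, z\}$
3. [Figures omitted: the two transport networks with their maximum flows.]
(1) The maximum flow is 32, which is $c(P, \bar{P})$ for $P = \{a, b, d, g, h\}$ and $\bar{P} = \{i, z\}$.
(2) The maximum flow is 23, which is $c(P, \bar{P})$ for $P = \{a\}$ and $\bar{P} = \{b, g, i, j, d, h, k, z\}$.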
5. Here $c(e)$ is a positive integer for each $e \in E$, and the initial flow is defined as $f(e) = 0$ for all
$e \in E$. The result follows because $\Delta_p$ is a positive integer for each application of the
Edmonds-Karp algorithm and in the Ford-Fulkerson algorithm, $f(e) - \Delta_p$ will not be negative
for a backward edge.
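The integrality argument can be observed in a small implementation: with integer capacities and a zero initial flow, every augmentation amount is a positive integer. The sketch below is a generic Edmonds-Karp routine written for illustration; the function name, the dictionary representation of capacities, and the sample network are assumptions of ours.

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Edmonds-Karp: augment along shortest paths found by breadth-first search.

    capacity maps (u, v) to a positive integer c(u, v).
    """
    residual = dict(capacity)
    for u, v in capacity:
        residual.setdefault((v, u), 0)      # backward edges start at 0
    neighbours = {}
    for u, v in residual:
        neighbours.setdefault(u, []).append(v)

    flow = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in neighbours.get(u, []):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow                     # no augmenting path remains
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        delta = min(residual[e] for e in path)   # a positive integer
        for u, v in path:
            residual[(u, v)] -= delta
            residual[(v, u)] += delta
        flow += delta

# Hypothetical network from source a to sink z.
sample = {("a", "b"): 3, ("a", "c"): 2, ("b", "c"): 1, ("b", "z"): 2, ("c", "z"): 3}
```

Since `delta` is always a minimum of positive integers, the flow value stays an integer after every step.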
7. [Figure omitted: the network for this exercise with its edge labels.]
Section 13.4—p. 665
1. $5/\binom{8}{4} = 1/14$
3. Let the committees be represented as $c_1, c_2, \ldots, c_6$, according to the way they are listed in the
exercise.
a) Select the members as follows: c; — A} c2 — G:c3 — M3 ca — Ni c5 — Ki 06 — R.
b) Select the nonmembers as follows: c, — K;c. — Ay c3 — G3e4 — S35 — M36 — P.
5. a) A one-factor for a graph $G = (V, E)$ consists of edges that have no common vertex. So the
one-factor contains an even number of vertices, and since it spans $G$, we must have $|V|$ even.
b) Consider the Petersen graph as shown in Fig. 11.52(a). The edges
$\{e, a\}$ $\{b, c\}$ $\{d, i\}$ $\{g, j\}$ $\{f, h\}$
provide a one-factor for this graph.
c) There are $(5)(3) = 15$ one-factors for $K_6$.
d) Label the vertices of $K_{2n}$ with $1, 2, 3, \ldots, 2n - 1, 2n$. We can pair vertex 1 with any of the
other $2n - 1$ vertices, and we are then confronted, in the case where $n \ge 2$, with finding a
one-factor for the graph $K_{2n-2}$. Consequently,
$a_n = (2n - 1)a_{n-1}$, $a_1 = 1$.
We find that
$a_n = (2n - 1)a_{n-1} = (2n - 1)(2n - 3)a_{n-2} = (2n - 1)(2n - 3)(2n - 5)a_{n-3} = \cdots$
$= (2n - 1)(2n - 3)(2n - 5) \cdots (5)(3)(1)$
$= \dfrac{(2n)(2n - 1)(2n - 2)(2n - 3) \cdots (4)(3)(2)(1)}{(2n)(2n - 2) \cdots (4)(2)} = \dfrac{(2n)!}{2^n (n!)}$
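The recurrence $a_n = (2n - 1)a_{n-1}$ mirrors a direct recursive count: pair the first vertex with each of the others and then match the remaining vertices. A short sketch (the function names are ours):

```python
from math import factorial

def count_one_factors(vertices):
    """Count the one-factors of the complete graph on the given list of vertices."""
    if not vertices:
        return 1
    rest = vertices[1:]
    # pair the first vertex with each other vertex, then match what is left
    return sum(
        count_one_factors(rest[:i] + rest[i + 1:])
        for i in range(len(rest))
    )

def closed_form(n):
    """(2n)! / (2^n n!)."""
    return factorial(2 * n) // (2 ** n * factorial(n))
```

For $K_6$ both functions return 15, matching part (c); for an odd number of vertices the recursive count is 0, in line with part (a).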
7. Yes, such an assignment can be made by Fritz. Let X be the set of student applicants and Y the
set of part-time jobs. Then for all x € X, y € Y, draw the edge (x, y) if applicant x is qualified
for part-time job y. Then deg(x) > 4 > deg(y) for all x € X, y € Y, and the result follows from
Corollary 13.6.
9. a) (i) Select $i$ from $A_i$ for $1 \le i \le 4$.
(ii) Select $i + 1$ from $A_i$ for $1 \le i \le 3$, and 1 from $A_4$.
b) 2
11. For each subset $A$ of $X$, let $G_A$ be the subgraph of $G$ induced by the vertices in $A \cup R(A)$. If $e$
is the number of edges in $G_A$, then $e \ge 4|A|$ because $\deg(a) \ge 4$ for all $a \in A$. Likewise,
$e \le 5|R(A)|$ because $\deg(b) \le 5$ for all $b \in R(A)$. So $5|R(A)| \ge 4|A|$ and $\delta(A) = |A| - |R(A)|
\le |A| - (4/5)|A| = (1/5)|A| \le (1/5)|X| = 2$. Then since $\delta(G) = \max\{\delta(A) \mid A \subseteq X\}$, we have
$\delta(G) \le 2$.
13. a) 6(G) = 1. Amaximal matching of X into Y is given by {{x,, ya}, (x2, yo}, (x3, vi},
{x5, y3}}-
b) If $\delta(G) = 0$, there is a complete matching of $X$ into $Y$, and $\beta(G) = |Y|$, or $|Y| =
\beta(G) - \delta(G)$. If $\delta(G) = k > 0$, let $A \subseteq X$ where $|A| - |R(A)| = k$. Then $A \cup (Y - R(A))$
is a largest maximal independent set in $G$ and $\beta(G) = |A| + |Y - R(A)| =
|Y| + (|A| - |R(A)|) = |Y| + \delta(G)$, so $|Y| = \beta(G) - \delta(G)$.
c) Fig. 13.30(a): {x), x2, 43, V2, Va, Ys}; Fig. 13.32: {x3, x4, yo, y3, ya}.
Supplementary
Exercises—p. 669
1. $d(a, b) = 5$ $d(a, c) = 11$ $d(a, d) = 7$ $d(a, e) = 8$
$d(a, f) = 19$ $d(a, g) = 9$ $d(a, h) = 14$
[Note that the loop at vertex $g$ and the edges $(c, a)$ of weight 9 and $(f, e)$ of weight 5 are of no
significance.]
3. a) The edge $e_1$ will always be selected in the first step of Kruskal’s algorithm.
b) Again using Kruskal’s algorithm, edge $e_2$ will be selected in the first application of step (2)
unless each of the edges $e_1, e_2$ is incident with the same two vertices; that is, the edges $e_1, e_2$
form a circuit and $G$ is a multigraph.
5. There are $d_n$, the number of derangements of $\{1, 2, 3, \ldots, n\}$.
7. The vertices [in the line graph $L(G)$] determined by $E'$ form a maximal independent set.
Chapter 14
Rings and Modular Arithmetic
Section 14.1—p. 678
1. (Example 14.5): $-a = a$, $-b = e$, $-c = d$, $-d = c$, $-e = b$
(Example 14.6): $-s = s$, $-t = y$, $-v = x$, $-w = w$, $-x = v$, $-y = t$
3. a) $(a + b) + c = (b + a) + c$    Commutative Law of +
$= b + (a + c)$    Associative Law of +
$= b + (c + a)$    Commutative Law of +
b) $d + a(b + c) = d + (ab + ac)$    Distributive Law of · over +
$= (d + ab) + ac$    Associative Law of +
$= (ab + d) + ac$    Commutative Law of +
$= ab + (d + ac)$    Associative Law of +
c) $c(d + b) + ab = ab + c(d + b)$    Commutative Law of +
$= ab + (cd + cb)$    Distributive Law of · over +
$= ab + (cb + cd)$    Commutative Law of +
$= (ab + cb) + cd$    Associative Law of +
$= (a + c)b + cd$    Distributive Law of · over +
5. a) (i) The closed binary operation $\oplus$ is associative. For all $a, b, c \in \mathbf{Z}$ we find that
$(a \oplus b) \oplus c = (a + b - 1) \oplus c = (a + b - 1) + c - 1 = a + b + c - 2$,
and
$a \oplus (b \oplus c) = a \oplus (b + c - 1) = a + (b + c - 1) - 1 = a + b + c - 2$.
(ii) For the closed binary operation $\odot$ and all $a, b, c \in \mathbf{Z}$, we have
$(a \odot b) \odot c = (a + b - ab) \odot c = (a + b - ab) + c - (a + b - ab)c$
$= a + b - ab + c - ac - bc + abc = a + b + c - ab - ac - bc + abc$,
and
$a \odot (b \odot c) = a \odot (b + c - bc) = a + (b + c - bc) - a(b + c - bc)$
$= a + b + c - bc - ab - ac + abc = a + b + c - ab - ac - bc + abc$.
Consequently, the closed binary operation $\odot$ is also associative.
(iii) Given any integers $a, b, c$, we find that
$(b \oplus c) \odot a = (b + c - 1) \odot a = (b + c - 1) + a - (b + c - 1)a$
$= b + c - 1 + a - ba - ca + a = a + a + b + c - 1 - ba - ca$,
and
$(b \odot a) \oplus (c \odot a) = (b + a - ba) \oplus (c + a - ca)$
$= (b + a - ba) + (c + a - ca) - 1 = a + a + b + c - 1 - ba - ca$.
Therefore, the second distributive law holds.
c) Aside from 0 the only other unit is 2, since $2 \odot 2 = 2 + 2 - (2 \cdot 2) = 0$, the unity for
$(\mathbf{Z}, \oplus, \odot)$.
d) This ring is an integral domain, but not a field. For all $a, b \in \mathbf{Z}$ we see that $a \odot b = 1$ (the
zero element) $\Rightarrow a + b - ab = 1 \Rightarrow a(1 - b) = (1 - b) \Rightarrow (a - 1)(1 - b) = 0 \Rightarrow a = 1$ or
$b = 1$, so there are no proper divisors of zero in $(\mathbf{Z}, \oplus, \odot)$.
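The identities in this exercise are easy to spot-check numerically. The sketch below tests the two operations on a small sample of integers; it is a sanity check, not a proof, and the names `oplus`, `odot`, and the search ranges are our own choices.

```python
from itertools import product

def oplus(a, b):
    """a (+) b = a + b - 1."""
    return a + b - 1

def odot(a, b):
    """a (.) b = a + b - ab."""
    return a + b - a * b

sample = range(-4, 5)
for a, b, c in product(sample, repeat=3):
    assert oplus(oplus(a, b), c) == oplus(a, oplus(b, c))          # part (i)
    assert odot(odot(a, b), c) == odot(a, odot(b, c))              # part (ii)
    assert odot(oplus(b, c), a) == oplus(odot(b, a), odot(c, a))   # part (iii)

zero, unity = 1, 0          # the zero element is 1 and the unity is 0

# Units found by searching a finite window of candidates for an inverse.
units = [a for a in range(-50, 51)
         if any(odot(a, b) == unity for b in range(-50, 51))]
```

Within the searched window the only units found are 0 and 2, in agreement with part (c).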
7. From the previous exercise we know that we need to determine the condition(s) on $k, m$ for
which the distributive laws will hold. Since $\odot$ is commutative we can focus on just one of these
laws.
If $x, y, z \in \mathbf{Z}$, then
$x \odot (y \oplus z) = (x \odot y) \oplus (x \odot z)$
$\Rightarrow x \odot (y + z - k) = (x + y - mxy) \oplus (x + z - mxz)$
$\Rightarrow x + (y + z - k) - mx(y + z - k) = (x + y - mxy) + (x + z - mxz) - k$
$\Rightarrow x + y + z - k - mxy - mxz + mkx = x + y - mxy + x + z - mxz - k$
$\Rightarrow mkx = x \Rightarrow mk = 1 \Rightarrow m = k = 1$ or $m = k = -1$, since $m, k \in \mathbf{Z}$.
9. a) We shall verify one of the distributive laws. If $a, b, c \in \mathbf{Q}$, then
$a \odot (b \oplus c) = a \odot (b + c + 7) = a + (b + c + 7) + [a(b + c + 7)]/7$
$= a + b + c + 7 + (ab/7) + (ac/7) + a$,
while
$(a \odot b) \oplus (a \odot c) = (a \odot b) + (a \odot c) + 7 = a + b + (ab/7) + a + c + (ac/7) + 7$
$= a + b + c + 7 + (ab/7) + (ac/7) + a$.
Also, the rational number $-7$ is the zero element, and the additive inverse of each rational
number $a$ is $-14 - a$.
c) For each $a \in \mathbf{Q}$, $a = a \odot u = a + u + (au/7) \Rightarrow u[1 + (a/7)] = 0 \Rightarrow u = 0$, because $a$ is
arbitrary. Hence the rational number 0 is the unity for this ring. Now let $a \in \mathbf{Q}$, where $a \ne -7$,
the zero element of the ring. Can we find $b \in \mathbf{Q}$ so that $a \odot b = 0$, that is, so that $a + b + (ab/7) = 0$?
It follows that $a + b + (ab/7) = 0 \Rightarrow b(1 + (a/7)) = -a \Rightarrow
b = (-a)/[1 + (a/7)]$. Hence every rational number, other than $-7$, is a unit.
11. b) $1, -1, i, -i$
13. $A^{-1} = \dfrac{1}{ad - bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$, $ad - bc \ne 0$
15. a) xx =x(ft+y)=axt+uxy=r+y=x
yrH=(athtHxt+tt=t+t=s
yy =yRtx)= yl byx=sts=s
tx =(ytx)x =yx+xxHs4+x=x
ty=(ytx)y=yytxyasty=y
b) Since tx = x #7 = xt, this ring is not commutative.
c) There is no unity and, consequently, no units.
d) The ring is neither an integral domain nor a field.
Section 14.2—p. 684
1. Theorem 14.10(a). If $(S, +, \cdot)$ is a subring of $R$, then $a - b, ab \in S$ for all $a, b \in S$.
Conversely, since $S \ne \emptyset$, let $a \in S$. Then $a - a = z \in S$ and $z - a = -a \in S$. Also, if $b \in S$,
then $-b \in S$, so $a - (-b) = a + b \in S$, and $S$ is a subring by Theorem 14.9.
3. a) $(ab)(b^{-1}a^{-1}) = a(bb^{-1})a^{-1} = aua^{-1} = aa^{-1} = u$ and $(b^{-1}a^{-1})(ab) = b^{-1}(a^{-1}a)b =
b^{-1}ub = b^{-1}b = u$, so $ab$ is a unit. Since the multiplicative inverse of a unit is unique,
$(ab)^{-1} = b^{-1}a^{-1}$.
ff 2-7 , fa. -2 fo 4 -15
bat=[ 1 B= |) | By =| 2
-lo 16 —39 -la-le 4 —15~
(BA) [i | pea [_s st
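5. $(-a)^{-1} = -(a^{-1})$
7. $z \in S, T \Rightarrow z \in S \cap T \Rightarrow S \cap T \ne \emptyset$. $a, b \in S \cap T \Rightarrow a, b \in S$ and $a, b \in T \Rightarrow a + b, ab \in S$
and $a + b, ab \in T \Rightarrow a + b, ab \in S \cap T$. $a \in S \cap T \Rightarrow a \in S$ and $a \in T \Rightarrow
-a \in S$ and $-a \in T \Rightarrow -a \in S \cap T$. So $S \cap T$ is a subring of $R$.
9. If not, there exist $a, b \in S$ with $a \in T_1$, $a \notin T_2$ and $b \in T_2$, $b \notin T_1$. Since $S$ is a subring of $R$, it
follows that $a + b \in S$. Hence $a + b \in T_1$ or $a + b \in T_2$.
Assume without loss of generality that $a + b \in T_1$. Since $a \in T_1$, we have $-a \in T_1$, so by the
closure under addition in $T_1$ we now find that $(-a) + (a + b) = (-a + a) + b = b \in T_1$, a
contradiction. Therefore, $S \subseteq T_1 \cup T_2 \Rightarrow S \subseteq T_1$ or $S \subseteq T_2$.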
11.
loi] 9 [oo
d) S is an integral domain, while R is a noncommutative ring with unity.
13. Since $za = z$, it follows that $z \in N(a)$ and $N(a) \ne \emptyset$. If $r_1, r_2 \in N(a)$, then $(r_1 - r_2)a =
r_1a - r_2a = z - z = z$, so $r_1 - r_2 \in N(a)$. Finally, if $r \in N(a)$ and $s \in R$, then $(rs)a =
(sr)a = s(ra) = sz = z$, so $rs, sr \in N(a)$. Hence $N(a)$ is an ideal, by Definition 14.6.
15. 2
17. a) $a = au \in aR$ since $u \in R$, so $aR \ne \emptyset$. If $ar_1, ar_2 \in aR$, then $ar_1 - ar_2 = a(r_1 - r_2) \in aR$.
Also, for $ar_1 \in aR$ and $r \in R$, we have $r(ar_1) = (ar_1)r = a(r_1r) \in aR$. Hence $aR$ is an ideal
of $R$.
b) Let $a \in R$, $a \ne z$. Then $a = au \in aR$ so $aR = R$. Since $u \in R = aR$, $u = ar$ for some
$r \in R$, and $r = a^{-1}$. Hence $R$ is a field.
19, a) (5)(49) b) 7* — c) Yes, the element (u,u.u,u) d) 44
21. b) If $R$ has a unity $u$, define $a^0 = u$, for $a \in R$, $a \ne z$. If $a$ is a unit of $R$, define $a^{-n}$ as $(a^{-1})^n$,
for $n \in \mathbf{Z}^+$.
Section 14.3-p. 696
1. a) (i) Yes (ii) No (iii) Yes b) (i) No (ii) Yes (iii) Yes
3. a) $-6, 1, 8, 15$ b) $-9, 2, 13, 24$ c) $-7, 10, 27, 44$
5. Since $a \equiv b \pmod{n}$, we may write $a = b + kn$ for some $k \in \mathbf{Z}$. And $m \mid n \Rightarrow n = \ell m$ for some
$\ell \in \mathbf{Z}$. Consequently, $a = b + kn = b + (k\ell)m$ and $a \equiv b \pmod{m}$.
7. Let $a = 8$, $b = 2$, $m = 6$, and $n = 2$. Then $\gcd(m, n) = \gcd(6, 2) = 2 > 1$, $a \equiv b \pmod{m}$ and
$a \equiv b \pmod{n}$. But $a - b = 8 - 2 = 6 \ne k(12) = k(mn)$, for any $k \in \mathbf{Z}$. Hence
$a \not\equiv b \pmod{mn}$.
9. For $n$ odd consider the $n - 1$ numbers $1, 2, 3, \ldots, n - 3, n - 2, n - 1$ as $(n - 1)/2$ pairs: 1
and $(n - 1)$, 2 and $(n - 2)$, 3 and $(n - 3)$, ..., $n - \left(\frac{n-1}{2}\right) - 1$ and $n - \left(\frac{n-1}{2}\right)$. The sum of
each pair is $n$, which is congruent to 0 modulo $n$. Hence $\sum_{i=1}^{n-1} i \equiv 0 \pmod{n}$. When $n$ is even
we consider the $n - 1$ numbers $1, 2, 3, \ldots, (n/2) - 1, (n/2), (n/2) + 1, \ldots, n - 3, n - 2,
n - 1$ as $(n/2) - 1$ pairs, namely 1 and $n - 1$, 2 and $n - 2$, 3 and $n - 3$, ..., $(n/2) - 1$ and
$(n/2) + 1$, and the single number $(n/2)$. For each pair the sum is $n$, or 0 modulo $n$, so
$\sum_{i=1}^{n-1} i \equiv (n/2) \pmod{n}$.
11. b) No,2%3 and3K5, but 5 FZ 8. Also, 2% 3 and2 KR 5, but 4 A 15.
13. a) $[17]^{-1} = [831]$ b) $[100]^{-1} = [111]$ c) $[777]^{-1} = [735]$
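Inverses of this kind can be computed with the extended Euclidean algorithm. The sketch below assumes the modulus is 1009, which is not stated in the answer itself but is the only modulus consistent with all three listed inverses; the function names are ours.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) = s*a + t*b."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t

def inverse_mod(a, n):
    """Multiplicative inverse of [a] in Z_n, if [a] is a unit."""
    g, s, _ = extended_gcd(a, n)
    if g != 1:
        raise ValueError(f"{a} is not a unit modulo {n}")
    return s % n

MODULUS = 1009   # assumed; inferred from the three answers above
```

As a check by hand, $17 \cdot 831 = 14127 = 14 \cdot 1009 + 1$.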
15. a) 16 units, 0 proper zero divisors _b) 72 units, 44 proper zero divisors
c) 1116 units, 0 proper zero divisors
17. [e) + 2(33) + eeyery'|/ (1900)
19. a) For $n = 0$ we have $10^0 = 1 = 1(-1)^0$, so $10^0 \equiv (-1)^0 \pmod{11}$. [Since $10 - (-1) = 11$,
$10 \equiv (-1) \pmod{11}$, or $10^1 \equiv (-1)^1 \pmod{11}$. Hence the result is also true for $n = 1$.] Assume
the result true for $n = k \ge 1$ and consider the case for $k + 1$. Then, since $10^k \equiv (-1)^k \pmod{11}$
and $10 \equiv (-1) \pmod{11}$, we have $10^{k+1} = 10^k \cdot 10 \equiv (-1)^k(-1) = (-1)^{k+1} \pmod{11}$.
The result now follows for all $n \in \mathbf{N}$, by the Principle of Mathematical Induction.
b) If $x_n x_{n-1} \cdots x_2 x_1 x_0 = x_n \cdot 10^n + x_{n-1} \cdot 10^{n-1} + \cdots + x_2 \cdot 10^2 + x_1 \cdot 10 + x_0$ denotes an
$(n + 1)$-digit integer, then
$x_n x_{n-1} \cdots x_2 x_1 x_0 \equiv (-1)^n x_n + (-1)^{n-1} x_{n-1} + \cdots + x_2 - x_1 + x_0 \pmod{11}$.
Proof:
$x_n x_{n-1} \cdots x_2 x_1 x_0 = x_n \cdot 10^n + x_{n-1} \cdot 10^{n-1} + \cdots + x_2 \cdot 10^2 + x_1 \cdot 10 + x_0$
$\equiv x_n(-1)^n + x_{n-1}(-1)^{n-1} + \cdots + x_2(-1)^2 + x_1(-1) + x_0$
$= (-1)^n x_n + (-1)^{n-1} x_{n-1} + \cdots + x_2 - x_1 + x_0 \pmod{11}$.
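21. Let $g = \gcd(a, n)$, $h = \gcd(b, n)$. $[a \equiv b \pmod{n}] \Rightarrow [a = b + kn$, for some $k \in \mathbf{Z}] \Rightarrow [g \mid b$
and $h \mid a]$. $[g \mid b$ and $g \mid n] \Rightarrow g \mid h$; $[h \mid a$ and $h \mid n] \Rightarrow h \mid g$. Since $g, h > 0$, it follows that $g = h$.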
23. (1) Plaintext    a  l  l  g  a  u  l  i  s  d  i  v  i  d  e  d
    (2)              0 11 11  6  0 20 11  8 18  3  8 21  8  3  4  3
    (3)              3 14 14  9  3 23 14 11 21  6 11 24 11  6  7  6
    (4) Ciphertext   D  O  O  J  D  X  O  L  V  G  L  Y  L  G  H  G

    (1) Plaintext    i  n  t  o  t  h  r  e  e  p  a  r  t  s
    (2)              8 13 19 14 19  7 17  4  4 15  0 17 19 18
    (3)             11 16 22 17 22 10 20  7  7 18  3 20 22 21
    (4) Ciphertext   L  Q  W  R  W  K  U  H  H  S  D  U  W  V
For each $\theta$ in row (2), the corresponding result below it in row (3) is $(\theta + 3) \bmod 26$.
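The four rows of the table can be reproduced with a short script. This is a sketch under the assumption that letters are numbered a = 0 through z = 25 and that spaces are dropped; `caesar_encrypt` is a hypothetical name, not something defined in the text.

```python
def caesar_encrypt(plaintext: str, shift: int = 3) -> str:
    """Shift each letter by `shift` positions modulo 26 and return uppercase
    ciphertext; non-letters are skipped (illustrative sketch)."""
    out = []
    for ch in plaintext.lower():
        if "a" <= ch <= "z":
            theta = ord(ch) - ord("a")                      # row (2)
            out.append(chr((theta + shift) % 26 + ord("A")))  # rows (3) and (4)
    return "".join(out)


print(caesar_encrypt("all gaul is divided into three parts"))
```

Decryption under the same assumptions is the same function called with `shift=-3` on the lowercase form of the ciphertext.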
25. a) (24)(8) = 192 b) (25)(20) = 500 c) (27)(18) = 486 d) (30)(8) = 240
27. a) 9 b) 10, 15, 2, 13, 11, 1, 8, 5, 9
29. Proof (By Mathematical Induction):
[Note that for $n \ge 1$, $(a^n - 1)/(a - 1) = a^{n-1} + a^{n-2} + \cdots + 1$, which can be computed in the
ring $(\mathbf{Z}, +, \cdot)$.]
When $n = 0$, $a^0 x_0 + c[(a^0 - 1)/(a - 1)] = x_0 + c[0/(a - 1)] \equiv x_0 \pmod{m}$, so the
formula is true in this first basis ($n = 0$) case. Assuming the result for $n$ ($\ge 0$) we have
$x_n \equiv a^n x_0 + c[(a^n - 1)/(a - 1)] \pmod{m}$, $0 \le x_n < m$. Continuing to the next case, we learn
that
$x_{n+1} \equiv a x_n + c \pmod{m}$
$\equiv a[a^n x_0 + c[(a^n - 1)/(a - 1)]] + c \pmod{m}$
$\equiv a^{n+1} x_0 + ac[(a^n - 1)/(a - 1)] + c(a - 1)/(a - 1) \pmod{m}$
$\equiv a^{n+1} x_0 + c[(a^{n+1} - a + a - 1)/(a - 1)] \pmod{m}$
$\equiv a^{n+1} x_0 + c[(a^{n+1} - 1)/(a - 1)] \pmod{m}$
and we select $x_{n+1}$ so that $0 \le x_{n+1} < m$. It now follows by the Principle of Mathematical
Induction that
$x_n \equiv a^n x_0 + c[(a^n - 1)/(a - 1)] \pmod{m}$, $0 \le x_n < m$.
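The closed form proved above can be compared against direct iteration of the recurrence $x_{n+1} \equiv a x_n + c \pmod{m}$. The sketch below assumes $a \ge 2$ so that the division by $a - 1$ is defined; the function names are illustrative and do not come from the text.

```python
def lcg_iterate(a: int, c: int, m: int, x0: int, n: int) -> int:
    """Apply x -> (a*x + c) mod m exactly n times, starting from x0."""
    x = x0 % m
    for _ in range(n):
        x = (a * x + c) % m
    return x


def lcg_closed_form(a: int, c: int, m: int, x0: int, n: int) -> int:
    """Evaluate a^n * x0 + c * (a^n - 1)/(a - 1) modulo m (requires a >= 2).
    The quotient is an exact integer: it equals a^(n-1) + ... + a + 1."""
    geometric_sum = (a**n - 1) // (a - 1)
    return (a**n * x0 + c * geometric_sum) % m


# Sample parameters chosen only for illustration.
assert lcg_iterate(5, 3, 16, 7, 10) == lcg_closed_form(5, 3, 16, 7, 10)
```

The integer division is carried out in $\mathbf{Z}$ before reducing modulo $m$, which mirrors the bracketed note at the start of the proof.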
31. Proof: Let $n$, $n + 1$, and $n + 2$ be three consecutive integers. Then $n^3 + (n + 1)^3 + (n + 2)^3 =
n^3 + (n^3 + 3n^2 + 3n + 1) + (n^3 + 6n^2 + 12n + 8) = (3n^3 + 15n) + 9(n^2 + 1)$. So we
consider $3n^3 + 15n = 3n(n^2 + 5)$. If $3 \mid n$, then we are finished. If not, then $n \equiv 1 \pmod{3}$ or
$n \equiv 2 \pmod{3}$. If $n \equiv 1 \pmod{3}$, then $n^2 + 5 \equiv 1 + 5 \equiv 0 \pmod{3}$, so $3 \mid (n^2 + 5)$. If
$n \equiv 2 \pmod{3}$, then $n^2 + 5 \equiv 9 \equiv 0 \pmod{3}$, and $3 \mid (n^2 + 5)$. All cases are now covered, so we
have $3 \mid [n(n^2 + 5)]$. Hence $9 \mid [3n(n^2 + 5)]$ and, consequently, 9 divides $(3n^3 + 15n) +
9(n^2 + 1) = n^3 + (n + 1)^3 + (n + 2)^3$.
33. $\sum_{k=0}^{n-1} p(k(n + 1), n, n) = \frac{1}{n+1}\binom{2n}{n}$, the $n$th Catalan number
35. a) 112 b) 031-43-3464
37. a) 1, 28, 14, 34, 2, 3 (= 2 + 1), 15 (= 14 + 1), 4 (= 3 + 1) b) 1, 2, 3, 4, 5
Section 14.4—p. 704
$s \to 0$, $t \to 1$, $v \to 2$, $w \to 3$, $x \to 4$, $y \to 5$
3. Let $(R, +, \cdot)$, $(S, \oplus, \odot)$, and $(T, +', \cdot')$ be the rings. For all $a, b \in R$, $(g \circ f)(a + b) =
g(f(a + b)) = g(f(a) \oplus f(b)) = g(f(a)) +' g(f(b)) = (g \circ f)(a) +' (g \circ f)(b)$. Also,
$(g \circ f)(a \cdot b) = g(f(a \cdot b)) = g(f(a) \odot f(b)) = g(f(a)) \cdot' g(f(b)) =
(g \circ f)(a) \cdot' (g \circ f)(b)$. Hence $g \circ f$ is a ring homomorphism.
. a) Since $f(z_R) = z_S$, it follows that $z_R \in K$ and $K \ne \emptyset$. If $x, y \in K$, then $f(x - y) =
f(x + (-y)) = f(x) \oplus f(-y) = f(x) \ominus f(y) = z_S \ominus z_S = z_S$, so $x - y \in K$. Finally, if
$x \in K$ and $r \in R$, then $f(rx) = f(r) \odot f(x) = f(r) \odot z_S = z_S$, and $f(xr) = f(x) \odot f(r) =
z_S \odot f(r) = z_S$, so $rx, xr \in K$. Consequently, $K$ is an ideal of $R$.
b) The kernel is $\{6n \mid n \in \mathbf{Z}\}$.
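. a)
$x$ (in $\mathbf{Z}_{20}$) | $f(x)$ (in $\mathbf{Z}_4 \times \mathbf{Z}_5$) | $x$ (in $\mathbf{Z}_{20}$) | $f(x)$ (in $\mathbf{Z}_4 \times \mathbf{Z}_5$)
0 | (0, 0) | 10 | (2, 0)
1 | (1, 1) | 11 | (3, 1)
2 | (2, 2) | 12 | (0, 2)
3 | (3, 3) | 13 | (1, 3)
4 | (0, 4) | 14 | (2, 4)
5 | (1, 0) | 15 | (3, 0)
6 | (2, 1) | 16 | (0, 1)
7 | (3, 2) | 17 | (1, 2)
8 | (0, 3) | 18 | (2, 3)
9 | (1, 4) | 19 | (3, 4)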
b) (i) $f((17)(19) + (12)(14)) = (1, 2)(3, 4) + (0, 2)(2, 4) = (3, 3) + (0, 3) = (3, 1)$, and
$f^{-1}(3, 1) = 11$.
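The table and the computation in part (b)(i) can be reproduced mechanically. The sketch below assumes the map is $f(x) = (x \bmod 4,\ x \bmod 5)$ on $\mathbf{Z}_{20}$, which is what the tabulated values indicate; the helper names `add` and `mul` are illustrative only.

```python
def f(x: int) -> tuple:
    """Assumed map Z_20 -> Z_4 x Z_5, read off from the table above."""
    return (x % 4, x % 5)


def add(p: tuple, q: tuple) -> tuple:
    """Componentwise addition in Z_4 x Z_5."""
    return ((p[0] + q[0]) % 4, (p[1] + q[1]) % 5)


def mul(p: tuple, q: tuple) -> tuple:
    """Componentwise multiplication in Z_4 x Z_5."""
    return ((p[0] * q[0]) % 4, (p[1] * q[1]) % 5)


table = {x: f(x) for x in range(20)}
inverse = {image: x for x, image in table.items()}  # f is one-to-one and onto

# Compute in Z_4 x Z_5, then pull the result back to Z_20.
image = add(mul(f(17), f(19)), mul(f(12), f(14)))
print(image, inverse[image])
```

Working in the product ring keeps every intermediate value small, which is the point of transferring the arithmetic through the isomorphism.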
9. a) 4 b) 1 c) No
11. No. Z, has two units, while the ring in Example 14.4 has only one unit.
13. $397 + k(648)$, $k \in \mathbf{Z}$ 15. $173 + k(210)$, $k \in \mathbf{Z}$
Supplementary
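Exercises—p. 708 1. a) False. Let $R = \mathbf{Z}$ and $S = \mathbf{Z}^+$. b) False. Let $R = \mathbf{Z}$ and $S = \{2x \mid x \in \mathbf{Z}\}$.
c) False. Let $R = M_2(\mathbf{Z})$ and $S = \left\{ \begin{bmatrix} a & 0 \\ 0 & 0 \end{bmatrix} \,\middle|\, a \in \mathbf{Z} \right\}$. d) True.
e) False. The ring $(\mathbf{Z}, +, \cdot)$ is a subring (but not a field) in $(\mathbf{Q}, +, \cdot)$.
f) False. For any prime $p$, $\{a/(p^n) \mid a, n \in \mathbf{Z}, n \ge 0\}$ is a subring in $(\mathbf{Q}, +, \cdot)$.
g) False. Consider the field in Table 14.6. h) True.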
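. a) $[a + a = (a + a)^2 = a^2 + a^2 + a^2 + a^2 = (a + a) + (a + a)] \Rightarrow [a + a = 2a = z]$.
Hence $-a = a$.
b) For each $a \in R$, $a + a = z \Rightarrow a = -a$. For $a, b \in R$, $(a + b) = (a + b)^2 =
a^2 + ab + ba + b^2 = a + ab + ba + b \Rightarrow ab + ba = z \Rightarrow ab = -ba = ba$, so $R$ is
commutative.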
. Since $az = z = za$ for all $a \in R$, we have $z \in C$ and $C \ne \emptyset$. If $x, y \in C$, then
$(x + y)a = xa + ya = ax + ay = a(x + y)$, $(xy)a = x(ya) = x(ay) = (xa)y = (ax)y =
a(xy)$, and $(-x)a = -(xa) = -(ax) = a(-x)$, for all $a \in R$, so $x + y, xy, -x \in C$.
Consequently, $C$ is a subring of $R$.
. b) Since $m, n$ are relatively prime, we can write $1 = ms + nt$ where $s, t \in \mathbf{Z}$. With $m, n > 0$ it
follows that one of $s, t$ must be positive, and the other negative. Assume (without any loss of
generality) that $s$ is negative so that $1 - ms = nt > 0$.
Then $a^n = b^n \Rightarrow (a^n)^t = (b^n)^t \Rightarrow a^{nt} = b^{nt} \Rightarrow a^{1-ms} = b^{1-ms} \Rightarrow a(a^m)^{-s} = b(b^m)^{-s}$. But
with $-s > 0$ and $a^m = b^m$, we have $(a^m)^{-s} = (b^m)^{-s}$. Consequently,
$[(a^m)^{-s} = (b^m)^{-s} \ne z$ and $a(a^m)^{-s} = b(b^m)^{-s}] \Rightarrow a = b$,
since we may use the Cancellation Law of Multiplication in an integral domain.
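. Let $x = a_1 + b_1$, $y = a_2 + b_2$, for $a_1, a_2 \in A$ and $b_1, b_2 \in B$. Then $x - y = (a_1 - a_2) +
(b_1 - b_2) \in A + B$. If $r \in R$ and $a + b \in A + B$, with $a \in A$ and $b \in B$, then $ra \in A$, $rb \in B$,
and $r(a + b) \in A + B$. Similarly, $(a + b)r \in A + B$, and $A + B$ is an ideal of $R$.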
11. Consider the numbers $x_1$, $x_1 + x_2$, $x_1 + x_2 + x_3$, ..., $x_1 + x_2 + x_3 + \cdots + x_n$. If one of these
numbers is congruent to 0 modulo $n$, the result follows. If not, there exist $1 \le i < j \le n$ with
$(x_1 + x_2 + \cdots + x_i) \equiv (x_1 + \cdots + x_i + x_{i+1} + \cdots + x_j) \pmod{n}$. Hence $n$ divides
$(x_{i+1} + \cdots + x_j)$.
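The pigeonhole argument on partial sums is constructive, so it can be turned into a search. The following is a minimal sketch; `zero_sum_block` is a hypothetical name, and treating the empty sum as an extra partial sum folds the "one of these numbers is congruent to 0" case into the general one.

```python
def zero_sum_block(xs: list) -> tuple:
    """Return (i, j) with i < j such that xs[i:j] sums to a multiple of
    n = len(xs), found by comparing partial sums modulo n."""
    n = len(xs)
    first_seen = {0: 0}  # residue of the empty partial sum
    total = 0
    for j, x in enumerate(xs, start=1):
        total = (total + x) % n
        if total in first_seen:
            return first_seen[total], j
        first_seen[total] = j
    # n + 1 partial sums fall into n residue classes, so a repeat must occur.
    raise AssertionError("unreachable by the pigeonhole principle")


i, j = zero_sum_block([3, 7, 1, 9, 4])
print(i, j, sum([3, 7, 1, 9, 4][i:j]))
```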
13. a) 1875 b) 2914 c) 3/16
15. Proof: For all $n \in \mathbf{Z}$ we find that $n^2 \equiv 0 \pmod{5}$ (when $5 \mid n$) or $n^2 \equiv 1 \pmod{5}$ or
$n^2 \equiv 4 \pmod{5}$. Suppose that 5 does not divide any of $a$, $b$, or $c$. Then
(i) $a^2 + b^2 + c^2 \equiv 3 \pmod{5}$, when $a^2 \equiv b^2 \equiv c^2 \equiv 1 \pmod{5}$;
(ii) $a^2 + b^2 + c^2 \equiv 1 \pmod{5}$, when each of two of $a^2, b^2, c^2$ is congruent to 1 modulo 5
and the other square is congruent to 4 modulo 5;
(iii) $a^2 + b^2 + c^2 \equiv 4 \pmod{5}$, when one of $a^2, b^2, c^2$ is congruent to 1 modulo 5 and each of
the other two squares is congruent to 4 modulo 5; or
(iv) $a^2 + b^2 + c^2 \equiv 2 \pmod{5}$, when $a^2 \equiv b^2 \equiv c^2 \equiv 4 \pmod{5}$.
17. $(e_1 + 1)(e_2 + 1) \cdots (e_k + 1) - 1$
Chapter 15
Boolean Algebra and Switching Functions
Section 15.1—p. 718
1. a) 1 b) 1 c) 1 d) 1 3. a) $2^n$ b) $2^{2^n}$
5. a) d.n.f. $\bar{x}y\bar{z} + x\bar{y}\bar{z} + x\bar{y}z + xy\bar{z} + xyz$
c.n.f. $(x + y + z)(x + y + \bar{z})(x + \bar{y} + \bar{z})$
b) $f = \sum m(2, 4, 5, 6, 7) = \prod M(0, 1, 3)$
. a) 2% =) 2°) 28
~mtk=2" ll. a) y+uz% Db x+y Cc) wxetz
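13. a) (i)
$f$ | $g$ | $h$ | $fg$ | $\bar{f}h$ | $gh$ | $fg + \bar{f}h + gh$ | $fg + \bar{f}h$
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
0 | 0 | 1 | 0 | 1 | 0 | 1 | 1
0 | 1 | 0 | 0 | 0 | 0 | 0 | 0
0 | 1 | 1 | 0 | 1 | 1 | 1 | 1
1 | 0 | 0 | 0 | 0 | 0 | 0 | 0
1 | 0 | 1 | 0 | 0 | 0 | 0 | 0
1 | 1 | 0 | 1 | 0 | 0 | 1 | 1
1 | 1 | 1 | 1 | 0 | 1 | 1 | 1
Alternatively, $fg + \bar{f}h = (fg + \bar{f})(fg + h) = (f + \bar{f})(g + \bar{f})(fg + h) =
1(g + \bar{f})(fg + h) = fgg + gh + \bar{f}fg + \bar{f}h = fg + gh + 0g + \bar{f}h = fg + gh + \bar{f}h$.
(ii) $fg + f\bar{g} + \bar{f}g + \bar{f}\bar{g} = f(g + \bar{g}) + \bar{f}(g + \bar{g}) = f \cdot 1 + \bar{f} \cdot 1 = f + \bar{f} = 1$
b) (i) $(f + g)(\bar{f} + h)(g + h) = (f + g)(\bar{f} + h)$
(ii) $(f + g)(f + \bar{g})(\bar{f} + g)(\bar{f} + \bar{g}) = 0$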
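The truth-table argument in part (a)(i) and its dual in part (b)(i) can be checked exhaustively over the eight assignments of $f$, $g$, $h$. This is an illustrative sketch only; it assumes the barred terms are as in the consensus identity $fg + \bar{f}h + gh = fg + \bar{f}h$, and the function names are not from the text.

```python
from itertools import product


def consensus_holds() -> bool:
    """Check fg + f'h + gh == fg + f'h for every 0/1 assignment."""
    for f, g, h in product((0, 1), repeat=3):
        nf = 1 - f  # complement of f
        if ((f & g) | (nf & h) | (g & h)) != ((f & g) | (nf & h)):
            return False
    return True


def dual_consensus_holds() -> bool:
    """Check the dual: (f + g)(f' + h)(g + h) == (f + g)(f' + h)."""
    for f, g, h in product((0, 1), repeat=3):
        nf = 1 - f
        if ((f | g) & (nf | h) & (g | h)) != ((f | g) & (nf | h)):
            return False
    return True


print(consensus_holds(), dual_consensus_holds())
```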
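15. a) $f \oplus f = 0$; $f \oplus \bar{f} = 1$; $f \oplus 1 = \bar{f}$; $f \oplus 0 = f$
b) (i) $f \oplus g = 0 \Rightarrow f\bar{g} + \bar{f}g = 0 \Rightarrow f\bar{g} = \bar{f}g = 0$. $[f = 1$ and $f\bar{g} = 0] \Rightarrow g = 1$.
$[f = 0$ and $\bar{f}g = 0] \Rightarrow g = 0$. Hence $f = g$.
(iii) $\bar{f} \oplus \bar{g} = \bar{f}\bar{\bar{g}} + \bar{\bar{f}}\bar{g} = \bar{f}g + f\bar{g} = f\bar{g} + \bar{f}g = f \oplus g$
(iv) This is the only result that is not true. When $f$ has value 1, $g$ has value 0 and $h$ value 1
(or $g$ has value 1 and $h$ value 0), then $f \oplus gh$ has value 1 but $(f \oplus g)(f \oplus h)$ has
value 0.
(v) $fg \oplus fh = \overline{fg}\,fh + fg\,\overline{fh} = (\bar{f} + \bar{g})fh + fg(\bar{f} + \bar{h}) =
\bar{f}fh + f\bar{g}h + f\bar{f}g + fg\bar{h} = f\bar{g}h + fg\bar{h} = f(\bar{g}h + g\bar{h}) = f(g \oplus h)$
(vi) $\bar{f} \oplus g = \bar{f}\bar{g} + fg = fg + \bar{f}\bar{g} = f \oplus \bar{g}$
$\overline{f \oplus g} = \overline{f\bar{g} + \bar{f}g} = (\bar{f} + g)(f + \bar{g}) = \bar{f}\bar{g} + fg = \bar{f} \oplus g$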
Section 15.2—p. 727
-a)x@y=(e+y)GyY) y——
x@y
x —
b) xy TD) [> wy Oxy ; > oy
» f(w, x,y, 2) = wxyzt(w+xt+y)z
[Figure: the simpler equivalent networks, parts (a) and (b)]
a) The output is $(x + y)(x + \bar{y}) + y$. This simplifies to $x + (y\bar{y}) + y = x + 0 + y = x + y$,
and provides us with the simpler equivalent network in part (a) of the figure.
b) Here the output is $\overline{(x + \bar{y})} + (\bar{x}\bar{y} + y)$, which simplifies to $\bar{x}\bar{\bar{y}} + \bar{x}\bar{y} + y =
\bar{x}y + \bar{x}\bar{y} + y = \bar{x}(y + \bar{y}) + y = \bar{x}(1) + y = \bar{x} + y$. This accounts for the simpler
equivalent network in part (b) of the figure.
~ a) f(w.x,yy=xy+xy Db) f(w.x.y)=x c) f(w, x,y,z) = xXZ4-
XZ
d) f(w, x,y,z) =wyZ+xyztwyzt+xyzZ e) f(w,x, ¥,z) = wy + wxz + xyz
f) f(Qv, wi. x,y, 2) = VWXYZ+ vwXZ
+ UXVZ + WZ + UWy + Vz
11. a) 2 b) 3 c) 4 d) 1
13. a) $|f^{-1}(0)| = |f^{-1}(1)| = 8$ b) $|f^{-1}(0)| = 12$, $|f^{-1}(1)| = 4$
c) $|f^{-1}(0)| = 14$, $|f^{-1}(1)| = 2$ d) $|f^{-1}(0)| = 4$, $|f^{-1}(1)| = 12$
Section 15.3—p. 733
© HUE WY TUXZ +UVZ + WZ
—"
a) f(w,x,y,z)=z b) f(w,x,y, 2) =xXVT+xy2+
+ xy
xyz
Z) = vyZ+Wxyz + DWE
Cc) fr, wi x,y,
. {b, d}, {c, d}, {d, th, {a, 8}. {e, th, {b, e}, {c, e}, {a, th. {b, 2}, {e, g}
Section 15.4—p. 741
. a) 30 b) 30 c) 1 d) 21 e) 30 f) 70
. a) $w \le 0 \Rightarrow w \cdot 0 = w$. But $w \cdot 0 = 0$, by part (a) of Theorem 15.3.
c) $y \le z \Rightarrow yz = y$, and $y \le \bar{z} \Rightarrow y\bar{z} = y$. Therefore, $y = yz = (y\bar{z})z = y(\bar{z}z) = y \cdot 0 = 0$.
~VYS<Kx
. From Theorem 15.5(a), with $x_1, x_2$ distinct atoms, if $x_1x_2 \ne 0$, then $x_1 = x_1x_2 = x_2x_1 = x_2$,
a contradiction.
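11. a) $f(0) = f(x\bar{x})$ for each $x \in \mathcal{B}_1$. $f(x\bar{x}) = f(x)f(\bar{x}) = f(x)\overline{f(x)} = 0$.
b) Follows from part (a) by duality.
c) $x \le y \Leftrightarrow xy = x \Rightarrow f(xy) = f(x) \Rightarrow f(x)f(y) = f(x) \Leftrightarrow f(x) \le f(y)$
13. a) $f(xy) = f(\overline{\bar{x} + \bar{y}}) = \overline{f(\bar{x} + \bar{y})} = \overline{f(\bar{x}) + f(\bar{y})} = \overline{f(\bar{x})} \cdot \overline{f(\bar{y})} =
\overline{\overline{f(x)}} \cdot \overline{\overline{f(y)}} = f(x) \cdot f(y)$
b) Let $\mathcal{B}_1, \mathcal{B}_2$ be Boolean algebras with $f\colon \mathcal{B}_1 \to \mathcal{B}_2$ one-to-one and onto. Then $f$ is an
isomorphism if $f(\bar{x}) = \overline{f(x)}$ and $f(xy) = f(x)f(y)$ for all $x, y \in \mathcal{B}_1$. [Follows from part (a)
by duality.]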
15. For all $1 \le i \le n$, $(x_1 + x_2 + \cdots + x_n)x_i = x_1x_i + x_2x_i + \cdots + x_{i-1}x_i + x_ix_i + x_{i+1}x_i +
\cdots + x_nx_i = 0 + 0 + \cdots + 0 + x_i + 0 + \cdots + 0 = x_i$, by part
(b) of Theorem 15.5.
Consequently, it follows from Theorem 15.7 that $(x_1 + x_2 + \cdots + x_n)x = x$ for all $x \in \mathcal{B}$.
Since the one element is unique (from Exercise 10), we conclude that $1 = x_1 + x_2 + \cdots + x_n$.
Supplementary
Exercises —p. 743 . a) When $n = 2$, $x_1 + x_2$ denotes the Boolean sum of $x_1$ and $x_2$. For $n \ge 2$, we define
$x_1 + x_2 + \cdots + x_n + x_{n+1}$ recursively by $(x_1 + x_2 + \cdots + x_n) + x_{n+1}$. (A similar definition
can be given for the Boolean product.) For $n = 2$, $\overline{x_1 + x_2} = \bar{x}_1\bar{x}_2$ is true; this is one of the
DeMorgan Laws. Assume the result for $n = k$ ($\ge 2$) and consider the case of $n = k + 1$.
$\overline{(x_1 + x_2 + \cdots + x_k + x_{k+1})} = \overline{(x_1 + x_2 + \cdots + x_k) + x_{k+1}}$
$= \overline{(x_1 + x_2 + \cdots + x_k)}\,\bar{x}_{k+1}$
$= \bar{x}_1\bar{x}_2 \cdots \bar{x}_k\bar{x}_{k+1}$
Consequently, the result follows for all $n \ge 2$, by the Principle of Mathematical Induction.
b) Follows from part (a) by duality.
. She can invite only Nettie and Cathy.
. If $x \le z$ and $y \le z$, then from Exercise 6(b) of Section 15.4 we have $x + y \le z + z$. And by the
Idempotent Law we have $z + z = z$. Conversely, suppose that $x + y \le z$. We find that
$x \le x + y$, because $x(x + y) = x + xy$ (by the Idempotent Law) $= x$ (by the Absorption Law).
Since $x \le x + y$ and $x + y \le z$, we have $x \le z$, because a partial order is transitive. (The proof
that $y \le z$ follows in a similar way.)
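. a) $x \le y \Rightarrow x + \bar{x} \le y + \bar{x} \Rightarrow 1 \le y + \bar{x} \Rightarrow y + \bar{x} = \bar{x} + y = 1$. Conversely,
$\bar{x} + y = 1 \Rightarrow x(\bar{x} + y) = x \cdot 1 \Rightarrow x\bar{x}\ (= 0) + xy = x \Rightarrow xy = x \Rightarrow x \le y$.
b) $x \le \bar{y} \Rightarrow x\bar{y} = x \Rightarrow xy = (x\bar{y})y = x(\bar{y}y) = x \cdot 0 = 0$. Conversely,
$xy = 0 \Rightarrow x = x \cdot 1 = x(y + \bar{y}) = xy + x\bar{y} = x\bar{y}$, and $x = x\bar{y} \Rightarrow x \le \bar{y}$.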
9% a) f(w,x,y,z)=wxt+xy Dd) gv, w, x,y,z) = VwWyz+xz+wyzT+xyZ
1. a) 22°) 24; 2"!
13. a) If $n = 60$, there are 12 divisors, and no Boolean algebra contains 12 elements since 12 is not
a power of 2.
b) If $n = 120$, there are 16 divisors. However, if $x = 4$, then $\bar{x} = 30$ and $x \cdot \bar{x} = \gcd(x, \bar{x}) =
\gcd(4, 30) = 2$, which is not the zero element. So the Inverse Laws are not satisfied.
Chapter 16
Groups, Coding Theory, and Polya’s Method of Enumeration
Section 16.1 —p. 751
. a) Yes. The identity is 1 and each element is its own inverse.
b) No. The set is not closed under addition and there is no identity.
c) No. The set is not closed under addition.
d) Yes. The identity is 0; the inverse of $10n$ is $10(-n)$ or $-10n$.
e) Yes. The identity is $1_A$ and the inverse of $g\colon A \to A$ is $g^{-1}\colon A \to A$.
f) Yes. The identity is 0; the inverse of $a/(2^n)$ is $(-a)/(2^n)$.
. Subtraction is not an associative (closed) binary operation for $\mathbf{Z}$. For example, $(3 - 2) - 4 =
-3 \ne 5 = 3 - (2 - 4)$.
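. Since $x, y \in \mathbf{Z} \Rightarrow x + y + 1 \in \mathbf{Z}$, the operation is a closed binary operation (or $\mathbf{Z}$ is closed
under $\circ$). For all $w, x, y \in \mathbf{Z}$, $w \circ (x \circ y) = w \circ (x + y + 1) = w + (x + y + 1) + 1 =
(w + x + 1) + y + 1 = (w \circ x) \circ y$, so the binary operation is associative. Furthermore,
$x \circ y = x + y + 1 = y + x + 1 = y \circ x$, for all $x, y \in \mathbf{Z}$, so $\circ$ is also commutative. If $x \in \mathbf{Z}$,
then $x \circ (-1) = x + (-1) + 1 = x\ [= (-1) \circ x]$, so $-1$ is the identity element for $\circ$. And
finally, for each $x \in \mathbf{Z}$, we have $-x - 2 \in \mathbf{Z}$ and $x \circ (-x - 2) = x + (-x - 2) + 1 =
-1\ [= (-x - 2) \circ x]$, so $-x - 2$ is the inverse for $x$ under $\circ$. Consequently, $(\mathbf{Z}, \circ)$ is an abelian
group.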
. $U_{20} = \{1, 3, 7, 9, 11, 13, 17, 19\}$ $U_{24} = \{1, 5, 7, 11, 13, 17, 19, 23\}$
. a) The result follows from Theorem 16.1(b) because both $(a^{-1})^{-1}$ and $a$ are inverses of $a^{-1}$.
b)
$(b^{-1}a^{-1})(ab) = b^{-1}(a^{-1}a)b = b^{-1}(e)b = b^{-1}b = e$ and
$(ab)(b^{-1}a^{-1}) = a(bb^{-1})a^{-1} = a(e)a^{-1} = aa^{-1} = e$
So $b^{-1}a^{-1}$ is an inverse of $ab$, and by Theorem 16.1(b), $(ab)^{-1} = b^{-1}a^{-1}$.
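11. a) $\{0\}$; $\{0, 6\}$; $\{0, 4, 8\}$; $\{0, 3, 6, 9\}$; $\{0, 2, 4, 6, 8, 10\}$; $\mathbf{Z}_{12}$
b) $\{1\}$; $\{1, 10\}$; $\{1, 3, 4, 5, 9\}$; $\mathbf{Z}_{11}^*$
c) $\{\pi_0\}$; $\{\pi_0, \pi_1, \pi_2\}$; $\{\pi_0, r_1\}$; $\{\pi_0, r_2\}$; $\{\pi_0, r_3\}$; $S_3$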
13. a) There are 10: five rotations through $i(72°)$, $0 \le i \le 4$, and five reflections about lines
containing a vertex and the midpoint of the opposite side.
b) For a regular $n$-gon ($n \ge 3$) there are $2n$ rigid motions. There are the $n$ rotations through
$i(360°/n)$, $0 \le i \le n - 1$. There are $n$ reflections. For $n$ odd, each reflection is about a line
through a vertex and the midpoint of the opposite side. For $n$ even, there are $n/2$ reflections
about lines through opposite vertices and $n/2$ reflections about lines through the midpoints of
opposite sides.
15. Since $eg = ge$ for all $g \in G$, it follows that $e \in H$ and $H \ne \emptyset$. If $x, y \in H$, then $xg = gx$ and
$yg = gy$ for all $g \in G$. Consequently, $(xy)g = x(yg) = x(gy) = (xg)y = (gx)y = g(xy)$ for
all $g \in G$, and we have $xy \in H$. Finally, for each $x \in H$, $g \in G$, $xg^{-1} = g^{-1}x$. So
$(xg^{-1})^{-1} = (g^{-1}x)^{-1}$, or $gx^{-1} = x^{-1}g$, and $x^{-1} \in H$. Therefore, $H$ is a subgroup of $G$.
17. b) (i) 216
(ii) $H_1 = \{(x, 0, 0) \mid x \in \mathbf{Z}_6\}$ is a subgroup of order 6
$H_2 = \{(x, y, 0) \mid x, y \in \mathbf{Z}_6,\ y = 0, 3\}$ is a subgroup of order 12
$H_3 = \{(x, y, 0) \mid x, y \in \mathbf{Z}_6\}$ has order 36
(iii) —(2, 3, 4) = (4, 3, 2): -(4, 0, 2) = (2, 0, 4); -G, 1, 2) = 5,4)
19. a) $x = 1$, $x = 4$ b) $x = 1$, $x = 10$
c) $x = x^{-1} \Rightarrow x^2 \equiv 1 \pmod{p} \Rightarrow x^2 - 1 \equiv 0 \pmod{p} \Rightarrow (x - 1)(x + 1) \equiv 0 \pmod{p} \Rightarrow
x - 1 \equiv 0 \pmod{p}$ or $x + 1 \equiv 0 \pmod{p} \Rightarrow x \equiv 1 \pmod{p}$ or $x \equiv -1 \equiv p - 1 \pmod{p}$.
d) The result is true for $p = 2$, since $(2 - 1)! = 1! \equiv -1 \pmod{2}$. For $p \ge 3$, consider the
elements $1, 2, \ldots, p - 1$ in $(\mathbf{Z}_p^*, \cdot)$. The elements $2, 3, \ldots, p - 2$ yield $(p - 3)/2$ pairs of the
form $x, x^{-1}$. (For example, when $p = 11$ we find that $2, 3, 4, \ldots, 9$ yield the four pairs 2, 6;
3, 4; 5, 9; 7, 8.) Consequently, $(p - 1)! \equiv (1)(1)^{(p-3)/2}(p - 1) \equiv p - 1 \equiv -1 \pmod{p}$.
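The pairing used in part (d) can be made concrete. The sketch below lists the pairs $x, x^{-1}$ among $2, \ldots, p - 2$ and checks that $(p - 1)! \equiv -1 \pmod{p}$; it relies on Python 3.8+ for `pow(x, -1, p)`, and the function names are illustrative rather than taken from the text.

```python
def inverse_pairs(p: int) -> list:
    """Pair each x in 2..p-2 with its multiplicative inverse modulo the prime p."""
    pairs, seen = [], set()
    for x in range(2, p - 1):
        if x in seen:
            continue
        y = pow(x, -1, p)  # modular inverse, available in Python 3.8+
        pairs.append((x, y))
        seen.update((x, y))
    return pairs


def factorial_mod(p: int) -> int:
    """Compute (p - 1)! modulo p by direct multiplication."""
    result = 1
    for k in range(1, p):
        result = (result * k) % p
    return result


print(inverse_pairs(11), factorial_mod(11))
```

For $p = 11$ the list has $(11 - 3)/2 = 4$ pairs, each multiplying to 1 modulo 11, so only the factors 1 and $p - 1$ contribute to the product.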
Section 16.2—p. 756
. b) $f(a^{-1}) \cdot f(a) = f(a^{-1} \cdot a) = f(e_G) = e_H$ and $f(a) \cdot f(a^{-1}) = f(a \cdot a^{-1}) = f(e_G) = e_H$,
so $f(a^{-1})$ is an inverse of $f(a)$. By the uniqueness of inverses (Theorem 16.1b), it follows that
$f(a^{-1}) = [f(a)]^{-1}$.
. $f(0) = (0, 0)$ $f(1) = (1, 1)$ $f(2) = (2, 0)$
$f(3) = (0, 1)$ $f(4) = (1, 0)$ $f(5) = (2, 1)$
» £(4, 6) = Sg) + 32>
Wn
. a) $o(\pi_0) = 1$, $o(\pi_1) = o(\pi_2) = 3$, $o(r_1) = o(r_2) = o(r_3) = 2$
b) (See Fig. 16.6) $o(\pi_0) = 1$, $o(\pi_1) = o(\pi_3) = 4$, $o(\pi_2) = o(r_1) = o(r_2) = o(r_3) = o(r_4) = 2$
. a) The elements of order 10 are 4, 12, 28, and 36.
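11. $\mathbf{Z}_5^* = \langle 2 \rangle = \langle 3 \rangle$; $\mathbf{Z}_7^* = \langle 3 \rangle = \langle 5 \rangle$; $\mathbf{Z}_{11}^* = \langle 2 \rangle = \langle 6 \rangle = \langle 7 \rangle = \langle 8 \rangle$
13. Let $(G, +)$, $(H, *)$, $(K, \cdot)$ be the given groups. For all $x, y \in G$, $(g \circ f)(x + y) =
g(f(x + y)) = g(f(x) * f(y)) = (g(f(x))) \cdot (g(f(y))) = ((g \circ f)(x)) \cdot ((g \circ f)(y))$, since
$f$, $g$ are homomorphisms. Hence, $g \circ f\colon G \to K$ is a group homomorphism.
15. a) $(\mathbf{Z}_{12}, +) = \langle 1 \rangle = \langle 5 \rangle = \langle 7 \rangle = \langle 11 \rangle$
$(\mathbf{Z}_{16}, +) = \langle 1 \rangle = \langle 3 \rangle = \langle 5 \rangle = \langle 7 \rangle = \langle 9 \rangle = \langle 11 \rangle = \langle 13 \rangle = \langle 15 \rangle$
$(\mathbf{Z}_{24}, +) = \langle 1 \rangle = \langle 5 \rangle = \langle 7 \rangle = \langle 11 \rangle = \langle 13 \rangle = \langle 17 \rangle = \langle 19 \rangle = \langle 23 \rangle$
b) Let $G = \langle a^k \rangle$. Since $G = \langle a \rangle$, we have $a = (a^k)^s$ for some $s \in \mathbf{Z}$. Then $a^{1-ks} = e$, so
$1 - ks = tn$ since $o(a) = n$. $1 - ks = tn \Rightarrow 1 = ks + tn \Rightarrow \gcd(k, n) = 1$. Conversely, let
$G = \langle a \rangle$ where $a^k \in G$ and $\gcd(k, n) = 1$. Then $\langle a^k \rangle \subseteq G$. $\gcd(k, n) = 1 \Rightarrow 1 = ks + tn$, for
some $s, t \in \mathbf{Z} \Rightarrow a = a^1 = a^{ks+tn} = (a^k)^s(a^n)^t = (a^k)^s(e)^t = (a^k)^s \in \langle a^k \rangle$. Hence $G \subseteq \langle a^k \rangle$. So
$G = \langle a^k \rangle$, or $a^k$ generates $G$.
c) $\phi(n)$.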
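Parts (a) through (c) can be cross-checked for the additive groups $\mathbf{Z}_n$: an element $k$ generates the group exactly when $\gcd(k, n) = 1$, so there are $\phi(n)$ generators. The sketch below finds generators by brute force and compares them with the gcd criterion; the function names are illustrative only.

```python
from math import gcd


def generated_subgroup(k: int, n: int) -> set:
    """Elements of <k> in (Z_n, +): all multiples of k reduced modulo n."""
    return {(k * i) % n for i in range(n)}


def generators(n: int) -> list:
    """Brute-force list of the k in Z_n whose cyclic subgroup is all of Z_n."""
    return [k for k in range(n) if len(generated_subgroup(k, n)) == n]


for n in (12, 16, 24):
    coprime = [k for k in range(1, n) if gcd(k, n) == 1]
    assert generators(n) == coprime
    print(n, generators(n))
```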
Section 16.3—p. 758
lea) {(2 3 3 76 9 7 3).G 7 3 3)0 3 3 DO}
biG i 4 )F={G 3 7 0 33 70 33 9.6 7 2 DI
G23 DF={(0 3 4 )0 97 9673 9G 3 3 D}
(24 )47={0 3 7 6 42 2G 7 3 90 3 3 DJ
(393 J¥={G 93 1).G 7 4 )0 3 7 9.0 3 3 OD}
(G42 J4={0 7 3 3.63976 3 7 9-0 4 3 9}
(| 3 3 jH=H4
3. 12
5. From Lagrange's Theorem we know that $|K| = 66$ ($= 2 \cdot 3 \cdot 11$) divides $|H|$ and that $|H|$
divides $|G| = 660$ ($= 2^2 \cdot 3 \cdot 5 \cdot 11$). Consequently, since $K \ne H$ and $H \ne G$, it follows that
$|H|$ is $2(2 \cdot 3 \cdot 11) = 132$ or $5(2 \cdot 3 \cdot 11) = 330$.
.a) Lete=({ 3 3 d,a=(6 7 3 9.6=() 2 3 4d ands=() 3 3 4),
It follows from Theorem 16.3 that H is a subgroup of G. And since the entries in the
accompanying table are symmetric about the diagonal from the upper left to the lower right, we
have H an abelian subgroup of G.
b) Since |G| = 4! = 24 and |H| = 4, there are 24/4 = 6 left cosets of H in G.
c) Consider the function $f\colon H \to \mathbf{Z}_2 \times \mathbf{Z}_2$ defined by
$f(e) = (0, 0)$, $f(\alpha) = (1, 0)$, $f(\beta) = (0, 1)$, $f(\delta) = (1, 1)$.
This function $f$ is one-to-one and onto, and for all $x, y \in H$ we find that
$f(xy) = f(x) \oplus f(y)$.
Consequently, $f$ is an isomorphism.
(Note: There are other possible answers that can be given here. In fact, there are six possible
isomorphisms that one can define here.)
. a) If $H$ is a proper subgroup of $G$, then by Lagrange's Theorem, $|H|$ is 2 or $p$. If $|H| = 2$, then
$H = \{e, x\}$ where $x^2 = e$, so $H = \langle x \rangle$. If $|H| = p$, let $y \in H$, $y \ne e$. Then $o(y) = p$, so
$H = \langle y \rangle$.
b) Let $x \in G$, $x \ne e$. Then $o(x) = p$ or $o(x) = p^2$. If $o(x) = p$, then $|\langle x \rangle| = p$. If $o(x) = p^2$,
then $G = \langle x \rangle$ and $\langle x^p \rangle$ is a subgroup of $G$ of order $p$.
11. b) Let $x \in H \cap K$. If the order of $x$ is $r$, then $r$ must divide both $m$ and $n$. Since $\gcd(m, n) = 1$,
it follows that $r = 1$, so $x = e$ and $H \cap K = \{e\}$.
13. a) In $(\mathbf{Z}_p^*, \cdot)$ there are $p - 1$ elements, so by Exercise 8, for each $[x] \in (\mathbf{Z}_p^*, \cdot)$, $[x]^{p-1} = [1]$,
or $x^{p-1} \equiv 1 \pmod{p}$, or $x^p \equiv x \pmod{p}$. For all $a \in \mathbf{Z}$, if $p \mid a$, then $a \equiv 0 \pmod{p}$ and
$a^p \equiv 0 \equiv a \pmod{p}$. If $p \nmid a$, then $a \equiv b \pmod{p}$ where $1 \le b \le p - 1$, and
$a^p \equiv b^p \equiv b \equiv a \pmod{p}$.
b) In the group $G$ of units of $\mathbf{Z}_n$, there are $\phi(n)$ elements. If $a \in \mathbf{Z}$ and $\gcd(a, n) = 1$, then
$[a] \in G$ and $[a]^{\phi(n)} = [1]$ or $a^{\phi(n)} \equiv 1 \pmod{n}$.
c) and d) These results follow from Exercises 6 and 8. They are special cases of Exercise 8.
Section 16.4—p. 761
. 0462 0170 1809 0462 1809 1981 0305
. DRIVESAFELYX 5. $p = 157$, $q = 773$
Section 16.5—p. 765
. a) $e = 0001001$ b) $r = 1111011$ c) $c = 0101000$
bod
. a) (i) $D(111101100) = 101$ (ii) $D(000100011) = 000$
(iii) $D(010011111) = 011$
b) 000000000, 000000001, 100000000 c) 64
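The three decodings in part (a) are consistent with a triple-repetition scheme decoded by majority vote. The sketch below assumes the encoding sends a 3-bit message $w$ to $www$ (the exercise statement is not reproduced here, so this is an inference from the answers); `decode_majority` is a hypothetical name.

```python
def decode_majority(received: str, m: int = 3) -> str:
    """Split a 3m-bit word into three m-bit blocks and take a bitwise
    majority vote (assumed triple-repetition decoding)."""
    blocks = [received[i * m:(i + 1) * m] for i in range(3)]
    return "".join(
        "1" if sum(block[j] == "1" for block in blocks) >= 2 else "0"
        for j in range(m)
    )


for word in ("111101100", "000100011", "010011111"):
    print(word, decode_majority(word))
```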
Sections 16.6 and 16.7-
p. 772 ~ S(101010, 1) = {101010, 001010, 111010, 100010, 101110, 101000, 101011}
SCULI111, 1) = {111111, OVLI11, 101111, 110111, 111011, 111101, 111110}
~ a) |S¢x, 1)| = 11; |S, 2)| = 56; |S(x, 3)| = 176
b) S@,H)=14+() +6) +--+) = Uh OF)
. a) The minimum distance between code words is 3. The code can detect all errors of weight $\le$
2 or correct all single errors.
b) The minimum distance between code words is 5. The code can detect all errors of weight $\le$
4 or correct all errors of weight $\le$ 2.
c) The minimum distance is 2. The code detects all single errors but has no correction
capability.
7. a) $C = \{00000, 10110, 01011, 11101\}$. The minimum distance between code words is 3, so the
code can detect all errors of weight $\le$ 2 or correct all single errors.
b) $H = \begin{bmatrix} 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 & 1 \end{bmatrix}$
c) (i) 01 (ii) 11 (v) 11 (vi) 10
For (iii) and (iv) the syndrome is $(111)^{\mathrm{tr}}$, which is not a column of $H$. Assuming a double error,
if $(111)^{\mathrm{tr}} = (110)^{\mathrm{tr}} + (001)^{\mathrm{tr}}$, then the decoded received word is 01 [for (iii)] and 10 [for (iv)]. If
$(111)^{\mathrm{tr}} = (011)^{\mathrm{tr}} + (100)^{\mathrm{tr}}$, we get 10 [for (iii)] and 01 [for (iv)].
9. $G = [I_8 \mid A]$ where $I_8$ is the $8 \times 8$ multiplicative identity matrix and $A$ is a column of eight 1's;
$H = [A^{\mathrm{tr}} \mid 1] = [11111111 \mid 1]$.
11. Compare the generator (parity-check) matrix in Exercise 9 with the parity-check (generator)
matrix in Exercise 10.
Sections 16.8 and 16.9-
p. 779 1, (75°); 255
3. a) Syndrome | Coset Leader |
000 | 00000 | 10110 | 01011 | 11101
110 | 10000 | 00110 | 11011 | 01101
011 | 01000 | 11110 | 00011 | 10101
100 | 00100 | 10010 | 01111 | 11001
010 | 00010 | 10100 | 01001 | 11111
001 | 00001 | 10111 | 01010 | 11100
101 | 11000 | 01110 | 10011 | 00101
111 | 01100 | 11010 | 00111 | 10001
(The last two rows are not unique.)
b) Received Word | Code Word | Decoded Message
11110 | 10110 | 10
11101 | 11101 | 11
11011 | 01011 | 01
10100 | 10110 | 10
10011 | 01011 | 01
10101 | 11101 | 11
11111 | 11101 | 11
01100 | 00000 | 00
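The decoding table in part (b) can be regenerated from the syndromes and coset leaders in part (a). The sketch below assumes the parity-check matrix with rows 10100, 11010, 01001 (the matrix given for the code {00000, 10110, 01011, 11101} earlier in these solutions) and the particular leaders chosen for syndromes 101 and 111; other leader choices for those two rows would decode some words differently.

```python
# Assumed parity-check matrix H for the code {00000, 10110, 01011, 11101}.
H = [(1, 0, 1, 0, 0), (1, 1, 0, 1, 0), (0, 1, 0, 0, 1)]

# Coset leaders keyed by syndrome, as listed in part (a).
LEADERS = {
    "000": "00000", "110": "10000", "011": "01000", "100": "00100",
    "010": "00010", "001": "00001", "101": "11000", "111": "01100",
}


def syndrome(word: str) -> str:
    """Compute H * word^tr over Z_2 and return it as a 3-bit string."""
    return "".join(
        str(sum(h * int(b) for h, b in zip(row, word)) % 2) for row in H
    )


def decode(word: str) -> tuple:
    """Add the coset leader for the word's syndrome, then read off the
    first two bits as the message (the code is systematic)."""
    leader = LEADERS[syndrome(word)]
    code_word = "".join(str(int(a) ^ int(b)) for a, b in zip(word, leader))
    return code_word, code_word[:2]


for received in ("11110", "11101", "11011", "10100"):
    print(received, *decode(received))
```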
. a) $G$ is $57 \times 63$; $H$ is $6 \times 63$ b) The rate is $57/63$.
» a) (0.99)’ + (7)(0.99)°(0.01) —_-b) [(0.99)7 + (7)(0.99)°(0.01)P
Section 16.10—p. 784
a) a*= Ci Cy C3 Cy Cy Cy Cr Cg Cy Cro Cu Cire Cis Cia Cis Cie
" 2 Cy Cy Cs Cz Cz Cg Co Co Cp Cro Cu Cia Cis Cir C13 Cie
pe {Ol Cp C3 Cy Cs Co C7 Cg Co Cro Cu Ciz C13 Cia Cis Cie
4 C; Cy C3 Co Cs Co Cg Cz Co Cro Cu Ciz2 Cis Cia C13 Cis
b) (w)*= Cy Cy C3 Ca Cys Co Cr Cg Co Cro Cn Cre C13 Cia Cis Cre
Cy) Cs Cy Cz Cy Co Co C7 Cg Cu Cio Cis Ci2 Cis Cia Cie
= (at)
¢) merk = C; Co C3 Cy C5 Co C7 Cg Co Cio Cu Cir2 Ci3 Cia Cis ci)
a4 Ci Cs Cy C3 Cy Cy Co Cg Cp Cu Cw C3 Cr Cis Cra Cio
= (m3r4)*
. a) $o(\alpha) = 7$; $o(\beta) = 12$; $o(\gamma) = 3$; $o(\delta) = 6$
b) Let $\alpha \in S_n$, with $\alpha = c_1c_2 \cdots c_k$, a product of disjoint cycles. Then $o(\alpha)$ is the lcm of
$\ell(c_1), \ell(c_2), \ldots, \ell(c_k)$, where $\ell(c_i)$ = length of $c_i$, for $1 \le i \le k$.
. a) 8 b) 39 7. a) 70 b) 55
. Triangular figure: a) 8 b) 8 Square figure: a) 12 b) 12
. a) 140 b) 102 13. 315
Section 16.11 —p. 788
. a) 165 b) 120
. Triangular figure: a) 96 b) 80
Square figure: a) 280 b) 220
Hexagonal figure: a) 131,584 b) 70,144
. a) 2635 b) 1505
c) [Figure: a coloring of the figure with the labels R, B, Y, G, W; layout not recoverable]
c) No: $k = 21$ and $m = 21$, so $km = 441 \ne 954 = n$. Here the location of a certain edge must
be considered relative to the location of the vertices. For example, two colorings of the whole
figure need not be equivalent even though their vertex colorings are equivalent and their edge
colorings are equivalent.
[Figure: example colorings with the labels R, W, B illustrating the inequivalent pair; layout not recoverable]
Section 16.12—p. 793
. a) (i) and (ii) $r^4 + w^4 + r^3w + 2r^2w^2 + rw^3$
b) (i) $(1/4)[(r + b + w)^4 + 2(r^4 + b^4 + w^4) + (r^2 + b^2 + w^2)^2]$
(ii) $(1/8)[(r + b + w)^4 + 2(r^4 + b^4 + w^4) + 3(r^2 + b^2 + w^2)^2
+ 2(r + b + w)^2(r^2 + b^2 + w^2)]$
. a) 10
b) $(1/24)[(r + w)^6 + 6(r + w)^2(r^4 + w^4) + 3(r + w)^2(r^2 + w^2)^2 + 6(r^2 + w^2)^3
+ 8(r^3 + w^3)^2]$
c) 2
. Let $g$ = green and $y$ = gold.
Triangular figure: $(1/6)[(g + y)^4 + 2(g + y)(g^3 + y^3) + 3(g + y)^2(g^2 + y^2)]$
Square figure: $(1/8)[(g + y)^5 + 2(g + y)(g^4 + y^4) + 3(g + y)(g^2 + y^2)^2
+ 2(g + y)^3(g^2 + y^2)]$
Hexagonal figure: $(1/4)[(g + y)^9 + 2(g + y)(g^2 + y^2)^4 + (g + y)^5(g^2 + y^2)^2]$
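Setting every color variable to 1 in a pattern inventory counts the distinct colorings, which gives a quick consistency check. The sketch below encodes each inventory as (group order, list of (coefficient, number of cycles)) pairs; this encoding and the cycle counts are my reading of the three formulas for the triangular, square, and hexagonal figures, not data taken from the exercise statement. With 4 colors the counts agree with the part (b) answers listed for Section 16.11, and with 2 colors the first two agree with the Section 16.10 answers.

```python
def count_colorings(order: int, terms: list, m: int) -> int:
    """Burnside-style count: (1/order) * sum of coef * m**cycles, where each
    term (coef, cycles) stands for coef group elements with that many cycles."""
    total = sum(coef * m**cycles for coef, cycles in terms)
    assert total % order == 0  # the average must be an integer
    return total // order


# Assumed encodings of the three pattern inventories (see lead-in).
TRIANGLE = (6, [(1, 4), (2, 2), (3, 3)])
SQUARE = (8, [(1, 5), (2, 2), (3, 3), (2, 4)])
HEXAGON = (4, [(1, 9), (2, 5), (1, 7)])

for name, (order, terms) in (("triangle", TRIANGLE), ("square", SQUARE), ("hexagon", HEXAGON)):
    print(name, count_colorings(order, terms, 4))
```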
. a) 136 +b) (1/2 + (7? + w’)4]
)[(r
+ w)8 se) 38: 16 9, (m+n)
Supplementary
Exercises—p. 797 1. a) Since $f(e_G) = e_H$, it follows that $e_G \in K$ and $K \ne \emptyset$. If $x, y \in K$, then $f(x) = f(y) = e_H$
and $f(xy) = f(x)f(y) = e_He_H = e_H$, so $xy \in K$. Also, for $x \in K$, $f(x^{-1}) = [f(x)]^{-1} =
e_H^{-1} = e_H$, so $x^{-1} \in K$. Hence $K$ is a subgroup of $G$.
b) If $x \in K$, then $f(x) = e_H$. For all $g \in G$,
$f(gxg^{-1}) = f(g)f(x)f(g^{-1}) = f(g)e_Hf(g^{-1}) = f(g)f(g^{-1}) = f(gg^{-1}) = f(e_G) = e_H$.
Hence, for all $x \in K$, $g \in G$, we find that $gxg^{-1} \in K$.
. Let $a, b \in G$. Then $a^2b^2 = ee = e = (ab)^2 = abab$. But $a^2b^2 = abab \Rightarrow aabb = abab \Rightarrow
ab = ba$, so $G$ is abelian.
. Let $G = \langle g \rangle$ and let $h = f(g)$. If $h_1 \in H$, then $h_1 = f(g^n)$ for some $n \in \mathbf{Z}$, since $f$ is onto and
$G$ is cyclic. Therefore, $h_1 = f(g^n) = [f(g)]^n = h^n$, and $H = \langle h \rangle$.
7. For alla,
be G,
(aoca)ob!ob=bob!olfa!oays
aca 'ob=boa'!oasacbh=boa,
and so it follows that (G, 0) is an abelian group.
. a) Consider a permutation $\sigma$ that is counted in $P(n + 1, k)$. If $(n + 1)$ is a cycle (of length 1)
in $\sigma$, then $\sigma$ (restricted to $\{1, 2, 3, \ldots, n\}$) is counted in $P(n, k - 1)$. Otherwise, consider each
permutation $\tau$ that is counted in $P(n, k)$. For each cycle of $\tau$, say $(a_1 a_2 \cdots a_r)$, there are $r$
locations in which to place $n + 1$: (1) between $a_1$ and $a_2$; (2) between $a_2$ and $a_3$; ...; ($r - 1$)
between $a_{r-1}$ and $a_r$; and ($r$) between $a_r$ and $a_1$. Hence there are $n$ locations, in total, to locate
$n + 1$ in $\tau$. Consequently, $P(n + 1, k) = P(n, k - 1) + nP(n, k)$.
b) $\sum_{k=1}^{n} P(n, k)$ counts all of the permutations in $S_n$, which has $n!$ elements.
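The recurrence in part (a) and the sum in part (b) can be checked against a brute-force count of permutations by number of cycles. This is a sketch only: it takes $P(n, k)$ to mean the number of permutations of $\{1, \ldots, n\}$ with exactly $k$ cycles, which is how the argument above uses it, and the boundary value $P(0, 0) = 1$ is a convenience assumption.

```python
from functools import lru_cache
from itertools import permutations
from math import factorial


@lru_cache(maxsize=None)
def P(n: int, k: int) -> int:
    """Number of permutations of n symbols with k cycles, computed from
    P(n + 1, k) = P(n, k - 1) + n * P(n, k)."""
    if n == 0:
        return 1 if k == 0 else 0
    if k <= 0 or k > n:
        return 0
    return P(n - 1, k - 1) + (n - 1) * P(n - 1, k)


def cycle_count(perm: tuple) -> int:
    """Count the cycles of a permutation given as a tuple of images of 0..n-1."""
    seen, cycles = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            cycles += 1
            j = start
            while j not in seen:
                seen.add(j)
                j = perm[j]
    return cycles


n = 5
brute = [sum(1 for p in permutations(range(n)) if cycle_count(p) == k) for k in range(1, n + 1)]
print(brute, [P(n, k) for k in range(1, n + 1)], factorial(n))
```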
11. a) Suppose that $n$ is composite. We consider two cases.
(1) $n = m \cdot r$, where $1 < m < r < n$: Here $(n - 1)! = 1 \cdot 2 \cdots (m - 1) \cdot m \cdot (m + 1) \cdots
(r - 1) \cdot r \cdot (r + 1) \cdots (n - 1) \equiv 0 \pmod{n}$. Hence $(n - 1)! \not\equiv -1 \pmod{n}$.
(2) $n = q^2$, where $q$ is a prime: If $(n - 1)! \equiv -1 \pmod{n}$ then $0 \equiv q(n - 1)! \equiv q(-1) \equiv
n - q \not\equiv 0 \pmod{n}$. So in this case we also have $(n - 1)! \not\equiv -1 \pmod{n}$.
b) From Wilson's Theorem, when $p$ is an odd prime, we find that
$-1 \equiv (p - 1)! \equiv (p - 3)!(p - 2)(p - 1) \equiv (p - 3)!(p^2 - 3p + 2) \equiv 2(p - 3)! \pmod{p}$.
Chapter 17
Finite Fields and Combinatorial Designs
Section 17.1—p. 806
~ fix) + a(x) = 2x4 45x73 44°4+5
f (x) g(x) = 6x7 + 2x84 3x9 44x44 2x7 +4? 44x44
$(10)(11)^2$; $(10)(11)^3$; $(10)(11)^4$; $(10)(11)^n$
a) and b) $f(x) = (x^2 + 4)(x - 2)(x + 2)$; the roots are $\pm 2$.
c) $f(x) = (x + 2i)(x - 2i)(x - 2)(x + 2)$; the roots are $\pm 2$, $\pm 2i$.
d) (a) $f(x) = (x^2 - 5)(x^2 + 5)$; there are no rational roots.
(b) $f(x) = (x - \sqrt{5})(x + \sqrt{5})(x^2 + 5)$; the roots are $\pm\sqrt{5}$.
(c) $f(x) = (x - \sqrt{5})(x + \sqrt{5})(x - \sqrt{5}\,i)(x + \sqrt{5}\,i)$; the roots are $\pm\sqrt{5}$, $\pm i\sqrt{5}$.
. a) f(3)= 8060 b) f=1l oc) f(-9) = f2) =6
11. 4,6; p—1
13. Let $f(x) = \sum_{i=0}^{m} a_ix^i$ and $h(x) = \sum_{i=0}^{k} b_ix^i$, where $a_i \in R$ for $0 \le i \le m$, $b_i \in R$ for
$0 \le i \le k$, and $m \le k$. Then $f(x) + h(x) = \sum_{i=0}^{k} (a_i + b_i)x^i$, where $a_{m+1} = a_{m+2} = \cdots =
a_k = z$, the zero of $R$, so $G(f(x) + h(x)) = G\left(\sum_{i=0}^{k} (a_i + b_i)x^i\right) = \sum_{i=0}^{k} g(a_i + b_i)x^i =
\sum_{i=0}^{k} [g(a_i) + g(b_i)]x^i = \sum_{i=0}^{k} g(a_i)x^i + \sum_{i=0}^{k} g(b_i)x^i = G(f(x)) + G(h(x))$. Also,
$f(x)h(x) = \sum_{i=0}^{m+k} c_ix^i$, where $c_i = a_ib_0 + a_{i-1}b_1 + \cdots + a_1b_{i-1} + a_0b_i$, and
$G(f(x)h(x)) = G\left(\sum_{i=0}^{m+k} c_ix^i\right) = \sum_{i=0}^{m+k} g(c_i)x^i$.
Since $g(c_i) = g(a_i)g(b_0) + g(a_{i-1})g(b_1) + \cdots + g(a_1)g(b_{i-1}) + g(a_0)g(b_i)$, it follows that
$\sum_{i=0}^{m+k} g(c_i)x^i = \left(\sum_{i=0}^{m} g(a_i)x^i\right)\left(\sum_{i=0}^{k} g(b_i)x^i\right) = G(f(x))\,G(h(x))$.
Consequently, $G\colon R[x] \to S[x]$ is a ring homomorphism.
15. In $\mathbf{Z}_4[x]$, $(2x + 1)(2x + 1) = 1$, so $(2x + 1)$ is a unit. This does not contradict Exercise 14
because $(\mathbf{Z}_4, +, \cdot)$ is not an integral domain.
17. First note that for $f(x) = a_nx^n + a_{n-1}x^{n-1} + \cdots + a_2x^2 + a_1x + a_0$, we have $a_n + a_{n-1}
+ \cdots + a_2 + a_1 + a_0 = 0$ if and only if $f(1) = 0$. Since the zero polynomial is in $S$, the set $S$ is
not empty. With $f(x)$ as given here, let $g(x) = b_mx^m + b_{m-1}x^{m-1} + \cdots + b_2x^2 + b_1x +
b_0 \in S$. (Here $m \le n$, and for $m < n$ we have $b_{m+1} = b_{m+2} = \cdots = b_n = 0$.) Then
$f(1) - g(1) = 0 - 0 = 0$, so $f(x) - g(x) \in S$.
Now consider $h(x) = \sum_{i=0}^{k} r_ix^i \in F[x]$. Here $h(x)f(x) \in F[x]$ and $h(1)f(1) =
h(1) \cdot 0 = 0$, so $h(x)f(x) \in S$.
Consequently, $S$ is an ideal in $F[x]$.
Section 17.2—p. 813
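. a) $x^2 + 3x - 1$ is irreducible over $\mathbf{Q}$. Over $\mathbf{R}$, $\mathbf{C}$,
$x^2 + 3x - 1 = [x - ((-3 + \sqrt{13})/2)][x - ((-3 - \sqrt{13})/2)]$.
b) $x^4 - 2$ is irreducible over $\mathbf{Q}$.
Over $\mathbf{R}$, $x^4 - 2 = (x - \sqrt[4]{2})(x + \sqrt[4]{2})(x^2 + \sqrt{2})$;
$x^4 - 2 = (x - \sqrt[4]{2})(x + \sqrt[4]{2})(x - \sqrt[4]{2}\,i)(x + \sqrt[4]{2}\,i)$ over $\mathbf{C}$.
c) $x^2 + x + 1 = (x + 2)(x + 2)$ over $\mathbf{Z}_3$. Over $\mathbf{Z}_5$, $x^2 + x + 1$ is irreducible; $x^2 + x + 1 =
(x + 5)(x + 3)$ over $\mathbf{Z}_7$.
d) $x^4 + x^3 + 1$ is irreducible over $\mathbf{Z}_2$.
e) $x^3 + 3x^2 - x + 1$ is irreducible over $\mathbf{Z}_5$.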
. Degree 1:x;x+1 Degree 2:x*+x+1 Degree 3:34 x7 +1; 93 +241 5. 7°
Land
. a) Yes, since the coefficients of the polynomials are from a field.
b) $h(x) \mid f(x), g(x) \Rightarrow f(x) = h(x)u(x)$, $g(x) = h(x)v(x)$, for some $u(x), v(x) \in F[x]$.
$m(x) = s(x)f(x) + t(x)g(x)$ for some $s(x), t(x) \in F[x]$, so
$m(x) = h(x)[s(x)u(x) + t(x)v(x)]$ and $h(x) \mid m(x)$.
c) If $m(x) \nmid f(x)$, then $f(x) = q(x)m(x) + r(x)$, where $0 \le \deg r(x) < \deg m(x)$.
$m(x) = s(x)f(x) + t(x)g(x)$ so $r(x) = f(x) - q(x)[s(x)f(x) + t(x)g(x)]
= (1 - q(x)s(x))f(x) - q(x)t(x)g(x)$, so $r(x) \in S$.
With $\deg r(x) < \deg m(x)$ we contradict the choice of $m(x)$. Hence $r(x) = 0$ and $m(x) \mid f(x)$.
. a) The ged is (x— 1) = (1/17) (0° — x4 4+ x3 +x? —x - 1)
— (1/17) (x? + x — 2)(x3 — 2x? + Sx — 8).
b) The gcd is 1 = (4+ DO4742°4+1D4 07427? 4a)? +241).
c) The gcd is x? + 2x +1 = (x4 + 2x7 42% +2) + (x + 2)(2x7 + 2x? + % +1).
11. a=0,b=0;a=0,b=1
13. a) $f(x) \equiv f_1(x) \pmod{s(x)} \Rightarrow f(x) = f_1(x) + h(x)s(x)$, for some $h(x) \in F[x]$, and
$g(x) \equiv g_1(x) \pmod{s(x)} \Rightarrow g(x) = g_1(x) + k(x)s(x)$, for some $k(x) \in F[x]$. Hence $f(x) +
g(x) = f_1(x) + g_1(x) + (h(x) + k(x))s(x)$, so $f(x) + g(x) \equiv f_1(x) + g_1(x) \pmod{s(x)}$, and
$f(x)g(x) = f_1(x)g_1(x) + (f_1(x)k(x) + g_1(x)h(x) + h(x)k(x)s(x))s(x)$, so $f(x)g(x) \equiv
f_1(x)g_1(x) \pmod{s(x)}$.
b) These properties follow from the corresponding properties for $F[x]$. For example, for the
distributive law,
$[f(x)]([g(x)] + [h(x)]) = [f(x)][g(x) + h(x)] = [f(x)(g(x) + h(x))]$
$= [f(x)g(x) + f(x)h(x)] = [f(x)g(x)] + [f(x)h(x)]$
$= [f(x)][g(x)] + [f(x)][h(x)]$.
d) A nonzero element of $F[x]/(s(x))$ has the form $[f(x)]$, where $f(x) \ne 0$ and $\deg f(x) <
\deg s(x)$. With $f(x)$, $s(x)$ relatively prime, there exist $r(x), t(x)$ with $1 = f(x)r(x) + s(x)t(x)$,
so $1 \equiv f(x)r(x) \pmod{s(x)}$ or $[1] = [f(x)][r(x)]$. Hence $[r(x)] = [f(x)]^{-1}$.
e) $q^n$
15. a) $[2x + 1]$ b) $[2x + 1]$ c) $[2x]$ 17. a) $p^n$ b) $\phi(p^n - 1)$
19. a) 6 b) 12 c) 12 d) $\mathrm{lcm}(m, n)$ e) 0
21. 101, 103, 107, 109, 113, 121, 125, 127, 128, 131, 137, 139, 149
23. For $s(x) = x^3 + x^2 + x + 2 \in \mathbf{Z}_3[x]$ one finds that $s(0) = 2$, $s(1) = 2$, and $s(2) = 1$. It then
follows from part (b) of Theorem 17.7 and parts (b) and (c) of Theorem 17.11 that $\mathbf{Z}_3[x]/(s(x))$
is a finite field with $3^3 = 27$ elements.
25. a) Since $0 = 0 + 0\sqrt{2} \in \mathbf{Q}[\sqrt{2}]$, the set $\mathbf{Q}[\sqrt{2}]$ is nonempty. For $a + b\sqrt{2}$, $c + d\sqrt{2} \in \mathbf{Q}[\sqrt{2}]$,
we have
$(a + b\sqrt{2}) - (c + d\sqrt{2}) = (a - c) + (b - d)\sqrt{2}$, with $(a - c), (b - d) \in \mathbf{Q}$; and
$(a + b\sqrt{2})(c + d\sqrt{2}) = (ac + 2bd) + (ad + bc)\sqrt{2}$, with $ac + 2bd$, $ad + bc \in \mathbf{Q}$.
Consequently, it follows from part (a) of Theorem 14.10 that $\mathbf{Q}[\sqrt{2}]$ is a subring of $\mathbf{R}$.
b) To show that $\mathbf{Q}[\sqrt{2}]$ is a subfield of $\mathbf{R}$ we need to find in $\mathbf{Q}[\sqrt{2}]$ a multiplicative inverse for
each nonzero element in $\mathbf{Q}[\sqrt{2}]$. Let $a + b\sqrt{2} \in \mathbf{Q}[\sqrt{2}]$ with $a + b\sqrt{2} \ne 0$. If $b = 0$, then $a \ne 0$
and $a^{-1} \in \mathbf{Q}$, and $a^{-1} + 0 \cdot \sqrt{2} \in \mathbf{Q}[\sqrt{2}]$. For $b \ne 0$, we need to find $c + d\sqrt{2} \in \mathbf{Q}[\sqrt{2}]$
so that
$(a + b\sqrt{2})(c + d\sqrt{2}) = 1$.
Now $(a + b\sqrt{2})(c + d\sqrt{2}) = 1 \Rightarrow (ac + 2bd) + (ad + bc)\sqrt{2} = 1 \Rightarrow ac + 2bd = 1$ and
$ad + bc = 0 \Rightarrow c = -ad/b$ and $a(-ad/b) + 2bd = 1 \Rightarrow -a^2d + 2b^2d = b \Rightarrow d =
b/(2b^2 - a^2)$ and $c = -a/(2b^2 - a^2)$. (Note: $2b^2 - a^2 \ne 0$ because $\sqrt{2}$ is irrational.)
Consequently, $(a + b\sqrt{2})^{-1} = [-a/(2b^2 - a^2)] + [b/(2b^2 - a^2)]\sqrt{2}$, with $[-a/(2b^2 - a^2)]$,
$[b/(2b^2 - a^2)] \in \mathbf{Q}$. So $\mathbf{Q}[\sqrt{2}]$ is a subfield of $\mathbf{R}$.
Since $s(x) = x^2 - 2$ is irreducible over $\mathbf{Q}$, we know from part (b) of Theorem 17.11 that
$\mathbf{Q}[x]/(x^2 - 2)$ is a field. Define the correspondence
$f\colon \mathbf{Q}[x]/(x^2 - 2) \to \mathbf{Q}[\sqrt{2}]$, by $f([a + bx]) = a + b\sqrt{2}$.
By an argument similar to the one given in Example 17.10 and part (a) of Exercise 24 it follows
that $f$ is an isomorphism.
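The inverse formula derived in Exercise 25(b) can be verified with exact rational arithmetic. In the sketch below an element $a + b\sqrt{2}$ is represented by the pair $(a, b)$; that representation and the function names are illustrative choices, not part of the text.

```python
from fractions import Fraction


def multiply(p: tuple, q: tuple) -> tuple:
    """(a + b*sqrt2)(c + d*sqrt2) = (ac + 2bd) + (ad + bc)*sqrt2."""
    a, b = p
    c, d = q
    return (a * c + 2 * b * d, a * d + b * c)


def inverse(p: tuple) -> tuple:
    """Return (-a/(2b^2 - a^2), b/(2b^2 - a^2)) for a nonzero a + b*sqrt2."""
    a, b = Fraction(p[0]), Fraction(p[1])
    denom = 2 * b * b - a * a
    if denom == 0:  # over Q this happens only for a = b = 0
        raise ZeroDivisionError("zero has no multiplicative inverse")
    return (-a / denom, b / denom)


element = (Fraction(3), Fraction(5))
print(inverse(element), multiply(element, inverse(element)))
```

The same formula also covers the case $b = 0$, where it reduces to $(1/a, 0)$.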
Section 17.3—p. 819
a) 1 2 3 4    b) 1 2 3 4    c) 1 3 4 2
   2 1 4 3       3 4 1 2       4 2 1 3
   4 3 2 1       2 1 4 3       3 1 2 4
   3 4 1 2       4 3 2 1       2 4 3 1
a =a
oa fift+hahht+hoeh=fhai=i
L3:3 4 5 1 2 3 Ly 5 1 2 3 4
23 45 1 4512 3
5 123 4 3 45 1 2
3 45 12 23 45 1
23 45 123 4 5
In standard form the Latin squares L,, 1 <i <4, become
Li: 12 3 4 °5 Ly: 12 3 4 5
23 45 1 3 45 12
3 4 5 12 5 123 4
4 5 12 3 2345 1
5 123 4 45 12 3
Ly 123 4 5 Li: 123 4 °5
45 12 3 5 12 3 4
23 45 1 45 12 3
5 123 4 3 45 1 2
3 4 5 12 23 45 1
7. Introduce a third factor, such as four types of transmission fluid or four types of tires.
Section 17.4—p. 824
Field | Number of Points | Number of Lines | Number of Points on a Line | Number of Lines on a Point
$GF(5)$ | 25 | 30 | 5 | 6
$GF(3^2)$ | 81 | 90 | 9 | 10
$GF(7)$ | 49 | 56 | 7 | 8
$GF(2^4)$ | 256 | 272 | 16 | 17
$GF(31)$ | 961 | 992 | 31 | 32
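Every row of the table follows the pattern $n^2$ points, $n^2 + n$ lines, $n$ points on a line, and $n + 1$ lines on a point, where $n$ is the order of the field. The sketch below tabulates that pattern and, for a prime $p$ only, confirms the line count by listing the lines $y = mx + b$ and $x = c$ over $\mathbf{Z}_p$; the function names are illustrative.

```python
def affine_plane_parameters(n: int) -> tuple:
    """(points, lines, points per line, lines per point) for a field of order n."""
    return (n * n, n * n + n, n, n + 1)


def lines_over_Zp(p: int) -> set:
    """All lines of the affine plane over Z_p (p prime), each as a set of points."""
    lines = set()
    for m in range(p):
        for b in range(p):
            lines.add(frozenset((x, (m * x + b) % p) for x in range(p)))
    for c in range(p):  # vertical lines x = c
        lines.add(frozenset((c, y) for y in range(p)))
    return lines


for order in (5, 9, 7, 16, 31):
    print(order, affine_plane_parameters(order))
```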
3. There are nine points and twelve lines. These lines fall into four parallel classes.
(i) Slope of 0: $y = 0$; $y = 1$; $y = 2$
(ii) Infinite slope: $x = 0$; $x = 1$; $x = 2$
(iii) Slope 1: $y = x$; $y = x + 1$; $y = x + 2$
(iv) Slope 2 (as shown in the figure): (1) $y = 2x$ (2) $y = 2x + 1$ (3) $y = 2x + 2$
[Figure: the nine points with the three lines of slope 2]
The Latin square corresponding to the fourth parallel class is
3 1 2
2 3 1
1 2 3
~a) y=4r41 dD) y=3x4100r2x+3y+3=0
c) y = 10x or 10y = 11x
. a) Vertical line: $x = c$. The line $y = mx + b$ intersects this vertical line at the unique point
$(c, mc + b)$. As $b$ takes on the values of $F$, there are no two column entries (on the line $x = c$)
that are the same.
Horizontal line: $y = c$. The line $y = mx + b$ intersects this horizontal line at the unique
point $(m^{-1}(c - b), c)$. As $b$ takes on the values of $F$, no two row entries (on the line $y = c$) are
the same.
Section 17.5-p. 829
12 3 4 13 5 7 23 67
245 7 3 4 5 6
—_
~]
—
_
- a) No b) No
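. a) $\lambda(v - 1) = r(k - 1) = 2r \Rightarrow \lambda(v - 1)$ is even.
$\lambda v(v - 1) = vr(k - 1) = bk(k - 1) = b(3)(2) \Rightarrow 6 \mid \lambda v(v - 1)$
b) ($\lambda = 1$) $6 \mid \lambda v(v - 1) \Rightarrow 6 \mid v(v - 1) \Rightarrow 3 \mid v(v - 1) \Rightarrow 3 \mid v$ or $3 \mid (v - 1)$
$\lambda(v - 1)$ even $\Rightarrow (v - 1)$ even $\Rightarrow v$ odd
$3 \mid v \Rightarrow v = 3t$, $t$ odd $\Rightarrow v = 3(2s + 1) = 6s + 3$ and $v \equiv 3 \pmod{6}$
$3 \mid (v - 1) \Rightarrow v - 1 = 3t$, $t$ even $\Rightarrow v - 1 = 6x \Rightarrow v = 6x + 1$ and $v \equiv 1 \pmod{6}$
. $v = 9$, $r = 4$ 11. a) $b = 21$ b) $r = 7$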
13. There are $\lambda$ blocks that contain both $x$ and $y$. And since $r$ is the replication number of the
design, it follows that $r - \lambda$ blocks contain $x$, but not $y$. Likewise there are $r - \lambda$ blocks
containing $y$ but not $x$. Consequently, the number of blocks in the design that contain $x$ or $y$ is
$(r - \lambda) + (r - \lambda) + \lambda = 2r - \lambda$.
15. a) 31 b) 8
17. a) $v = b = 31$; $r = k = 6$; $\lambda = 1$ b) $v = b = 57$; $r = k = 8$; $\lambda = 1$
c) $v = b = 73$; $r = k = 9$; $\lambda = 1$
Supplementary
Exercises —p. 832 ~n=9 3. a) 31.) =b) 30sec) 29° de) K = 1000
. For all $a \in \mathbf{Z}_p$, $a^p = a$ [See part (a) of Exercise 13 at the end of Section 16.3.], so $a$ is a root of
$x^p - x$, and $x - a$ is a factor of $x^p - x$. Since $(\mathbf{Z}_p, +, \cdot)$ is a field, the polynomial $x^p - x$ can
have at most $p$ roots. Therefore $x^p - x = \prod_{a \in \mathbf{Z}_p} (x - a)$.
. {1, 2, 4}, {2, 3, 5}, {4, 5, 7} 9. a) 9 b) 91
. b) $A \cdot J_b$ is a $v \times b$ matrix whose $(i, j)$th entry is $r$, since there are $r$ 1's in each row of $A$ and
every entry in $J_b$ is 1. Hence $A \cdot J_b = rJ_{v \times b}$. Likewise, $J_v \cdot A$ is a $v \times b$ matrix whose $(i, j)$th
entry is $k$, because there are $k$ 1's in each column of $A$ and every entry in $J_v$ is 1. Hence
$J_v \cdot A = kJ_{v \times b}$.
c) The $(i, j)$th entry in $A \cdot A^{\mathrm{tr}}$ is obtained from the componentwise multiplication of rows $i$
and $j$ of $A$. If $i = j$, this results in the number of 1's in row $i$, which is $r$. For $i \ne j$, the number
of 1's is the number of times $x_i$ and $x_j$ appear in the same block, which is given by $\lambda$. Hence
$A \cdot A^{\mathrm{tr}} = (r - \lambda)I_v + \lambda J_v$.
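d)
\[
\begin{vmatrix}
r & \lambda & \lambda & \cdots & \lambda \\
\lambda & r & \lambda & \cdots & \lambda \\
\lambda & \lambda & r & \cdots & \lambda \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\lambda & \lambda & \lambda & \cdots & r
\end{vmatrix}
\overset{(1)}{=}
\begin{vmatrix}
r & \lambda - r & \lambda - r & \cdots & \lambda - r \\
\lambda & r - \lambda & 0 & \cdots & 0 \\
\lambda & 0 & r - \lambda & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\lambda & 0 & 0 & \cdots & r - \lambda
\end{vmatrix}
\overset{(2)}{=}
\begin{vmatrix}
r + (v - 1)\lambda & 0 & 0 & \cdots & 0 \\
\lambda & r - \lambda & 0 & \cdots & 0 \\
\lambda & 0 & r - \lambda & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\lambda & 0 & 0 & \cdots & r - \lambda
\end{vmatrix}
\]
\[
= [r + (v - 1)\lambda](r - \lambda)^{v-1} = (r - \lambda)^{v-1}[r + r(k - 1)] = rk(r - \lambda)^{v-1}
\]
Key: (1) Multiply column 1 by −1 and add it to the other v − 1 columns.
(2) Add rows 2 through v to row 1.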
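The closed form $rk(r - \lambda)^{v-1}$ can be tested on a small design. The sketch below assumes the (7, 7, 3, 3, 1)-design generated by shifting {1, 2, 4} modulo 7 and computes the determinant by exact Gaussian elimination over the rationals; the `det` helper is written here for illustration and is not taken from the text.

```python
from fractions import Fraction


def det(matrix):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        pivot = next((row for row in range(col, n) if m[row][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row swap flips the sign
        result *= m[col][col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for c in range(col, n):
                m[row][c] -= factor * m[col][c]
    return result


blocks = [{1, 2, 4}, {2, 3, 5}, {3, 4, 6}, {4, 5, 7},
          {5, 6, 1}, {6, 7, 2}, {7, 1, 3}]
v, r, k, lam = 7, 3, 3, 1

# Incidence matrix A: rows are points, columns are blocks.
A = [[1 if point in blk else 0 for blk in blocks] for point in range(1, v + 1)]
# A times its transpose.
AAT = [[sum(A[i][t] * A[j][t] for t in range(len(blocks))) for j in range(v)]
       for i in range(v)]

print(det(AAT), r * k * (r - lam) ** (v - 1))
```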
Appendix 1
Exponential and Logarithmic Functions
p. A-9
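1. a) $\sqrt{xy^3} = x^{1/2}y^{3/2}$  b) $\sqrt[4]{81x^{-5}y^3} = 3x^{-5/4}y^{3/4} = \dfrac{3y^{3/4}}{x^{5/4}}$
c) $5\sqrt[3]{8x^{9}y^{-5}} = 5(8^{1/3}x^{9/3}y^{-5/3}) = 5(2x^{3}y^{-5/3}) = \dfrac{10x^{3}}{y^{5/3}}$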
3. a) 625  b) 1/343  c) 10
5. a) $\log_2 128 = 7$  b) $\log_{125} 5 = 1/3$  c) $\log_{10} (1/10{,}000) = -4$  d) $\log_2 b = a$
7. a) 3  c) 3
9. a) Proof (By Mathematical Induction):
For $n = 1$ the statement is $\log_b r^1 = 1 \cdot \log_b r$, so the result is true for this first case. Assuming
the result for $n = k$ ($\geq 1$) we have $\log_b r^k = k \log_b r$. Now for the case where $n = k + 1$ we
find that $\log_b r^{k+1} = \log_b(r \cdot r^k) = \log_b r + \log_b r^k$ [by part (1) of Theorem A1.2]
$= \log_b r + k \log_b r$ (by the induction hypothesis) $= (1 + k)\log_b r = (k + 1)\log_b r$. Therefore,
the result follows for all $n \in \mathbf{Z}^+$ by the Principle of Mathematical Induction.
b) For all $n \in \mathbf{Z}^+$, $\log_b r^{-n} = \log_b (1/r^n) = \log_b 1 - \log_b r^n$ [by part (2) of Theorem A1.2] $=
0 - n \log_b r$ [by part (a)] $= (-n)\log_b r$.
11. a) 1.5851  b) 0.4307  c) 1.4650
13. a) 5/3  b) 3/2  c) 4
15. Let $x = a^{\log_b c}$ and $y = c^{\log_b a}$. Then
$x = a^{\log_b c} \Rightarrow \log_b x = \log_b [a^{\log_b c}] = (\log_b c)(\log_b a)$, and
$y = c^{\log_b a} \Rightarrow \log_b y = \log_b [c^{\log_b a}] = (\log_b a)(\log_b c)$.
Consequently, we find that $\log_b x = \log_b y$, from which it follows that x = y.
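A quick floating-point check of the identity just derived, $a^{\log_b c} = c^{\log_b a}$, is sketched below. The sample triples (a, b, c) are arbitrary positive values chosen for illustration, and the comparison uses a tolerance because the arithmetic is approximate.

```python
import math


def power_log(a, b, c):
    """Return a raised to the power log_b(c)."""
    return a ** math.log(c, b)


# Arbitrary sample values (a, b, c) with a, c > 0 and base b > 0, b != 1.
samples = [(2.0, 10.0, 7.0), (5.0, 3.0, 0.5), (1.7, 2.0, 9.3)]
for a, b, c in samples:
    left, right = power_log(a, b, c), power_log(c, b, a)
    print(a, b, c, math.isclose(left, right, rel_tol=1e-9))
```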
Appendix 2
Matrices, Matrix Operations, and Determinants
p. A-21
3
1. a) A+B=| | biatarc=[3 5 1
wn
0 6 4
|
fl
c) B+c=| 4
Ww
d) a+e+o=|3 ° 1
An
|a
mH
2 24 =| —242 0 s6 | 6 24438 =| | ]
Monee
WU
_—
—
ON
“a
p2c+se=|525 20
95 —I15
35| hy sc =|
25
0
20
5 10
—15
i) 2B — 4c =| —182 —-2
-12
-6
20
i) A+2B—3c =| —-144 -8
0
20
0
k) 2138) =| 5§ 123 244 | ! 2-38 =| 66 12
6
24
6
. a) [12], or 12 »| 5 | c) [3 “|
—5 —7 8 a b c a b c
d) 29 21 2 e) d e f f) 3g 3h 3i
—23 -~—35 6 3g 3h 3i d e f
. a) (-1/5) J | b) I I c) The inverse does not exist. d) | ! |
3 1 () 2 $7
2 - 1 1 -2 —4
ay t= ar] 5 | b B= a/9)| 5 im o ap=| >]
1 2 —3 -lacl — 2 —3
d) casy' = asioy| § a 0 BA! =aslo| § a
Es oll] Ea]
BIB 2Ppomls sJlat-bs
11. $\det(2A) = 2^2(31) = 124$, $\det(5A) = 5^2(31) = 775$
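The answer above uses the rule $\det(cA) = c^n \det(A)$ for an n × n matrix, here with n = 2. A small sketch follows; the particular matrix A is a made-up example with determinant 31, not the matrix from the exercise.

```python
def det2(m):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def scale(c, m):
    """Multiply every entry of the matrix m by the scalar c."""
    return [[c * x for x in row] for row in m]


# Hypothetical 2 x 2 matrix chosen so that det(A) = 5*5 - 2*(-3) = 31.
A = [[5, 2], [-3, 5]]

print(det2(A), det2(scale(2, A)), det2(scale(5, A)))
```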
13. a) 45  b) −40  c) 14
4 - 4 = 2(-1)**! + 3(-1)3*? so
15. af) |, 3 4 -l -l 0 -1
= 2(−2 − (−1)) − 3(−1) = 2(−1) + 3 = 1.
(ii) 5 (iii) 25
b) (i) 51 (ii) 306 ~—(iii).:«4S10
Appendix 3
Countable and Uncountable Sets
p. A-32
1. a) True  b) False  c) True  d) True
e) False: Let $A = \mathbf{Z} \cup (0, 1]$ and $B = \mathbf{Z} \cup (1, 2]$. Then A, B are both uncountable, but
$A \cap B = \mathbf{Z}$ is countable.
f) True
g) False: Let $A = \mathbf{Z}^+ \cup (0, 1]$ and $B = (0, 1]$. Then A, B are both uncountable, but
$A - B = \{2, 3, 4, \ldots\}$ is countable.
3. If B were countable, then by Theorem A3.3 it would follow that A is countable. This leads us to
a contradiction since we are given that A is uncountable.
5. Since S, T are countably infinite, we know from Theorem A3.2 that we can write
$S = \{s_1, s_2, s_3, \ldots\}$ and $T = \{t_1, t_2, t_3, \ldots\}$, two (infinite) sequences of distinct terms. Define
the function
$f: S \times T \to \mathbf{Z}^+$
by $f(s_i, t_j) = 2^i 3^j$, for all $i, j \in \mathbf{Z}^+$. If $i, j, k, \ell \in \mathbf{Z}^+$ with $f(s_i, t_j) = f(s_k, t_\ell)$, then
$f(s_i, t_j) = f(s_k, t_\ell) \Rightarrow 2^i 3^j = 2^k 3^\ell \Rightarrow i = k$, $j = \ell$ (By the Fundamental Theorem of
Arithmetic) $\Rightarrow s_i = s_k$ and $t_j = t_\ell \Rightarrow (s_i, t_j) = (s_k, t_\ell)$. Therefore, f is a one-to-one function
and $S \times T \sim f(S \times T) \subseteq \mathbf{Z}^+$. So from Theorem A3.3 we know that $S \times T$ is countable.
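The one-to-one encoding used in this argument can be illustrated on a finite grid of index pairs. In the sketch below, `f` works directly on the indices (i, j) rather than on the elements s_i, t_j, and `decode` recovers the pair by dividing out factors of 2 and 3; both names and the grid size 30 are choices made for this illustration.

```python
def f(i, j):
    """Encode the index pair (i, j) as the positive integer 2^i * 3^j."""
    return 2 ** i * 3 ** j


def decode(n):
    """Recover (i, j) from n = 2^i * 3^j by repeated division."""
    i = j = 0
    while n % 2 == 0:
        n //= 2
        i += 1
    while n % 3 == 0:
        n //= 3
        j += 1
    return i, j


pairs = [(i, j) for i in range(1, 31) for j in range(1, 31)]
images = {f(i, j) for i, j in pairs}

# Distinct pairs give distinct codes, and each code decodes to its pair.
print(len(images) == len(pairs))
print(all(decode(f(i, j)) == (i, j) for i, j in pairs))
```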
7. The function $f: (\mathbf{Z} - \{0\}) \times \mathbf{Z} \times \mathbf{Z} \to \mathbf{Q}$ given by $f(a, b, c) = 2^a 3^b 5^c$ is one-to-one (Verify
this!). So by Theorems A3.3 and A3.8 $(\mathbf{Z} - \{0\}) \times \mathbf{Z} \times \mathbf{Z}$ is countable. Now for all
$(a, b, c) \in (\mathbf{Z} - \{0\}) \times \mathbf{Z} \times \mathbf{Z}$ there are at most two (distinct) real solutions for the quadratic
equation $ax^2 + bx + c = 0$. From Theorem A3.9 it then follows that the set of all real solutions
of the quadratic equations $ax^2 + bx + c = 0$, where $a, b, c \in \mathbf{Z}$ and $a \neq 0$, is countable.
Index
Ā, 138 Adjacency list, 379 encoding function, 763, 764, 767, 769,
|A|, 124 Adjacency list representation, 378, 379 771, 773
A^0, A^n, A^+, A^*, 315 Adjacency matrix (for a graph), 352, 539, equivalent codes, 778
A ~ B, A-23 600 error, 762
a ≡ b (mod n), 686 Adjacency of a pair of vertices, 352 error correction, 767-769
a-z cut, 645 Adjacent from, 349, 514 error detection, 767-769
a is congruent to b modulo n, 686 Adjacent mark ordering algorithm, 453, error pattern, 762, 763, 771, 779
Abel, Niels Henrik, 705, 745, 794, 830 506 five-times repetition code, 765, 769
Abelian group, 161, 745, 746, 799 Adjacent to, 349, 514 generator matrix, 769, 771, 772, 774,
Absolute value, 219, 224 Adjacent vertices, 349 777
Absorption Laws, Adleman, Leonard, 759 Gilbert bound, 773
for a Boolean algebra, 735 Affine cipher, 691, 692, 759 Golay, Marcel, 761, 795, 796
for Boolean functions, 713 Affine plane, 820-822, 826-828, 831 group code, 773, 774, 776, 777
for Boolean variables, 713 Aggregate, 123 Hamming, Richard, 761, 766, 795, 796
for logic, 59 Aho, Alfred V., 378, 506, 507, 574, 575, Hamming bound, 773
for set theory, 139 623, 624, 642, 667, 668, 708 Hamming code, 778
Abstract algebra, 394, 624, 742 Ahuja, Ravendra K., 562, 575, 637, 643, Hamming matrix, 778
Access function, 254 654, 668 Hamming metric, 767
Achilles, 119 Albert, A. Adrian, 831 independent events, 762
Ackermann, Wilhelm, 259 Aleph, 303 (m + 1, m) parity-check code, 764,
Ackermann’s function, 259 ℵ₀ (aleph null), 303, A-30, A-31 765
Acronym, 155 Algebra, 123, 242 majority rule, 765
Aczel, Amir D., 706, 708 Algebra of logic, 742 message, 763, 769, 777, 778
Addition, 136, 137 Algebra of propositions, 55, 57, 58; see minimum distance between code
Addition of binary numbers, 720 also Laws of Logic words, 767-769, 771, 773, 774
Addition of equivalence classes Algebra of switching circuits, 742 minimum weight of nonzero code
of integers (in Z_n), 687 Algebra of switching functions, 711 words, 774
of polynomials, 809 Algebraic coding theory, 18, 761-779, mixed strategy, 768
Addition of matrices, A-12 795, 796 multiple errors, 763
Addition of polynomials, 800 binary representations, 778, 779 nearest neighbor, 771
Additions, 636, 637 binary symmetric channel, 762, 763 (n,m) block code, 764
Additive identity block code, 764 noise, 761
for matrices, A-13 code word, 763, 769, 771, 772, 774, parity-check code, 764, 765
for real numbers, 103 776-778 parity-check equations, 770, 777
for a ring, 674 coding schemes, 763 parity-check matrix, 772, 774,
Additive inverse coset leader, 775-777 776-779
for matrices, A-13 d(x, y), 766 probability, 761-765
for integers, 278 decoding, 763 rate of a code, 764, 778
for real numbers, 103 decoding algorithm, 772 received word, 762, 763, 777
for a ring element, 674, 679, 680, 701 decoding by coset leaders, 776 retransmission, 765, 769
Additive Rule, 162, 168, 172 decoding function, 764, 767 Shannon, Claude Elwood, 761, 795,
Address decoding scheme, 769 797
class A address, 12 decoding table, 774, 775 sphere (S(x, k)), 767
class B address, 12 decoding table with syndromes, 776 S(x, k), 767
class C address, 12 distance, 766 syndrome, 771, 775-777, 779
in computer memory, 5, 694 distance function, 766, 767 systematic form, 778
in a universal address system, 589 dual code, 773 transmission error, 762, 767
internet address, 12 efficiency of a coding scheme, 764 triangle inequality, 767
local address, 12 encoding, 763 triple repetition code, 765, 768, 769
weight, 766 Alternating sequence, 650 Associated homogeneous relation,
wt (x), 766 Alternating triple, 135 471-473, 479, 480
Algebraic expression, 590 Alternative form of the Principle of Associated minor, A-20
Algebraic formulae, 623 Mathematical Induction, 206-208, Associated undirected graph, 350, 353,
Algebraic structures, 745, 761 217, 238, 298, 458, 503, 582, 583 517, 645, 650
Algebraic substitution, 449 American Journal of Mathematics, 411 Associative binary operation, 268
Algorism, 242 American National Standards Institute, Associative closed binary operation, 311
Algorithm, 41, 42, 233, 242-244, 289, 125 Associative law
290, 294, 295, 297, 299-301, 349, 378, Analysis, 444 of addition for integers, 113
442, 599, 605, 613, 615, 619-621, Analysis of algorithms, 3, 247, 259, 292, of addition for real numbers, 97
624, 632, 633, 636-642, 649, 653 294-300, 304, 305, 453, 473, 503, of multiplication for integers, 221
Algorithms A-1, A-6 of multiplication for matrices, A-16
adjacent mark ordering, 458, 506 Analytic Theory of Probability, 150, 188 of multiplication for polynomials, 801
articulation points, 619, 620 Analytical engine, 242 Associative laws
biconnected components, 619, 620 Analytische Zahlentheorie, 304 for a Boolean algebra, 736
binary search, 501-503 Ancestor, 588, 616-619 for Boolean functions, 713
breadth-first search, 598, 599 And, 48, 50 for Boolean variables, 713
bubble sort, 450 AND gate, 149, 719, 720 for logic, 58
decoding, 772 Annals of Mathematics, 706 for a ring, 673, 746
depth-first search, 597, 598, 617 ANSI FORTRAN, 125 for set theory, 139
Dijkstra’s shortest-path, 633, 634, 667, Antichain, 381 Associative property
668 Antisymmetric property (of a relation), for composition of relations, 345
divide-and-conquer, 496-503 340, 341, 347, 348, 353, 357, 358, for function composition, 281, 282,
Edmonds-Karp algorithm, 653-657, 376, 377 345, 750
663 Anton, Howard, A-21 in a group, 745, 794
Euclidean algorithm for integers, 232, AP(F), 822, 824, 826-828 Associativity for Cartesian products, 248
233 Apianus, Petrus, 188 Atkins, Derek, 795
Euclidean algorithm for polynomials, Appel, Kenneth, 565, 573, 575 Atkins, Joel E., 623, 624
808 Application specific integrated circuit, Atom of a Boolean algebra, 738-740, 743
exponentiation, 297-299 149 AT&T Bell Laboratories, 188
Fibonacci numbers, 477, 478 Applied Boolean algebra, 742 Augarten, Stan, 243, 244
Ford-Fulkerson algorithm, 654-657, Approximately equal (=), 7 Auluck, F. C., 463, 507
663 Approximation theory, 304 Automata theory, 333
generating permutations, 453, 506 Arbitrary, 110 Automated reasoning, 119
greatest common divisor, 232, 233 Arc, 321, 329, 349, 514 Auxiliary variables, 461
greatest common divisor (recursive), Argue by the converse, 74, 82, 109, 547 Average-case complexity, 295, 296
455 Argue by the inverse, 75, 82, 110 Axiomatic approach to probability, 188
Huffman tree, 613 Argument, 47, 53, 67, 72, 74, 75, 107, Axioms of probability, 159, 161
Kruskal’s algorithm, 639-641 108, 112
linear search, 296, 302 Aristotle, 117, 118, 238 b_n, the n-th Catalan number, 38, 490
maximum value, 301 Arithmetic expression, 460 Baase, Sara, 305, 624, 625, 641, 642,
merge sort, 496, 608 Arithmetic of remainders, 234 667, 668
merging two sorted lists, 607 Arithmetica, 243 Babbage, Charles, 242, 243
minimization process for a finite state Arithmetica Integra, 42 Bachmann, Paul Gustav Heinrich, 304
machine, 372-373 Arithmeticorum Libri Duo, 244 Back edge (of a tree), 616-619, 621
nonisomorphic trees on n labeled Arrangement, 6-10, 15-18, 24, 26-28, Backtrack(ing), 331, 593, 596-598, 600,
vertices, 586, 587 34, 36-39, 41, 149, 155, 160, 266, 616, 653, 656
polynomial evaluation, 301 310, 395, 402, 406, 411, 436, 437, 439, Backward edge, 650, 651, 654
Prim’s algorithm, 641-643, 668 462, 463, 524, 525, 559; see also Balanced complete binary tree, 605, 606
Prüfer code for a labeled tree, 586, 587 Permutation Balanced incomplete block design, 825,
searching an array, 295, 296 Arrangements with forbidden positions, 826
topological sorting algorithm, 360, 361 406-410 Balanced (rooted) tree, 601, 602
universal address system, 589 Arrangements (with repetition), 7, 26, 27 Ball, M. O., 562, 575, 576
Algorithmic manner, 631 Array, 91, 450, 501-503 Ballot Problem, 45
Alfa, 226 Ars Conjectandi, 41 Bare roundhouse, 192
Al-jabr, 242 Articulation point, 615-621, 624 Barnette, David, 575
Alkane, 584 Articulation point algorithm, 619, 620 Barnier, William J., 333, 334
Al-Khowârizmî, Abu Ja’far Mohammed Ascending order, 450, 606 Barr, Thomas H., 693, 708, 795, 796
ibn Mûsâ, 242 Ascent (in a permutation), 220 Barwise, Jon, 119, 120
Allowable choices, 87 Aschbacher, Michael, 795 Base (for a number system), 225
α, 457, 458, 469 ASIC, 149 Base (for a recursive definition), 211-213
Alpha testing, 185 Assembly language, 226 Base (for exponentiation), A-1
Alphabet, 18, 309-311, 313, 315, 316, Assignment problem, 659, 668 Base 2, 225, 227, 608
337, 338, 609, 610 Assmus, E. F., Jr., 796 Base 8, 225
Alphabetical ordering, 589 Associated directed graph, 350 Base 10, 225, 226
Base 16, 226, 227 Binary string, 128, 129, 188 DeMorgan’s laws, 713
Base-changing formula for logarithms, Binary symmetric channel, 762, 763; see disjunctive normal form, 715
A-7 also Algebraic coding theory distributive laws, 713
Base step, 316 Binary tree, 488, 595, 600 d.n.f., 715-718, 721-724
Basic connectives, 47-53, 56 Binet, Jacques Philippe Marie, 457 dominance laws, 713
and (conjunction), 48, 50 Binet form, 457 don’t care conditions, 731-733
but, 50 Binomial coefficient, 22, 23, 42, 133, 217 equality, 712
exclusive or, 48 Binomial distribution, 179 exclusive or, 719, 720
if...then (implication), 48 Binomial expansion, 30; see also F_n, 719
if and only if (bicondition), 48 Binomial theorem fundamental conjunction, 715-718,
iff, 48 Binomial random variable, 179, 180, 721, 732, 738
inclusive or (disjunction), 48 182, 183, 430 fundamental disjunction, 717, 718
nand, 56 Binomial theorem, 21-23, 42, 106, 130, idempotent laws, 713
negation (not), 48 180, 188, 390, 421, 422, 436, 443 identity laws, 713
nor, 56 Binomial theorem (generalized), 422, incompletely specified, 731, 732
or (disjunction), 48 443 inverse laws, 713
Basis, 447 Bipartite graph, 541, 542, 558, 659, 660, Karnaugh map, 722-727
Basis step, 195-197, 199, 201, 202, 662-665 law of the double complement, 713
204-208, 212-214, 218, 317 Birkhoff, Garrett, 377 literal, 715, 716, 722-726
Bayes, Thomas, 188, 189 Birkhoff-von Neumann theorem, 670 maxterm, 717, 718, 727
Bayes’ Theorem, 170, 173, 188 Bit(s), 5, 225, 324, 610, 720, 742 minimal product of sums
Bayes’ Theorem (Extended Version), 173 Blank (space), 311 representation, 727
Beckenbach, Edwin F., 796 Bletchley Park, 333 minimal sum of products
Bell, Eric Temple, 508 Blocher, Heidi, 708 representation, 721, 722, 724, 725,
Bell numbers, 508 Block 729-733
Bellman, R., 562, 575 in a design, 825-827 minterm, 716, 717, 732, 738
Bellmore, M., 562, 574, 575 of a partition, 366 product, 712
Berge, Claude, 573, 668 Block code, 764; see also Algebraic product of maxterms, 717, 718
Berger, Thomas R., 707, 708, 831, 832 coding theory Quine-McCluskey method, 727, 742
Bernays, Paul, 119 Block designs, 825-829, 832 row number, 716
Bernoulli, Jakob, 41, 42 Bonaccio, 442 self-dual, 744
Bernoulli, Johann, 302 Bond, James, 150 sum, 712
Bernoulli trial, 161, 178, 179, 182, 430 Bondy, J. A., 573, 575, 668 sum of minterms, 717
Bertrand, Joseph Louis François, 45 Bonferroni's Inequality, 191 symmetric, 744
Best-case complexity, 295 The Book of Creation (Sefer Yetzirah), 41 Boolean multiplication, 711
B (blank, space), 311 Boole, George, 118, 119, 186, 188, 377, Boolean ring, 709
β [= (1 − √5)/2], 457 711, 742 Boolean sum, 737
β(G), 564, 666 Boolean addition, 346, 711 Boolean variable, 712, 713, 724, 729
Biconditional, 48, 51, 52, 56, 104, 105 Boolean algebra, 711, 714, 733-743, Booth, Taylor L., 742, 743
Biconnected component, 615, 619-621, 799, 830 Borchardt, Carl Wilhelm, 622, 623
624 atom, 738-740, 743 Borůvka, Otakar, 667
Biconnected component algorithm, 619, definition, 733 Bose, Raj Chandra, 819, 831
620 dual, 735 Bound, 292
Biconnected graph, 615 Hasse diagram, 736-739 Bound variable, 88, 98
Big-Oh notation, 290, 304 isomorphism, 737, 739, 740 Boundary condition(s), 448
Big-Omega notation, 293, 505 linear combination of atoms, 738 Boundary of a region, 546-549
Big-Theta notation, 294, 505 partial order, 737, 738 Bounded above, 605, 608
Biggs, Norman L., 41, 42, 574, 575 principle of duality, 735 Boyer, Carl Benjamin, 189
Bijective function, 279, 283 properties, 735, 736 Brahmagupta, 707
Binary compare, 727 representation theorem, 738, 739, 743 Braille system, 24
Binary digits (bits), 5, 720 Boolean algebra of sets, 740, 743 Branch node, 588
Binary heap, 637 Boolean complement, 711 Branches (of a tree), 154, 249, 331, 488,
Binary label, 532, 716-718, 742 Boolean expression, 720 614
Binary number system, 225, 226 Boolean function, 711-727, 729-733, Bravo, 226
Binary numbers, 323, 770 Boolean function, 711-727, 729-733, Breadth-first search, 598-600, 624, 653
Binary operation, 136, 193, 211, 267-269, absorption laws, 713 Breadth-first search algorithm, 598, 599
460, 589, 591, 673, 674, 686, 745 associative laws, 713 Breadth-first spanning tree, 599, 656
associative, 268 binary label, 716-718 Bridge, 550
commutative, 268 Boolean function for a prescribed Bridges of Kénigsberg, 513, 518,
Binary relation, 250, 337; see also table, 714, 715 $33-535, 573
Relation e.n.f.. 717, 718 Brookshear, J. Glenn, 333, 334
Binary representation, 229, 693, 778, 779 commutative laws, 713 Brualdi, Richard A., 506, 507
Binary rooted tree, 589, 590, 594, 595 complement, 712 Bubble sort, 450-452, 455, 605, 606, 609
Binary search algorithm, 501-503 conjunctive normal form, 717 Buckley, Fred, 573, 575
Binary sequence, 461, 462, 610, 611 definition, 712 Burnside’s Theorem, 783-785, 796
Busacker, Robert G., 668 Chain (poset), 381 Cocycle, 564
Bussey, W. H., 244, 831 Chain (transport network), 650 Code, 129, 610
But, 50 Chain of subgroups, 830 Code word, 763, 769, 771, 772, 774,
Butane, 584 Change in state, 319 776-778; see also Algebraic coding
Bye, 602 Change of base, 225-230 theory
Byron, Augusta Ada, 242, 243 Char(R), 812 Coding schemes, 128, 610, 763; see also
Byron, Lord, 242 Characteristic equation, 456 Algebraic coding theory
Byte, 5, 225 Characteristic function, 307 Coding theory, 3, 41, 161, 324, 574, 575,
Characteristic of a ring, 812 581, 609, 745, 761, 831; see also
c (continuum), A-30, A-31 Characteristic roots, 456, 468 Algebraic coding theory
c(e), 644 Characteristic sequence, 625 Coding Theory—prefix codes, 575-579
c(P, P̄), 646 Charlie, 226 Codomain, 175, 253, 279, 281, 287, 323,
C, C*, 134 Chartrand, Gary, 573-575 702
C(n, r), 15, 41, 436 Chebyshev, Pafnuty Lvovich, 188 Cohen, Daniel I. A., 42, 304, 305
C++, 4, 13, 369 Chebyshev’s Inequality, 183, 184, 188 Collection, 123, 135, A-29
C++ compiler, 253, 369 Chemical isomers, 622, 796; see also Collinear, 820, 822, 827
Caesar, Gaius Julius, 690 Isomers Collision, 694, 708
Caesar cipher, 690, 696 Chemistry, 574, 584 Collison, Mary Joan, 244
Calculational techniques for generating Chess, 404 Color-critical graph, 573, 622
functions, 418-431 Chessboard, 121, 208, 209, 404-409, Coloring, 551
Calculus, 99, A-3, A-6 458, 464, 470, 510 Column major implementation, 259
The Calculus of Inference, Necessary x(G), 565, 621 Column matrix, A-11
and Probable, 118 Child, 588, 590, 594, 598, 617-620 Column vector, A-11
Call, Gregory S., 304, 305 Children, 589, 594, 595, 607, 613, Comb graph, 577
Cambridge University, 705 617-621 Combinational circuit, 309
Campbell, Douglas M., 507 Chinese Remainder Theorem, 702-704, Combinations, 14-17, 21, 26, 41, 42,
Cancel, 221 707, 708 411, 436, 453, 506
Cancellation law of multiplication, 678, Choice and Chance, 411 Combinations with repetition, 26-29, 41
681 Chromatic number, 413, 565, 615, 621 Combinatorial analysis, 796
Cancellation laws Chromatic polynomial, 413, 564-571, Combinatorial approach, 132
for a Boolean algebra, 736 574 Combinatorial argument, 385
for a group, 747 Chu Shi-kie, 188 Combinatorial designs, 707, 799, 815,
of addition (in a ring), 680 Chvatal, V., 573 820-832
Cantor, Georg, 135, 186-188, 303, 304, Cipher machine, 333 affine plane, 820-822, 826-828, 831
A-28 Cipher shift, 690, 759 balanced incomplete block design,
Cantor’s diagonal method, 303, A-28 Ciphertext, 690-692, 760 825, 826
Capacity, 644 Circuit, 516, 528, 533, 534, 551 block designs, 825-829, 832
Capacity for a vertex, 657 Circular arrangements, 10, 395, 784 finite geometry, 799, 820, 822, 825,
Capacity of a cut, 646, 665 Circular disks, 472, 473 830, 831
Capacity of an edge, 631, 644, 645, 650, Circular tables, 266 Latin squares, 799, 815-820, 822-824,
654, 657, 661, 663 Clairaut, Alexis, 303 831
Carbon atom, 583, 584, 792 Clark, Dean S., 305 projective plane, 827, 828
Cardinal number, A-31 Class, 123, 780, 782 (v, b, r, k, A)-design, 825, 826, 831
Cardinality (of a set), 124, 186, A-23, Class A address, 12 Combinatorial identity, 30, 131, 188, 288
A-27 Class B address, 12 Combinatorial mathematics, 385, 405
Carroll, Lewis, 119 Class C address, 12 Combinatorial proof, 10, 33, 47, 128,
Carry, 323, 324, 720, 721 Class representative, 687 259, 264, 388. 390
Cartesian product, 152-154, 248, 249, Classification schemes, 667 Combinatorics, 123, 575, 761
251 Clauses, 86 Common divisor, 231
Case-by-case verification, 105 Clique, 578 Common multiple, 236
Castle, 404 Clique number, 578 Common ratio, 447
Catalan, Eugéne Charles, 38, 490, 494 Closed, 136-138 Commutative binary operation, 268, 270,
Catalan numbers, 36-39, 361, 490-493, Closed binary operation, 136, 267, 268, 311
506, 507, 586, 695, 696 270, 278, 311-313, 673, 674, 686, Commutative group, 745
Caterpillar, 627, 628 697, 705, 711, 733, 745, 746, 800, 801 Commutative k-ary operation, 306
Cauchy, Augustin-Louis, 795, 796 Closed interval, 134 Commutative law of addition for
Cayley, Arthur, 411, 565, 581, 622, 623, Closed path, 351 integers, 113
794, 795, A-11 Closed switch, 64, 551, 553 Commutative law of addition for real
Ceiling function, 254, 496, 602, 623 Closed under a binary operation, 136, numbers, 97
Cell (memory), 5 193, 248, 356 Commutative law of addition for a ring,
Cell (of a partition), 366, 367, 369, Closed walk, 515, 516, 546, 549 673
372-375 Closure for a group, 745, 774, 783 Commutative law of matrix addition,
Center of a group, 751 c.n.f., 717, 718 A-12
Center of a ring, 709 Coalescing of vertices, 567, 569 Commutative law of multiplication for
Central Limit Theorem, 188 Cobweb Theorem, 506 real numbers, 97
Commutative laws Computer implementation, 667, 727, 742 Converse of a quantified implication,
for a Boolean algebra, 734 Computer network, 638 92-94
for Boolean functions, 713 Computer program, 260, 309, 349, 350, Convex polygon, 494
for Boolean variables, 713 597 Convolution (of sequences), 430, 431,
for logic, 58 Computer programming, 51, 574 440, 488
for set theory, 139 Computer recognition of relation Cooke, K. L., 562, 575
Commutative ring, 675, 700, 801 properties, 348 Corleone, Don Vito, 186, 692
Commutative ring with unity, 677, 678, Computer science, 32, 41, 51, 91, 119, Corleone, Michael, 692
681, 687, 743, 802, 810 225, 244, 247, 250, 252, 253, 259, Cormen, Thomas H., 504, 507, 624, 625,
Comparison of coefficients, 426 323, 324, 350, 377, 378, 460, 490, 638, 643, 654, 667, 668
Comparisons, 450, 452, 473, 474, 500, 574, 575, 589, 673, A-1, A-6 Corners of a Karnaugh map, 725, 726
502, 503, 605-608, 636, 637, 641 Computer security, 222 Corollary, 106
Compiler, 253, 290, 302, 605 Computer simulation, 689 Correspondence, 21, 26, 27, 30, 37, 39,
Complement (logic gate), 719 Computer’s main memory, 5 131, 205, 279
Complement in a Boolean algebra, 739 Concatenation of languages, 313-315 Coset, 757, 774-776, 795
Complement in a cut, 646 Concatenation of strings, 311, 312 Coset leader, 775-777; see also
Complement of a Boolean function, 712 Conclusion, 48, 51, 53, 67, 70, 107, 109, Algebraic coding theory
Complement of a graph, 523, 543 111, 112 Countable set, 164, 303, A-24-A-32
Complement of a set, 138, 287 Concurrent processing, 350 Countably infinite sample space, 177,
Complement of a subgraph in a graph, Condition, 166 183, 428
586 Conditional probability, 166-173 Countably infinite set, 164, 428, A-25,
Complementary (v, b, r, k, A)-design, Congruence, 377 A-30
833 Congruence modulo n, 689, 690 Counterexample, 83, 84, 89, 91, 94, 114,
Complete binary tree, 589, 595, 596, 600, Congruence modulo p, 830 115
605, 606, 610, 611, 613 Congruence modulo s(x), 808, 810, 830 Countess of Lovelace, 242, 243
Complete binary tree for a set of weights, Congruence of triangles, 55 Counties on a map of England, 565
612 Conjugate of a complex number, 466 Counting, 3, 10
Complete bipartite graph, 541 Conjunction, 48, 53, 57, 70, 75 Counting formulas, 148
Complete directed graph, 559 Conjunction (logic gate), 719 Coupled switches, 65
Complete graph (K_n), 352, 354, 480, Conjunctive normal form (c.n.f.), 717, Covalency, 825
523, 531, 558, 569 742 Covering of a graph, 577
Complete inventory, 786, 789 Connected components, 352, 517 Covering number (of a graph), 577
Complete m-ary tree, 600-602 Connected graph, 351, 488, 517 Cross product, 152, 154, 248, 250, 270,
Complete matching, 660-664 Connectives, see Basic connectives 314
Complete ternary tree, 603 Conservation condition, 645, 651 Cryptanalysis, 333
Complex conjugates, 465 Conservation of flow, 649 Cryptography, 244
Complex numbers, 134, 356, 465 Constant (of a polynomial), 799 Cryptology, 693, 708, 745
Complex roots, 464-467 Constant coefficients, 448 Cryptosystem, 690, 693
Complexity function, 295 Constant Boolean function, 713 Cube, 547, 548, 791
Component flag, 641 Constant function, 261 Cubic equation, 794
Component statement, 49 Constant order, 293 Cubic order, 293
Components of a graph, 352, 353, 517, Constant polynomial, 800 Cubic time complexity, 293
534, 546, 549, 567, 581, 585, 615, Constant term, 799 Cut (in a transport network), 645-648,
640, 646 Constant time complexity, 293 652, 661, 662
Composite function, 280, 281 Constanzia, 186 Cut-set, 549-551, 553, 624, 645
Composite integer, 222, 230 Construction of Cycle detection, 641
Composite primary key, 272 finite fields, 799 Cycle in a graph, 351, 488, 516, 527,
Composite relation, 344 a Huffman tree, 613 532, 551-553, 556, 558, 581, 639-641
Composition of functions, 280, 282, 344, Latin squares, 817, 818 Cycle index, 787, 789
A-9 Constructive proof, 223, 660, 665 Cycle structure representation, 786, 787,
Composition of relations, 344 Contacts, 551, 552 789
Compositions of integers, 30-32, 130, Contiguous, 462, 495 Cyclic group, 753-756, 809, 812
131, 205, 423-426, 448, 460 Continuous random variable, 175, 183 Czekanowski, Jan, 667
Compound statement, 48, 49, 52, 53, 61, Continuous sample space, 164
63, 71, 80 Continuum, A-30 d,, 402, 403, 410
Computational complexity, 289-293, Contradiction, 53, 58, 76, 77, 80, 115 d(a, b), 632
503, 575 Contrapositive, 62, 63, 92-94, 99, 115, d(x, y), 766; see also Algebraic coding
Computer, 290, 309, 377, 605, 623, 631, 116, 362 theory
694 Contrapositive method of proof, 76, 114, Dantzig, G. B., 668, 669
Computer addition of binary numbers, 115 Data structures, 129, 247, 348, 349, 378,
720 Control circuits, 309 487, 490, 581, 592, 598, 605, 623,
Computer algebra system, 477, 485 Convergence, 419, 429 637, 641, 694
Computer algorithm, 242, 243, 574 Converse of a relation, 282 Databases, 8
Computer architecture, 531 Converse of an implication, 62, 63, 82, Datagram, 13
Computer hardware, 326 99 Date, C. J., 305
Dauben, Joseph Warren, 304, 305 Descent (in a permutation), 220 Disjunction (logic gate), 719
David, Florence Nightingale, 189 Design of experiments, 815, 825, 831 Disjunctive normal form (d.n.f.), 715,
De Arte Combinatoria, 118 Determinant, 411, 466, 467, A-17—A-21 742
DeBruijn, Nicolaas Govert, 796 Dfi(v), 616, 619-621 Dispersion, 180
Decimal (base 10) representation, 459 Diagonal, 781 Distance (in a graph), 518, 626, 631
Decision structure, 51 Dick, Auguste, 707, 708 Distance function, 766, 767; see also
Decision tree, 602, 603 Dickson, Leonard Eugene, 243, 244 Algebraic coding theory
Declarative sentence, 47, 86 Dictionary order, 589 Distinct real roots (for a recurrence
Decoding, 763; see also Algebraic Dierckman, Jeffrey S., 623, 624 relation), 456-464
coding theory Difference equations, 447; see also Distinguishing string, 374
Decoding algorithm, 772; see also Recurrence relations Distributions, 29, 150, 263, 264, 304,
Algebraic coding theory Differential equations, 447 370, 403, 416, 444, 493
Decoding function, 767; see also Digital computer, 309, 320, 581, 719 Distributive Law
Algebraic coding theory Digital devices, 329, 332 of matrix multiplication over matrix
Decoding table, 774, 775; see also Digraph, 349, 352, 514; see also Directed addition, A-21
Algebraic coding theory graph of multiplication over addition for
Decoding table with syndromes, 776; see Dijkstra, Edsger Wybe, 632, 667, 669 integers, 221
also Algebraic coding theory Dijkstra’s Shortest-Path Algorithm, of multiplication over addition for real
Decoding with coset leaders, 776, see 631-638 numbers, 57
also Algebraic coding theory Dinitz, Jetfrey H., 831 of scalar multiplication over matrix
Decomposition (of a permutation), 781 Diophantine equation, 235, 243 addition, A-13
Decomposition theorem for chromatic Diophantus (of Alexandria), 235, 243 Distributive Laws
polynomials, 568 Direct argument, 114 for a Boolean algebra, 734
Decryption, 690-693 Direct product of cyclic groups of prime for Boolean functions, 713
Decryption function, 759 power order, 795 for Boolean variables, 713
Dedekind, Richard, 243, 303, 377, 706, Direct product of groups, 751 for logic, 58
795 Direct proof, 114, 115 for a ring, 799
Dedekind domain, 706 Directed arrow, 514 for set theory, 139
Deductive reasoning, 117 Directed cycle, 351, 358, 516 Divide-and-conquer algorithms,
Deficiency of a graph, 664 Directed edge, 321, 349, 351, 514, 646, 496-503, 507, 606
Deficiency of a set of vertices, 664 650 Dividend, 223
Definition, 52, 87, 98, 103-105, 113 Directed Euler circuit, 535, 536 Divides (for integers), 221
Deg (R), 546 Directed Euler trail, 539 Divides (for polynomials), 802
Deg(v), 530 Directed graph, 337, 344, 347, 349, 350, Divides relation, 339, 737
Degree 0, 800 351, 353, 357, 377, 378, 488, 514, Division algorithm
Degree of a polynomial, 799 587, 631, 632, 644 for integers, 221, 223, 225, 232, 236,
Degree of a region, 546 arcs, 349, 514 254, 274, 276, 289, 686, 754, 756
Degree of a table, 271 associated undirected graph, 350, 353, for polynomials, 803-805, 808-810
Degree of a vertex, 530, 533 517 Division method (for hashing), 694
Delays, 332, 722 edges, 349, 514 Divisor
Deletion, 490 loop, 349, 514 for integers, 221, 223, 342, 361
Delong, Howard, 119, 120 nodes, 349, 514 for polynomials, 802
Delta, 226 strongly connected, 351, 539 Divisors of zero; see Proper divisors of
δ(G), 664, 665 vertices, 349, 514 zero
DeMoivre, Abraham, 304, 411, 443, 505 Directed Hamilton path, 559 d.n.f., 715-718, 721-724
DeMoivre’s Theorem, 208, 464, 465 Directed path, 353, 516, 588, 632, 633, Doctrine of Chances, 411
DeMorgan, Augustus, 118, 186, 242, 646, 649, 650, 652, 653 Dodecahedron, 548, 556, 573
244, 565 Directed tree, 587 Domain (of a function), 175, 253, 257,
DeMorgan’s Laws Directed walk, 516 270, 281, 287
for a Boolean algebra, 736 Dirichlet, Peter Gustave Lejeune, 303, Domain (of a relational data base), 271
for Boolean functions, 713 705 Dombowski, Peter, 831
for Boolean variables, 713 Dirichlet drawer principle, 303; see also Dominance (for functions), 290, 291
for logic, 57, 58, 60-62 Pigeonhole principle Dominance Laws
for set theory, 139-141, 148, 149, 163, Disconnected graph, 352, 517 for a Boolean algebra, 735
214 Discrete function, 448, 452, 486 for Boolean functions, 713
Denumerable set, 303, A-24 Discrete probability, 189 for Boolean variables, 713
Deo, Narsingh, 506, 508, 574, 576 Discrete random variable, 428, 430 Dominates (for functions), 290, 291
Depth-first index, 616, 619 Discrete sample space, 164, 175 Dominates (on a set), 498
Depth-first search, 597, 598, 600, 617, Disjoint collection of sets, A-29, A-30 Dominating set, 577, 730
624 Disjoint cycles, 786 Domination Laws
Depth-first search algorithm, 597, 598, Disjoint events, 159, 169, 170, 172 for logic, 59
617 Disjoint sets, 137, 148; see also Mutually for set theory, 139
Depth-first spanning tree, 615-620 disjoint Domination number of a graph, 577
Derangement, 402, 403, 410, 412 Disjoint subboards, 404, 405, 408, 409 Domino, 121, 195, 196, 470
Descendant, 588, 616-619 Disjunction, 48, 56, 57 Don’t care conditions, 731-733
Dornhoff, Larry L., 333, 334, 778, 796 England, 565 Even, Shimon, 490, 507
Dorwart, Harold L., 831, 832 Enigma, 333 Even integer, 104, 105, 113
Double induction, 306 Enumeration, 3, 9, 41, 186, 188, 385, Even parity string, 332
Double negation, 58 391, 394, 411, 415, 439, 622, 623, 673 Event, 151, 158, 159, 168, 171, 262
Doubly linked lists, 378 Enumeration of nonisomorphic labeled Bernoulli trial, 161, 178, 179, 182, 430
Doubly stochastic matrix, 670 trees, 586, 587 elementary event, 158
Dual code, 773; see also Algebraic Epp, Susanna S., 119, 120 Evert, Christine Marie, 54
coding theory Equal likelihood, 150, 151 Eves, Howard, 119, 120, 304, 305
Dual graph, 549, 551 Equality Excel, 117
Dual network, 551-553 of Boolean functions, 712 Exclusive or, 48, 56, 416, 789
Dual of a statement, 59, 62, 140, 141, of equivalence classes, 368 Exclusive or (⊕) for Boolean functions,
713, 735 of functions, 279 719, 720
Duality of matrices, A-12 EXCLUSIVE-OR gate, 728
in a Boolean algebra, 713, 735 of polynomials, 799 Execution speed, 290
in logic, 59 of real numbers, 55 Exhaustion (Method of), 106
in set theory, 140, 141 of sets, 125, 143, 367 Exhaustive, 457, 474
Dyck, Walther Franz Anton von, 794 of strings, 311 Existence of an identity for a group, 745
Equality relation, 342, 366, 377 Existence of an identity for a ring, 673
E₁, Eₖ, E, 371 Equilateral triangle, 475 Existence of inverses in a group, 745
Em, 374 Equivalence class, 367, 368, 371, 377 Existence of inverses under + for a ring,
E(X), 177, 182, 183 Equivalence problem, 378 673
East Prussia, 533 Equivalence relation, 337, 342, 343, 353, Existential generalization, 117
Echo, 226 366-378, 686, 695, 735, 780, 782, Existential quantifier (∃), 87, 88, 94, 96,
Economics, 506 783, 808, 830 98
Edge, 349, 514 block, 366 Existential specification, 117
Edge of minimal weight, 640 cell, 366, 367, 369, 372-375 Expansion by minors, A-20
Edge set, 349, 514 definition, 342 Expectation, 177
Edge-disjoint paths, 658 equivalence class, 367, 368, 371, 377 Expected value, 177, 179, 180
Edmonds, J., 653, 654, 669 partition, 366-375, 377, 378 Experiment, 150-154, 157, 159, 162,
Edmonds-Karp algorithm, 653-657 Stirling numbers of the second kind, 163, 166, 167, 175, 178, 180, 183
Efficiency of a coding scheme, 764; see 370 Explicit formula, 210, 211
also Algebraic coding theory Equivalent codes, 778; see also Explicit quantifier, 89, 90
Efficient procedure, 200 Algebraic coding theory Exponent, A-1, A-2
Efficient tree, 611 Equivalent finite state machines, 327 Exponential function, 402, A-1, A-5
Einstein, Albert, 707 Equivalent open statements, 92 Exponential generating function,
Electric power network, 667 Equivalent states (s₁ E s₂), 338, 371 Exponential generating function,
Electric switch, 711 Eratosthenes, 243 Exponential order, 293
Electrical engineering, 324 Erdős, Paul, 276, 573, 574 Exponentiation algorithm, 297-299
Electrical network, 551, 573, 574, 581, Erlanger Programm, 795 Exponentiation algorithm, 297-299
622 Error correction (in a code), 767-769; Extension of a function, 257
Electronic realizations of Boolean see also Algebraic coding theory
functions, 796 Error detection (in a code), 767-769; see f: A → B, 252
Element, 123, 124, 129, 135 also Algebraic coding theory f, 712
Element argument, 126, 137, 140, 144 Error in reasoning, 74 f⁻¹, 283
Elementary event, 158 Error pattern, 762, 763, 771, 779; see f(A), 253
Elementary subdivision, 542, 543 also Algebraic coding theory f⁻¹(B₁), 285
Elements, 222, 237, 238, 242 Euclid, 42, 222, 232, 237, 238, 242, 243 f ∈ O(g), 290, 291
Elements of a set, 123 Euclidean algorithm, f ∈ O(g) on S, 498
Elsayed, E. A., 562, 575, 576 for integers, 231-235, 289, 454, 458, f ∈ Θ(g), 294
Else, 51 459, 505, 688, 760 f ∈ Ω(g), 293
Embedded microcontroller, 5 for polynomials, 808 f is dominated by g, 290, 291, 341
Embedding, 540, 545 Euclidean geometry, 820 f is dominated by g on S, 498
Empty language, 313 Euler, Leonhard, 303, 378, 443, 494, 513, f(x) ≡ g(x) (mod s(x)), 808
Empty set (∅), 127, 128, 159 533, 544, 573, 705, 794, 819, 831 f(x) is congruent to g(x) modulo s(x),
Empty string (λ), 310, 323 Euler circuit, 534, 535, 556 808
Encoding, 763; see also Algebraic coding Euler number, 495 f-augmenting path, 650-654, 656, 663
theory Euler trail, 534, 535, 556 Fₙ, 719, 734
Encoding function, 763, 764, 767, 769, Euler’s conjecture (Latin squares), 819 F₀ (contradiction), 53
771; see also Algebraic coding theory Euler’s phi function, 394, 395, 689, 747 F[x], 802
Encoding scheme, 610, 611 Euler’s Theorem on congruence, 759, F[x]/(s(x)), 810
Encryption, 690-693 760 Factor of a polynomial, 802, 804, 805
Encryption function, 759 Euler’s Theorem on connected planar Factor Theorem, 804, 805
Enderton, Herbert B., 189, A-32 graphs, 546-548, 573 Factorial, 6, 7, 215
Endpoint, 660 Eulerian numbers, 193, 217, 218, 304, Factorial order, 293
Energy levels, 486 420 Factorial time complexity, 293
Factorization of a polynomial, 805 internal states, 320, 321, 327, 371 Ford, Lester Randolph, Jr., 649, 653, 654,
Failure, 161, 178 k-equivalent states, 338, 371 668, 669
Fallacy, 74, 75, 110 k-unit delay machine, 329 Ford-Fulkerson algorithm, 654-657
False assumption, 115 Mealy machine, 333 Foreign Office at Bletchley Park, 333
Fan, 628 minimization process, 371-376, 378 Forest, 581, 639, 641, 642
Fano, Gino, 820, 831 next state, 320 Formal Logic; or, the Calculus of
Feit, Walter, 795 next state function, 320 Inference, Necessary and Probable,
Feller, William, 444, 506, 507 1-equivalent states, 371 118
Fence, 508 one-unit delay machine, 329 Formulario Mathematico, 243
Fendel, Daniel, 119, 120 output, 320-322, 324, 328, 329 Forward edge, 650, 651, 654, 655
de Fermat, Pierre, 243, 244, 705 output alphabet, 320, 321 Foulds, L. R., 562, 575, 576
Fermat’s Last Theorem, 705, 706 output function, 320 Foundations of mathematics, 333
Fermat’s theorem on congruence, 759 pigeonhole principle, 327 Foundations of the Theory of
Ferrers, Norman Macleod, 443 reachability, 338 Probability, 188
Ferrers graph, 435, 443 reachable state, 330 Founder of information theory, 795
Fibonacci, Leonardo, 506 redundant state, 371, 373 Four-color conjecture, 573
Fibonacci generator, 697 reset, 321 Four-color problem, 565, 575
Fibonacci numbers, 193, 215-217, 219, second level of reachability, 338 Fourier, Joseph Baptiste Joseph, 303
246, 442, 447, 457, 458, 463, 468, sequence recognizer, 326, 327, 332 Foxtrot, 226
470, 477, 506, 628 serial binary adder, 323, 324 Fractals, 506
Fibonacci relation, 442, 457, 505 sink (state), 331 Free variable, 88
Fibonacci sequence, 505 starting state, 320, 329 Frege, Gottlieb, 119
Fibonacci trees, 626
state diagram, 321, 324, 327 Frequency of occurrence, 611, 692
Field, 677, 678, 681, 682, 688, 707, 746, state table, 321, 322, 324, 331 Frey, Gerhard, 706
794, 802, 830, 831; see also Finite field
strongly connected machine, 331 Frobenius, Georg, 796
Field theory, 831 Front (of a list), 598, 599
submachine, 331
Fields (in a record), 694
transfer sequence, 331 Fulkerson, Delbert Ray, 649, 653, 654,
FIFO structure, 598 668, 669
transient state, 330
Filius Bonaccii, 442 Full-adder, 721
transition sequence, 331
Finite affine plane, 820 Full binary tree, 611
transition table, 321
Finite Boolean algebra, 740, 743, 799, Full house, 152
two-unit delay machine, 329
830 Full m-ary tree, 614
Finite strings, 310
Finite field, 799, 803, 806, 811, 812, 817, Function, 99, 175, 186, 211, 247,
Finite three-dimensional geometry, 831
820, 822, 826, 830 252-257, 259-263, 267-271,
Finizio, Norman, 506, 507
Finite function, 247, 284, 302, 332 278-293, 295, 302, 303, 309, 311, 318,
First-degree factor, 805, 806
Finite geometry, 799, 820, 822, 825, 830, 320, 376, 394, 395, 403, 409, 410,
First-in first-out structure, 598
831; see also Affine plane 602, 644, 660, 673, 697-704, 712, 739
Finite group, 795 First level of infinity, 303, A-30 access function, 254
Finite group theory, 831 First level of reachability, 338
Ackermann’s function, 259
Finite integral domain, 682 First-order linear recurrence relations, associative binary operation, 268
Finite language, 314 448, 450 Big-Oh notation, 290
Finite poset, 377 Fisher, R. A., 831
bijective function, 279, 283
Finite projective geometry, 831 Fissionable material, 486 binary operation, 267-269
Finite projective plane, 831 Five-times repetition code, 765, 769; see Boolean function, 712
Finite sample space, 164 also Algebraic coding theory ceiling function, 254
Finite sequence of n terms, A-25 Fixed (invariant), 781, 783, 789 characteristic function, 307
Finite sequence of undirected edges, 351 Fixed order, 597 closed binary operation, 267, 268, 270
Finite set, 124, 125, 186, 280, 287, 344, Fixed point (of a function), 403 codomain, 253, 279, 281, 287
A-23, A-24 Flach, Matthias, 706 commutative binary operation, 268,
Finite slope, 821 Floor function (⌊x⌋), 253, 254, 297, 496, 270
Finite state machine, 309, 319-324, 602 composite function, 280, 281
326-333, 337, 338, 371-376, 378, Flow in a transport network, 644-654, composition of functions, 278, 280,
682, 720 656, 662, 663 282
arc, 321, 329 Flow of current, 536 constant function, 261
definition, 320 Flowchart, 203, 204, 349 decoding, 767
directed edge, 321 Folding method (for hashing), 694 definition, 252
distinguishing string, 374 Fontane, Johnny, 186 distance function, 766, 767
E, 371 ∀x, 88, 124 domain, 175, 253, 257, 270, 281, 287
E₁, 371 For all x, 88 dominance, 292-294
Eₖ, 371, 374 For any x, 88 encoding, 763, 764, 767, 769, 771, 773
equivalent machines, 327 For at least one x, 88 equality, 279
equivalent states, 338 For each x, 88 Euler’s phi function, 394, 395, 689
first level of reachability, 338 For every x, 88 exponential, 402, A-1, A-5
input, 320, 322, 324, 329 For some x, 87, 88 extension, 257
input alphabet, 320, 321 Forbidden positions, 406, 408 f⁻¹, 283
finite function, 247, 284, 302 Fundamental Theorem of Arithmetic, convolution of sequences, 430, 431,
finite sequence of n terms, A-25 193, 237-240, 244, 254, 265, 275, 440
fixed point, 403 314, 342, 394, 703, 704, A-29 definition, 418
floor function, 253, 254, 297 distributions, 415-417
function complexity, 247 g ∘ f, 280 exponential generating functions,
function dominance, 290-292, 294, g dominates f, 290 436-439, 443
498 g dominates f on S, 498 geometric series, 419
greatest integer function, 253, 297 G, 523 in solving recurrence relations,
hashing function, 673, 694, 695, 708 Gᵈ, 549 482-487
identity function, 279 G², 626 moment generating function, 443, 444
image of an element, 253 G − e (e an edge), 522 nonlinear recurrence relation, 487-490
image of a set, 256, 257 G − v (v a vertex), 522 ordinary generating function, 436
incompletely specified Boolean |G|, 746 partitions of integers, 432-435
function, 732 Galileo, 303 power series, 417
infinite sequence, A-25 Gallian, Joseph A., 707, 708, 795, 796 rook polynomial, 416
injective function, 255 Gallier, Jean H., 119, 120 summation operator, 440-442
inverse function, 278, 283, 285, A-9 Galois, Evariste, 707, 794, 795, 813, 830, table of identities, 424
invertible function, 282-285, 287 831 Generator matrix, 769, 771, 772, 774,
logarithmic, A-1, A-5 Galois field, 813, 818 777; see also Algebraic coding theory
mapping, 252 Galois theory, 707, 795, 831 Generator of a cyclic group, 755
monary operation, 267 Gambler’s ruin, 510 Generic, 110
monotone increasing function, 494, Games of chance, 188 Genesereth, Michael R., 119, 120
495, 500, 501, 503, 608, 609 γ(G), 577 Geometric progression, 447
next state function, 320, 682 Gardiner, Anthony, 795, 796 Geometric random variable, 430, 446
notation, 253 Gardner, Martin, 39, 42, 507, 795, 796 Geometric series, 419, 423, 428, 476
1_A, 279 Garland, Trudi Hammel, 506, 507 Geometrie der Lage, 622
one-to-one correspondence, 279, 303 Garrett, Paul, 693, 708, 795, 796 Geometry, 123, 222, 242, 506, 794, 795
one-to-one function, 255-257, 409, Gate, 720 Gerasa, 707
410 Gating network, 309, 719-722, 731 Germain, Sophie, 705
onto function, 260-263, 265 Gauss, Carl Friedrich, 377, 705, 707 Gersting, Judith L., 333, 334
order (of a function), 290, 292, 293 gcd (greatest common divisor) GF, 813
order-preserving function, 366, 509 for integers, 231-236, 240, 394, 453, GF(n), 821, 824, 827, 828
output function, 320, 682 454, 688, 734, 737 GF(pⁿ), 830
partial function, 260 for polynomials, 807, 808 GF(pᵗ), 813, 818
phi function, 394, 395 General solution of a homogeneous Gilbert bound, 773; see also Algebraic
powers of a function, 282 recurrence relation, 468 coding theory
pred (predecessor), 307 General solution of a nonhomogeneous Gill, Arthur, 333, 334
preimage of an element, 253 recurrence relation, 471 Giornale di Matematiche, 820
preimage of a set, 285-287 General solution of a second-order linear Global result, 632, 639
projection, 270, 271 homogeneous recurrence relation with glb (greatest lower bound), 363, 709
range, 253 constant coefficients, 456 Gödel, Kurt, 187
recursive function, 453 Generalizations of the principle of Gödel’s proof, 188
restriction, 257 inclusion and exclusion, 397-401 The Godfather, 186, 692
scattering function, 694, 708 Generalized associative law for ∧, 212 Golay, Marcel J. E., 761, 795, 796
self-dual Boolean function, 744 Generalized associative law for ∪, 213 Goldberg, Samuel, 506, 507
sequence, 255 Generalized associative law for a group, Golden ratio, 457, 469, 506
space complexity function, 290 746 Golomb, Solomon W., 796
succ (successor), 307 Generalized associative law of addition Gone with the Wind, 47, 48, 52
surjective function, 260 of real numbers, 214-216 Gopalan, K. Gopal, 743
switching function, 712 Generalized associative law of Gorenstein, Daniel, 795, 796
symmetric Boolean function, 744 multiplication of real numbers, 214, Graceful (labeling of a tree), 627, 628
time complexity function, 290, 215 Graff, Michael, 795
297-299 Generalized associative laws in a ring, Graham, Ronald Lewis, 304, 305, 506,
trunc(ation), 254 674 507, 642, 667-669
unary operation, 267, 268 Generalized Binomial Theorem, 422 Grandparent, 593
Function complexity, 247 Generalized DeMorgan’s laws, 146 Graph coloring, 564-573, 575
Function composition; see Composite Generalized distributive laws in a ring, Graph isomorphism, 523, 526-528, 699
function 674 Graph planarity, 352, 615
Function dominance, 292, 294, 341, 498 Generalized intersection of sets, 146 Graph theory, 324, 349-354, 378, 379,
Function inverse; see Inverse of a Generalized union of sets, 146 395, 396, 411, 513-579, 615-621,
function Generated recursively, A-26 624, 631, 632, 657, 659-665, 667,
Fundamental conjunction, 715-718, 721, Generates, 753 730; see also Matching theory,
723, 724, 732, 738 Generating function, 303, 415-445, 452, Transport networks, Trees
Fundamental disjunction, 717, 718 482-487, 489, 505, 783, 790, 791 adjacency list, 379
Fundamental Theorem of Algebra, 356 calculational techniques, 418-431 adjacency list representation, 378, 379
adjacency matrix, 352, 539, 600 distance, 518, 626 ladder graph, 572, 577, 626, 627
adjacent from, 349, 514 dominating set, 577, 730 length of a cycle, 351
adjacent to, 349, 514 domination number, 577 length of a path, 632
adjacent vertices, 349 dual graph, 549, 551 length of a walk, 515
algorithm for articulation points, 619, edge-disjoint paths, 658 line graph, 578, 670
620 edge set, 349, 514 loop, 349, 351, 353, 354, 514, 551
arc, 349, 514 edges, 349, 514 loop-free graph, 351, 515
articulation point, 615-621, 624 electrical networks, 551, 573, 574 mapmaker’s problem, 551
associated undirected graph, 350, 353, elementary subdivision, 542, 543 maximal independent set, 564, 627
517 embedding, 540, 545 mesh graph, 532
β(G), the independence number of G, Euler, Leonhard, 378 minimal covering of a graph, 577
564, 666 Euler circuit, 534, 556 minimal dominating set, 577, 730
biconnected component, 615, Euler’s Theorem for Connected Planar multigraph, 516, 518, 533
619-621, 624 Graphs, 546-548, 573 multiplicity (of an edge), 518
biconnected graph, 615 Euler trail, 534, 556 n-cube, 532, 541, 542
binary tree, 488, 595, 600 fan, 628 nodes, 349, 514
bipartite graph, 541, 542, 558, 659, Four-color problem, 565, 575 nonplanar graph, 540, 541, 543, 547
660, 662-665, 668 G, 523 null graph, 523
bridge, 550 Gᵈ, 549 od(v), 535
χ(G), the chromatic number of G, G², 626 ω(G), the clique number of G, 578
565, 621 G − e (e an edge), 522 one-factor, 666
chromatic number, 413, 565, 615, 621 G − v (v a vertex), 522 one-terminal-pair-graph, 552
chromatic polynomial, 413, 564-571, γ(G), the domination number of G, open walk, 515
574 577 origin (of an edge), 349, 514
circuit, 516, 534, 551 graceful labeling of a tree, 627, 628 out degree (of a vertex), 535
clique, 578 graph coloring, 564-573, 575 outgoing degree (of a vertex), 535
clique number, 578 graph isomorphism, 523, 526-528, P(G, λ), 566-568, 570
closed path, 351, 516 699 path, 351, 516, 567
closed walk, 515, 516, 546, 549 grid graph, 532 pendant vertex, 530, 549, 583, 584
cocycle, 564 Hamilton cycle, 556-562, 573, 574 perfect matching, 666
color-critical graph, 573, 622 Hamilton path, 556-561, 573 Petersen graph, 543, 566, 574
comb graph, 577 Hasse diagram, 358-361 planar graph, 540-553
complement of a graph, 523 Herschel graph, 564, 566 planar-one-terminal-pair-graph, 552
complement of a subgraph in a graph, historical development, 574 planarity of graphs, 352, 615
586 homeomorphic graphs, 542-544 Platonic solids, 547-549, 556
complete bipartite graph, 541 hypercube, 531-533, 541, 542, 557 Polya’s theory of enumeration, 574
complete directed graph, 559 id(v), 535 precedence graph, 350
complete graph, 352, 354, 523 in degree (of a vertex), 535 proper coloring of a graph, 565-568,
components, 352, 517, 567 incidence matrix, 539 570
connected graph, 351, 517 incident, 514 Qₙ, 532, 542, 667
covering of a graph, 577 incoming degree (of a vertex), 535 regions (in a planar graph), 544
covering number, 577 independence number, 564, 666 regular graph, 531
cut-set, 549, 551 independent set of vertices, 564, 627 rooted binary tree, 488
cycle, 351, 516, 551, 552, 624 index list, 379 rooted ordered binary tree, 488, 489
d(a, b), 626, 632 induced subgraph, 522, 619 round-robin tournament, 559
Decomposition Theorem for infinite region, 545 self-complementary graph, 529, 576
Chromatic Polynomials, 568 Instant Insanity, 524 Seven Bridges of Königsberg, 513,
deficiency, 664 intersection of graphs, 570 519, 533, 535
deficiency of a graph, 664 isolated vertex, 349, 352, 514, 613 source (of an edge), 349, 514
deg(R), 546 isomorphic graphs, 526 spanning subgraph, 521, 582, 640
deg(v), 530 Kₘ,ₙ, 541 spokes, 519, 520
degree of a region, 546 Kₙ, 352, 523 square of a graph, 626
degree of a vertex, 530 K*ₙ, 559 strongly connected graph, 351, 539
δ(G), 664, 665 K₅, 540-543, 547 subgraph, 521
digraph, 349, 350, 514 K₃,₃, 542, 543, 547 terminals, 552
Dijkstra’s Shortest-Path Algorithm, κ(G), the number of components of terminating vertex, 349, 514
631-638 G, 517, 549, 615 terminus (of an edge), 349, 514
directed cycle, 351, 516 k-regular graph, 531 tournament, 559
directed edge, 321, 349, 351, 514, 646, king, 563 trail, 516
650 kite, 628 Traveling Salesman Problem, 562, 574
directed Euler circuit, 535, 536 Königsberg, 513, 519, 533, 535 tree, 573
directed graph, 337, 344, 349, 514 Kuratowski’s Theorem, 543, 544, 574 trivial walk, 515
directed path, 353, 516 L(G), 578, 670 2-isomorphic graphs, 555
directed walk, 516 labeled directed graph, 324 undirected edge, 349, 514
disconnected graph, 352, 517 labeled multigraph, 524 undirected graph, 350, 351, 514
union of graphs, 570 group of units, 747 Hamilton path, 556-561, 573
unit-interval graph, 520 homomorphism, 752 Hamming, Richard Wesley, 761, 766,
unity graph, 542 infinite order, 746 795, 796
vertex, 349 invariant element under a permutation, Hamming bound, 773; see also Algebraic
vertex degree, 530 781, 783 coding theory
vertex set, 349, 514 isomorphism, 753 Hamming code, 778; see also Algebraic
vertices, 349, 514 kernel of a homomorphism, 797 coding theory
Wₙ, 520, 572 Klein Four group, 755 Hamming matrix, 778; see also
walk, 515, 516 Lagrange’s Theorem, 758 Algebraic coding theory
weight of an edge, 631 left-cancellation property, 747 Hamming metric, 767; see also Algebraic
weighted graph, 631 left coset, 757 coding theory
wheel graph, 519, 520, 572 length of a cycle, 780 Handshakes, 480
Gray, Frank, 188 multiples of group elements, 748 Hanson, Denis, 412
Gray code, 128, 129, 188, 533, 557, 564 nonabelian group, 749 Harary, Frank, 573-576, 623, 625
Greatest common divisor nontrivial subgroup, 748 Hardware considerations, 333, 378
for integers, 231-236, 240, 394, 453, normal subgroup, 795, 831 Hardy, Godfrey Harold, 244, 412
454, 688, 734, 737 order of a group, 746 Harmonic numbers, 193, 202, 209, 215,
for polynomials, 807, 808 Polya’s method of enumeration, Hartsfield, Nora, 573, 576
Greatest element (in a poset), 363 Polya’s method of enumeration, Hartsfield, Nora, 573, 576
Greatest integer function (⌊x⌋), 253, 297, 779-793 Harvard University Computation
391, 496, 602 powers of group elements, 747 Laboratory, 742, 743
Greatest lower bound (glb), 363 product of disjoint cycles, 780, 781, Hashing function, 673, 694, 695, 708
Greedy algorithm, 632, 638-641, 667 786 Hasse, Helmut, 377
Gregory, Duncan, 186 proper subgroup, 748 Hasse diagram, 358-361, 377, 476, 533,
Grid, 45 quotient group, 831 696, 736-739
Grid graph, 532 right-cancellation property, 747 Heap, 637, 638
Griess, Robert, Jr., 795 right coset, 757 Heap implementation, 642, 643
Group, 745 rigid motions of a cube, 791 Heath, Thomas Little, 41, 42
Group acting on a set, 782, 785, 792 rigid motions of an equilateral triangle, Heawood, Percy John, 565
Group action, 783 749, 750 Height of a rooted tree, 601
Group code, 773, 774, 776, 777; see also rigid motions of a regular hexagon, Hell, Pavol, 642, 667-669
Algebraic coding theory 788 Henle, James M., 189, A-32
Group homomorphism, 752, 753, 774 rigid motions of a regular tetrahedron, Herschel graph, 564, 566
Group isomorphism, 753, 755 792, 793 Herstein, Israel Nathan, 795, 797
Group of permutations, 749, 750, 781, rigid motions of a square, 750, 780 Hexadecimal notation, 226
782, 830 RSA Cryptosystem, 759-761 Hexagon, 135, 788
Group of rigid motions simple group, 795 Hierarchy of operations, 460, 590
of a cube, 791 Sn, 750, 794 High-energy neutrons, 486
of an equilateral triangle, 749, 750 solvable group, 830 Hilbert, David, 119, 188, 259, 333, 706
of a regular hexagon, 788 stabilizer, 785 Hilbert decision problem, 333
of a regular tetrahedron, 792, 793 subgroup, 748 Hill, Frederick J., 742, 743
of a square, 750, 780 symmetric group, 750 Hindu mathematicians, 243
Group of transformations, 794, 795 trivial subgroup, 748 Hindu-Arabic notation, 442
Group of units, 747 Grundbegriffe der History of enumeration, 41
Group theory, 745-798 Wahrscheinlichkeitsrechnung History of graph theory, 574
abelian group, 745, 746 (Foundations of the Theory of Hodges, Andrew, 333, 334
algebraic coding theory, 773-777 Probability), 188 Hoggatt, Verner E., Jr., 506, 507
center, 751 Grundlagen der Mathematik, 119 Höhere Algebra, 377
chain of subgroups, 830 Gruppentheoretischen Studien II, 794 Hohn, Franz E., 333, 334, 778, 796
commutative group, 745 Guthrie, Francis, 565, 573 Homeomorphic graphs, 542-544
coset, 757 Guthrie, Frederick, 565 Homogeneous recurrence relations, 450,
cycle, 780, 781 Guy, Richard K., 506, 507 456, 482
cyclic group, 753-756, 758 Homomorphic image (rings), 698
decomposition of a permutation, 781 Hₙ (the nth harmonic number), 202 Homomorphism of groups, 752
definition of a group, 745 Haken, Wolfgang, 565, 573, 575 Homomorphism of rings, 698
direct product of groups, 751 Half-adder, 720, 721 Honsberger, Ross, 506, 507
Euler’s Theorem on Congruence, 759, Half-open interval, 134 Hopcroft, John E., 333, 334, 378, 506,
760 Hall, Marshall, Jr., 412, 831, 832 507, 574, 575, 623, 624, 642, 667,
Fermat’s Theorem on Congruence, 759 Hall, Philip, 660, 663, 668 668, 708
fixed (invariant), 781, 783 Hall’s Marriage Condition, 664 Hopper, Grace, 623
generator of a (sub)group, 754 Halmos, Paul R., 189, A-32 Horizontal class, 822
group acting on a set, 782, 785, 792 Hamilton, Sir William Rowan, 186, 556, Horner’s method, 301
group of permutations, 749, 750, 781, 565, 573, 574 Horowitz, Ellis, 641, 642, 668, 669
782, 830 Hamilton cycle, 556-559, 561, 562, 573, Huffman, David Albert, 333, 334, 378,
group of transformations, 794, 795 574 611, 624, 625
Huffman tree, 613, 614 Incident, 514 Integer-valued function, 254
Huffman’s construction for optimal trees, Inclusive or, 48 Integers, 113, 114, 133, 193, 242
612-614 Incoming degree of a vertex, 535 Integers modulo n, 686-696
Hungarian method, 668 Incompletely specified Boolean function, Integral domain, 677, 678, 681, 682, 801,
Huygens, Christiaan, 42, 188 731, 732 802
Hydrocarbon, 581, 584 Increment, 689 Intel Corporation, 5
Hydrogen, 584 Independence for three events, 171, 172 Internal states, 320, 321, 327, 337, 371
Hypercube, 531-533, 541-544, 557, 667 Independence number of a graph, 564, Internal vertices, 588
Hypothesis, 48, 51, 53, 67, 70 666 Internet, 12, 13, 575
Hypothesis Testing, 188 Independent, 786 Internet address, 12
Independent events, 154, 155, 158, 161, Internet security, 222
i (= √−1), 811 166, 170, 174, 179, 182, 428, 430, 762 Internet standard regarding reserved
I Principii di Geometria, 377 Independent in pairs, 172 network numbers (STD2), 12
I, 348, A-16 Independent set of vertices, 564, 627 Intersection of graphs, 570
(i, j)-entry of a matrix, A-11 Independent solutions; see Linearly Intersection of sets, 136, 138, 214
Icosahedron, 548 independent solutions Introductio in Analysin Infinitorum, 443
id(a), 644, 646 Independent switches, 64, 65 Invalid argument, 74, 75, 82, 83, 109
id(v), 535 Indeterminate, 799 Invariant (element under a permutation),
Ideal, 684, 700, 706 Indeterminate form, A-1 781, 783, 784, 786, 787, 789
Idempotent element (in a ring), 697 Index, 145 Inventory, 786
Idempotent Law of Addition, 718, 724, Index list, 379 Inverse (under addition), 278
726, 732 Index of a product, 239 Inverse (under multiplication), 278
Idempotent Law of Multiplication, 717 Index of a summation, 17 Inverse function, 278, 283, 285, A-9
Idempotent Laws Index set, 145, 366, 367 Inverse of an implication, 62, 63, 82,
for a Boolean algebra, 735 Indirect method of proof, 82 92-94, 99
for Boolean functions, 713 Indirect proof, 115 Inverse laws
for Boolean variables, 713 Induced subgraph, 522, 619 for a Boolean algebra, 734
for logic, 58 Induction, 534, 545 for Boolean functions, 713
for set theory, 139, 147 Induction hypothesis, 196, 198, 199, 201, for Boolean variables, 713
Identical containers, 493 203-205, 207, 208, 214, 216, 238, for logic, 58
Identity element for a binary operation, 298, 315, 317, 805 for set theory, 139
269, 270 Inductive proof, 213 Inverses in a group, 745, 794
Identity element for concatenation, 311 Inductive step, 195-199, 201-204, 206, Inverses under + in a ring, 673
Identity element for + in a ring, 673 207, 212-215, 218 Inverter, 719, 720, 722
Identity element of a group, 745, 794 Infeld, Leopold, 831, 832 Invertible function, 282-285, 287, A-23
Identity for the addition of real numbers, Infinite area, 545 An Investigation in the Laws of Thought,
103 Infinite cardinal numbers, A-31 on Which Are Founded the
Identity function, 279, A-24 Infinite countable set, A-26 Mathematical Theories of Logic and
Identity laws Infinite order (for a group), 746 Probability, 119
for a Boolean algebra, 734 Infinite region, 545 An Investigation of the Laws of Thought,
for Boolean functions, 713 Infinite sample space, 164 186, 711, 742
for Boolean variables, 713 Infinite sequence, A-25, A-26 Irreducible polynomial, 807, 810, 811,
for logic, 58 Infinite set, 124, 186, 189, 280, 304, 830
for set theory, 139 A-23-A-26, A-28, A-30 Irreflexive relation, 344
Identity transformation, 791, 792 Infinite slope, 821, 822 Irrational numbers, 356, A-2
If and only if, 48 Infix notation, 251, 591 Irrational power, A-3
If p, then q, 51 Information retrieval, 694 Is approximately equal to (=), 7
If p, then q, else r, 51 Information theory, 795 Isobutane, 584
If-then decision structure, 51 Initial condition(s), 448, 456 Isolated fundamental conjunction, 724
If-then statement, 62 Initial flow, 652, 654 Isolated product term, 724
If-then-else decision structure, 51 Initialization, 636, 639, 642 Isolated vertex, 349, 352, 359, 514
Iff, 48 Injective function, 255 Isomers, 573, 796
Ignition system, 5 Inorder, 594 Isomorphic Boolean algebras, 739, 740
Image of an element, 253, 255 Inorder traversal, 594 Isomorphic copy, 809
Image of a set, 256, 257 Input alphabet, 320, 321 Isomorphic finite fields, 813
Implication, 48, 51-53, 56, 61-63, 67, Input (for a finite state machine), 309, Isomorphic graphs, 526, 527, 542, 543,
69, 70, 76, 83, 89, 104, 105, 124 319, 320, 322, 324, 329 549
Implicit quantification, 104 Input (for a gate), 719, 720 Isomorphic groups, 753, 755
Implicit quantifiers, 90 Input (for an algorithm), 253, 289 Isomorphic rings, 698, 699, 704
Implicit restriction, 218, 317 Input (function), 253 Isomorphic trees, 582, 583
Implies, 48 Input string, 321, 322, 327, 330, 331 Isomorphism of
In degree of a vertex, 535 Instant Insanity, 524, 525 Boolean algebras, 737, 740
Incidence, 123 Integer division, 222 fields, 810-811
Incidence matrix for a design, 832 Integer solutions, 235, 392, 415-417, finite fields, 813
Incidence matrix for a graph, 539 427, 433 graphs, 523, 526-528
groups, 753 Kolmogorov, Andrei Nikolayevich, 159, Laws of logic, 58-65, 74, 77, 83, 113,
rings, 698 188, 189 139, 140, 211, 713, 735
trees, 596 König, Dénes, 573 Laws of set theory, 139, 144, 163, 168,
Itanium processor, 5 Königsberg, 378, 513, 518, 533-535, 573 169, 713, 735
Iteration, 634-637, 639-642, 652, 653, Koshy, Thomas, 506, 507 Lay, David C., A-21
656 Kronecker, Leopold, 242, 705, 795 lcm (least common multiple), 236, 240,
Iterative algorithm, 477, 478 Kruskal, Joseph Bernard, 638, 667, 669 391, 734, 737, 739
Iverson, Kenneth, 623 Kruskal’s algorithm, 639-641 Leading coefficient, 799, 806
Iwasawa theory, 706 Kummer, Ernst, 706 Leading coefficient, 799, 806
Kuratowski, Kasimir, 543, 573 Leaf, 588, 591, 593, 596, 597, 600, 601,
Java, 4, 13, 345 Kuratowski’s Theorem, 543, 544, 574 611, 612
Jean, Roger V., 506, 507 Least common multiple, 236, 240, 391,
Jefferson, Thomas, 54 £o3, 827, 828 734, 737, 739
Jiushao, Qin, 707 L(G), 578, 670 Least element (in a poset), 363
Johnson, D. B., 642, 668, 669 L(G), 578, 670 Lₙ (the nth Lucas number), 216
Johnson, Lyle, 623 Label, 633-636 Least significant, 731
Johnson, Selmer Martin, 506, 507 Labeled complete binary tree, 610 Least significant bit, 323, 324
Jordan, Marie Ennemond, 622 Labeled directed graph, 324 Least upper bound (lub), 363
Jünger, M., 562, 576 Labeled graph, 562, 634, 636 Left branch, 488
Juxtaposition, 301, 311 Labeled multigraph, 524, 525 Left branch, 488
Labeled tree, 586, 611 Left-cancellation property (in a group),
Km n, 541 Labeled trees on n vertices, 623 747, 757
Kn, 352, 523 Labyrinth, 623 Left child, 590, 594, 610, 611
K7, 559 Ladas, Garasimos, 506, 507 Left children, 594, 595
K5, 540, 543 Ladder graph, 572, 577, 626, 627 Left coset, 757
K3,3, 542, 543 Lagrange, Joseph-Louis, 510, 752, 794 Left subtree, 590, 592, 594, 596, 614
k-ary operation, 306 Lagrange’s Theorem, 758 Legendre, Adrien-Marie, 705
k-equivalence, 371, 373 λ (the empty string), 310, 323 Lehman, John, 623
k-equivalent states, 338, 371, 372 λ (for a design), 826 Lehmer, Derrick H., 689
k-regular graph, 531 λ(n), 567 Leibniz, Gottfried Wilhelm, 118, 302
k-unit delay machine, 329 λxy, 825 Leiserson, Charles E., 504, 507, 624,
κ(G), 517, 549, 615 Lamé, Gabriel, 458, 505, 705 625, 638, 643, 654, 667
Karnaugh, Maurice, 722, 742, 743 Lamé’s Theorem, 459 Lemma, 222
Karnaugh map, 722-726, 729, 731, 732 Landau, Edmund, 304 Length of a
don’t care conditions, 731-733 Landau symbol, 304 chain, 381
Karp, Richard M., 653, 654, 669 Language, 211, 309, 312-317, 328, 333, cycle (in a graph), 351
Katz, Nick, 706 338 cycle (in group theory), 780
Katz, Victor J., 189 de Laplace, Pierre Simon, 150, 188, 443 path, 632
Kempe, Sir Alfred, 565 Largest possible block of adjacent 1’s, string, 18, 310-312
Kepler, Johannes, 505 726 walk, 515
Kernel of a group homomorphism, 797 Larney, Violet Hachmeister, 244, 707, Lenstra, Arjan, 795
Kernel of a ring homomorphism, 704 708, 795, 796, 831, 832 Lenstra, J. K., 562, 575, 576
Kershenbaum, A., 642, 668 Larson, Harold J., 444 Leonardo of Pisa, 442, 505
Key, 295, 302, 501-503, 691-695, 759, Last nonzero remainder, 232, 235, 808 Lesniak, Linda, 573, 576
760 Last-in-first-out structure, 490 Less than [for (0, 1)-matrices], 347
Key, J. D., 796 Latin square (in standard form), 816, 817 Less than or equal to, 364, 377
Khan, Genghis, 707 Latin squares, 799, 815-820, 822-824, Level, 588, 589, 593, 597, 607, 611
Khowârizm, 242 831 Level number, 588, 601, 602, 612
Kimberling, Clark, 707, 708 Lattice, 364, 377 Levels of gating, 722
King (of a tournament), 563 Lattice point, 277 Levels of infinity, 303
Kings (on a chessboard), 510 Law of the Double Complement LeVeque, William Judson, 244
Kinney, John J., 175, 189 for a Boolean algebra, 736 Lewis, Harry R., 333, 334
Kirchhoff, Gustav, 573, 581, 622 for Boolean functions, 713 Lewis, James T., 305
Kirkman, Thomas P., 562 for Boolean variables, 713 Lexicographic order, 589, 593
Kitab al-jabr w’al muquabala, 242 for set theory, 139 Leyland, Paul, 795
Kite, 628 Law of Double Negation, 58, 59, 61, 62 L’Hospital’s Rule, A-1
Kleene, Stephen Cole, 119, 120, 315 Law of the syllogism, 72, 73, 78, 108, Liber Abaci, 442, 505
Kleene closure (of a language), 315, 322 127 LIFO structure, 490
Klein, Felix, 795 Law of Total Probability, 169, 170, 173 $\lim_{x \to a} f(x) = L$, 99, 100
Klein Four group, 755 Law of Total Probability (Extended $\lim_{n \to \infty} f_n = L$, 103
Kneiphof, 533 Version), 173 Limit of a real-valued function, 99, 100
Knuth, Donald Ervin, 304, 305, 378, 506, Lawler, Eugene L., 562, 575, 576, 667, Limit of a sequence of real numbers, 103,
624, 625, 704, 708 669 A-3
Koch snowflake curve, 475 Laws for Boolean functions, 713, 735 Line, 123
Kohavi, Zvi, 333, 334, 378 Laws for Boolean variables, 713 Line at infinity, 828
Line graph, 578, 670 Loop, 349, 351, 353, 354, 358, 488, Mathematical definition; see Definition
Line in AP(F), 821, 826-828 514-516, 525, 549, 551, 582, 640 Mathematical induction, 84, 193, 194,
Line in a finite projective plane, 827, 828 Loop-free graph, 351, 352, 396, 515, 200, 203, 206, 214, 215, 243, 244,
Linear algebra, 466, 624 533, 581, 582, 584, 585, 615-619, 317, 674, 746, 805; see also Principle
Linear arrangement, 6, 7, 9, 10 624, 631, 639-642, 644, 667 of Mathematical Induction, Alternative
Linear combination (atoms), 739 Lord Byron, 242 form of mathematical induction
Linear combination (integers), 221, Lovasz, Laszlo, 573 Mathematical logic, 118, 119, 711
232-234 Lovelace, Augusta Ada Byron, 243 Mathematical theorems, 104
Linear combination (polynomials), 808 Lovelace, Countess of, 242 Mathematical Treatise in Nine Sections,
Linear complexity, 299 Low (x), 619-621 707
Linear congruence, 688 Low-energy neutrons, 486 Mathematics of finance, 473
Linear congruential generator, 689, 690 Lower bound, 363 Matrix, 254, A-11-A-21
Linear factor of a polynomial, 805 Lower bound (for probability), 183, 184 addition of matrices, A-12
Linear linked lists, 694 Lower limit in product notation, 239 additive identity, A-13
Linear order, 293, 359 Lower limit in sum notation, 17 additive inverse, A-13
Linear recurrence relation, 449, 506 Lozansky, Edward, 304, 305 associative law of multiplication, A-16
Linear search, 296, 297, 302 lub (least upper bound), 363, 709 column matrix, A-11
Linear time complexity, 293, 359 Lucas, Francois Edouard Anatole, 468, column vector, A-11
Linearly independent solutions, 456, 464 505 commutative law of addition, A-12
Linearly ordered poset, 359 Lucas numbers, 193, 216, 217, 220, 246, definition, A-11
Linked lists, 378 447, 506 determinant, A-17-A-21
List (in a relational data base), 271 Lukasiewicz, Jan, 591 distributive laws of matrix
Literal, 715, 716, 722-726 multiplication over matrix addition,
M2(C), M2(Q), M2(R), M2(Z), 674 A-21
Liu, C. L., 42, 412, 444, 506, 507, 535,
543, 551, 573, 574, 576, 624, 625, m Xn matrix, A-11 distributive law of scalar
667-669, 783, 792, 796, 797, 831, 832 (m + 1, m) parity-check code, 764, 765 multiplication over matrix addition,
Machine language, 226 A-13
Lloyd, E. K., 574, 575
Machine language instructions, 302 equality, A-12
Local address, 12
m-ary tree, 600 expansion by minors, A-20
Local result, 632, 638
Maclaurin, Colin, 304 (i, j)-entry, A-11
Lockett, J. A., 562, 575
Maclaurin series, 304, 402, 422, 436, matrix product, A-14
Logarithmic function, A-1, A-5
437, 496 matrix sum, A-12
Logarithmic order, 293
Maclaurin series for e^x, 402 minor, A-19
Logarithmic time complexity, 293
MacWilliams, F. Jessie, 795, 797 multiplicative identity, A-16
Logic, 47-121
Magnanti, Thomas L., 562, 575, 576, multiplicative inverse, A-16
basic (logical) connectives, 47-53, 56
637, 643, 654, 668 product, A-14
Laws of Logic, 58-65
Main memory, 5 row matrix, A-11
logical equivalence, 56-61, 83, 95
Majority rule, 765; see also Algebraic row vector, A-11
logical implication, 69-73, 75, 89, 91, coding theory scalar product, A-12-A-14
95 Manohar, R., 704, 708 square matrix, A-11
logically equivalent statements, 56, 58, Maple code, 420, 477 sum, A-12
61-64, 74, 91, 97, 98 Mapmaker’s problem, 551 system of linear equations, A-18
negation of quantified statements, 96, Mapping, 252; see also Function zero element, A-13
97, 99, 100 Marriage condition, 664 Matrix multiplication algorithm, 507
Principle of Duality, 59 Massasauga, 9 Matrix product, A-14
proof, 105-116 Master theorem, 504, 505, 507 Matrix rings, 674, 705
quantifiers, 86-100, 103-116 Matches, 411 Matrix sum, A-12
Rules of Inference, 67-84, 86 Matching, 659, 660, 667, 668 Maurocylus, Francesco, 244
statements (and connectives), 47-49 Matching theory, 659-668 Max, 240
Substitution Rules, 60, 61 assignment problem, 659 Max-Flow Min-Cut problem, 649
Table of Rules of Inference, 78 complete matching, 660-664 Max-Flow Min-Cut Theorem, 649, 652
truth tables, 49, 52, 53, 56-58, 60-62 deficiency of a graph, 664 Maximal biconnected subgraph, 615
Logic gate, 719 deficiency of a set of vertices, 664 Maximal chain, 381
Logic network, 719, 720 δ(G), 664, 665 Maximal element (of a poset), 362
Logical connectives; see Basic Hall’s marriage condition, 664 Maximal flow, 645, 647
connectives matching, 659, 660, 667, 668 Maximal independent set of vertices,
Logical equivalence, 56-61, 83, 95 maximal matching, 664, 665, 668 564, 627
Logical implication, 69-73, 75, 89, 91, one-factor, 666 Maximal matching, 664, 665, 668
95 perfect matching, 666 Maxterm, 717, 718, 727
Logically equivalent open statements, 92 system of distinct representatives, 663, Maybee, John S., 575
Logically equivalent statements, 56, 58, 668 Mazur, Barry, 706
61-64, 74, 91, 97, 98 The Mathematical Analysis of Logic, McAllister, David F., 333, 334, 378, 507,
Logically implies, 69, 92 Being an Essay towards a Calculus of 508
London Mathematical Society, 565 Deductive Reasoning, 118 McCluskey, Jr., Edward J., 742, 743
Long division of polynomials, 803 Mathematical axioms, 113 McCoy, Neal H., 707, 708, 831, 832
Mealy, George H., 333, 334 Minimization process algorithm, 372, Multiplicity of a root, 805
Mealy machine, 333 373 Multiplier, 689
Mean, 177, 180, 183 Minimum capacity, 647, 648, 652 Multiset, 518
Mean value, 177 Minimum cut, 648, 649, 652-654, 656 Murty, U. S. R., 573, 575, 668
Measure of central tendency, 177 Minimum distance between code words, Mutually disjoint events, 159
Measure of dispersion, 177 767-769, 771, 773, 774 Mutually disjoint sets, 137, 148
Member (of a set), 123 Minimum weight, 777 Mutually independent in pairs, 172
Membership, 123 Minimum weight of nonzero code words,
Membership tables, 143, 144, 146 774 N, 133
Mémoire sur les conditions de Minor, A-19 $\binom{n}{r}$, 15, 21, 41, 42, 436
résolubilité des équations par Minsky, Marvin, 333, 334 $\binom{-n}{r}$, n > 0, 422
radicaux, 794 Minterm, 716, 717, 726, 732, 738
Memory, 5 Mirsky, Leon, 668, 669 $\binom{n}{n_1, n_2, \ldots, n_t}$, 23
Mitchell, Margaret, 47, 48, 52 n!, 6, 215
Memory cell, 5, 225
Mixed strategy, 768 n!, Stirling’s approximation formula, 304
Memory location, 369, 378, 694
Mobius inversion formula, 412 n choose r, 15
Mendelson, Elliott, 119, 120
Mod, 234, 454, 689-695, 701, 702, 759, n factorial, 6
Merge algorithm, 608
760 n-butane, 584
Merge sort, 605-609, 641
Mod n, 702 n-cube, 532
Merge sort algorithm, 496, 608
Modular congruence, 694, 707 n-dimensional hypercube, 532
Merging process, 607
Modular exponentiation, 693, 694, 760 n-fold product, 248
Mesh graph, 532
Modulo relation, 337 (n, m) block code, 764; see also
Messages, 763, 769, 777, 778; see also Algebraic coding theory
Algebraic coding theory Modus ponens, 70, 73-75, 78, 108, 109
Modus tollens, 73, 74, 76, 78, 108, 109 n-tuple, 248
Methane, 792
Moment generating function, 443, 444 ν, 320
Method of affirming, 70, 71
Monary operation, 138, 267, 733 Nand (connective), 66
Method of contradiction, 114, 115
Monic polynomial, 807 NAND gate, 727, 728
Method of contraposition, 115
Monma, C. L., 562, 575, 576 Napier, John, A-6
Method of denying, 73
Monotone increasing function, 259, 494, Natural logarithm, 284, A-6
Method of exhaustion, 106
495, 500, 501, 503, 608, 609 Natural numbers, 133
Method of generating functions, 482-487 Natural position, 402
Montgomery, Hugh L., 243, 244, 444,
Method of infinite descent, 244
445, 708 Nazi cipher, 333
Method of proof, 193 Neal, David, 444
de Montmort, Pierre Remond, 411
Method of proof by contradiction, 137 Nearest neighbor, 771
Moon, John Wesley, 623, 625
Method of recursion, 211 Necessary and sufficient, 48
Moore, Edward Forrest, 333, 334, 378
Method of undetermined coefficients, Necessary condition, 48
Morash, Ronald P., 119, 120
471
Moser, L., 493, 507
Methods of proof, 125 Mountain ranges, 494 Negation of quantified statements, 92,
Methodus Differentialis, 303 96, 97
μX, 177
Metric, 767 Multigraph, 349, 516, 518, 533, 631 Negation (logic gate), 719
Metric space, 767 Negative, 138
Multinomial coefficient, 23
Microcontroller, 5 Negative integers, 227
Multinomial Theorem, 23, 106
Microsoft, Inc., 117, 156, 278 Multiple Nemhauser, G. L., 562, 575, 576
Miksa, F. L., 493 of an integer, 221 Nested multiplication method, 301
Millbanke, Annabella, 242 of a polynomial, 802 Network
Miller, George Abraham, 795 Multiple errors, 763 dual, 551-553
Min, 240 Multiple output network, 720 electric power, 666
Minimal covering of a graph, 577 Multiples of group elements, 748 electrical, 551, 552, 573, 574, 581, 622
Minimal disconnecting set of edges, 550 Multiplication of equivalence classes gating, 309, 719-722, 731
Minimal distinguishing string, 375 of integers (in Zn), 687 logic, 719, 720
Minimal dominating set, 577, 730 of polynomials, 809 multiple output, 720, 721
Minimal element (of a poset), 362 Multiplication of matrices, A-14 parallel, 64
Minimal machine, 373 Multiplication of polynomials, 800 PERT, 357, 377
Minimal number of states, 327 Multiplicative cancellation in Z, 221 Program Evaluation and Review
Minimal product of sums, 727, 742 Multiplicative identity for matrices, A-16 Technique, 357, 377
Minimal realization of a finite state Multiplicative identity for real numbers, series, 65
machine, 371, 372 103 switching, 64-66
Minimal spanning tree, 639, 667, 668 Multiplicative identity in a ring, 675 transport, 644-658
Minimal spanning tree algorithms, Multiplicative inverse (of a nonzero real Network interface, 12
639-643, 668 number), 103, 278 Network number, 12
Minimal sum of products, 721-725, Multiplicative inverse in a ring, 677, 681 Neumann, Peter M., 796, 797
729-733, 742 Multiplicative inverse for a matrix, A-16 Neutrons, 486
Minimal weight edge, 640 Multiplicative rule, 168, 172 New York Times, 707
Minimization process, 337, 371-376, Multiplicity of a characteristic root, 468 Newsom, Carroll V., 119, 120, 304, 305
378, 742 Multiplicity of an edge, 518 Newton, Sir Isaac, 303
Next state, 320 Officers, 819 Ore, Oystein, 561, 668, 669
Next state function, 320, 682 Ohm’s Law, 573 Organic compounds, 791-793
Nicomachus of Gerasa, 707 Ohm’s Law for electrical flow, 573 Origin (of an edge), 349, 514
Nievergelt, Jurg, 506, 508 ω (output function), 320 Orlin, James B., 562, 575, 638, 643, 654,
Nilsson, Nils J., 119, 120 ω(G), 578 668
Nine-times repetition code, 773; see also On the Theory of Groups, as Depending Orthogonal Latin squares, 816-818, 831
Algebraic coding theory on the Symbolic Equation θ^n = 1, 794 Out degree of a vertex, 535, 588, 644
Niven, Ivan, 243, 244, 444, 445, 708 One element of a Boolean algebra, 733 Outcome, 150, 151, 154, 155, 158, 175,
No degree, 800 1A, 279 177, 178
Nobel Prize, 187 One-dimensional array, 254 Outgoing degree of a vertex, 535
Node, 349, 514; see also Vertex, Vertices 1-equivalence, 338 Output (from an algorithm), 253, 289
Noether, Emmy, 706, 707 1-equivalent states (s1 E1 s2), 371 Output (for a finite state machine), 309,
Noise (in a binary symmetric channel), One factor, 666 319
761 One-terminal-pair-graph, 552 Output (from a gate), 720
Nonabelian group, 749 One-to-one correspondence, 279, 303, Output alphabet, 320, 321
Nonadjacent vertices, 561 370, 427, 428, 435, 526, 551, 660, Output function, 320, 682
Noncommutative operation, 590 A-23-A-27 Output string, 321, 322
Noncommutative ring, 675, 705 One-to-one function, 255-257, 279, 280, Overcounting, 19, 20, 411
Nonempty universe, 89 409, 410 Overflow error, 229
Nonequivalent configurations, 783, 784, One-unit delay machine, 329
790, 791 One’s complement, 227-229 P(G, λ), 566-568, 570
Nonequivalent seating arrangements, 784 Onto function, 260-265, 287, 288, 392, p(m,n), the number of partitions of m
Nonequivalent states, 374 411, 439, 682, 699, 739 into exactly n positive summands, 444
Non-Euclidean geometry, 820 Open contact, 551, 553 p(n), the number of partitions of n, 432,
Nonexecutable specification statement, Open interval, 99, 100, 134, 164 443
369 Open statement, 86, 87, 89-92, 105, 106, P(n,r), 7, 15, 41, 436
Nonhomogeneous recurrence relation, 109, 123, 126, 194, 195 p is sufficient for q, 48
450, 451, 456, 470-481 Open switch, 64, 551, 553 p logically implies q, 69
Nonlinear recurrence relations, 449 Open trail, 534 Pair of orthogonal Latin squares,
Nonnegative integers, 133 Open walk, 515, 516 816-819, 823
Nonplanar graph, 540, 541, 543 Operand, 136 Pairs of rabbits, 505
Nontaking kings, 510 Operation, 136 Pairwise disjoint subboards, 405, 408
Nontaking rooks, 404, 407 Operations research, 574, 631, 667 Pairwise incidence matrix, 826
Nontrivial subgroup, 748 Optimal prefix code, 613 Palindrome, 13, 174, 197, 319, 425, 426,
Nonzero complex numbers, 134 Optimal spanning tree, 638, 639, 642 431, 432, 460, 461, 469
Nonzero division, 221, 356 Optimal tree, 612, 613, 640-642 Palmer, Edgar M., 574, 576
Nonzero rational numbers, 133 Optimization, 41, 324, 562, 581, 631 Pan balance, 602, 603
Nonzero real numbers, 134 Or (connective), 48 Papadimitriou, Christos H., 333, 334
Nor (connective), 66 Or (exclusive), 48, 56 Parallel algorithm, 531
NOR gate, 727, 728 OR gate, 719-721 Parallel classes, 822-824, 828
Normal subgroup, 795, 831 Order, 6, 14, 15, 30, 125, 130 Parallel computer, 531
Not p, 48 Order for the vertices of a tree, 592, 593 Parallel lines, 822, 827, 828
Not ... and (connective), 66 Order g (or, Order of g), O(g), 290-292 Parallel network, 64
Not... or (connective), 66 Order at least, 293 Parent, 588, 593, 597, 613
Null child, 594, 595 Order for a Boolean algebra, 736 Parenthesize an expression, 38, 39, 490,
Null graph, 523 Order for functions, 290, 292, 293 494
Null set (∅), 127 Order in a tree, 588, 589 Parity checks, 778
Number of divisions, 458, 459 Order of a finite field, 812, 813 Parity-check code, 764, 765; see also
Number of positive divisors, 239 Order of a group, 746 Algebraic coding theory
Number theory, 29, 188, 222, 242-244, Order of a group element, 754 Parity-check equations, 770, 777, 778;
303, 304, 394, 411, 412, 432, 442, 673, Order of a linear recurrence relation, 456 see also Algebraic coding theory
705, 706 Order of quantifiers, 98 Parity-check matrix, 772, 774, 776-779;
Numerical analysis, 304 Order-preserving function, 366, 509 see also Algebraic coding theory
Ordered array, 501-503 Parker, Ernest Tilden, 819, 831
O(g) (order of g), 290 Ordered binary tree, 488 Partial breadth-first spanning tree, 656
O(g) on S, 498 Ordered pair, 152, 176, 248, 252, 253, Partial fraction decomposition, 426, 483,
Ω(g), 293 Ordered pair, 152, 176, 248, 252, 253, 485
Object program, 253, 302 Ordered rooted tree, 588 Partial function, 260
O’Bryant, Kevin, 623, 624 Ordered set, 129, A-25 Partial order, 337, 341-343, 356-364,
Octahedron, 548 Ordered sum, 205 376, 377, 476, 533, 737, 738; see also
Octal system (base 8), 225 Ordered tree, 594 Poset
od(v), 535 Ordered triples, 827 Partial order for a Boolean algebra,
od(z), 644 Orderly permutation, 455 736-738
Odd integer, 113, 218 Ordinary generating function, 436, 440, Partial ordering relation, 357; see also
Odd-degree vertices, 531 443, 444; see also Generating function Partial order, Poset
Partial semipath, 652 Polynomial in the indeterminate x, 799 Primary key, 272
Partially ordered set, 357, 377; see also Polynomial order, 293 Prime characteristics, 812
Poset Polynomial ring, 801, 830 Prime integer (or number), 116, 193, 221,
Particular solution, 471, 475, 479, 482 Polynomial time complexity, 293 222, 230, 237, 238
Partition, 366-375, 377, 378 Pop, 490-493 Prime factorization, 238, 240
Partitions of integers, 29, 31, 432-435, Poset, 357-364 Prime order (for a group), 758
443, 444 antichain, 381 Primitive statement, 48
Pascal, Blaise, 42, 188, 244 chain, 381 Primitive statement, 48
Pascal’s triangle, 133, 135, 188 glb (greatest lower bound), 363 Princeton University, 706
Patashnik, Oren, 304, 305, 506, 507 greatest element, 363 Principia Mathematica, 119, 187
Path (in a graph), 351, 516, 517, 556, 582 greatest lower bound (glb), 363 Principle of cross classification, 411
Path (staircase), 9, 36-38, 130, 132 Hasse diagram, 358-361 Principle of duality
Pattern, 124 lattice, 364 for a Boolean algebra, 735
Pattern inventory, 783, 789-793 least element, 363 for Boolean functions, 713
Pawlak, Zdzislaw, 623 least upper bound (lub), 363 for Boolean variables, 713
Peacock, George, 186 length of a chain, 381 for logic, 59
Peano, Giuseppe, 188, 243, 377 lower bound, 363 for set theory, 141
Peano’s postulates, 243 lub (least upper bound), 363 Principle of inclusion and exclusion, 261,
Pegs, 472, 473 maximal chain, 381 385, 389-397, 402, 407, 411, 412,
Peile, Robert E., 796 maximal element, 362 415, 659
Peirce, Charles Sanders, 119, 377 minimal element, 362 Principle of Mathematical Induction,
Pendant vertex, 533, 549, 583, 584 order-preserving function, 366 194-196, 198, 200-206, 213-216,
Pennies, 462, 495 topological sorting algorithm, 360, 218, 244, 315, 317, 390, 425, 441, 448,
Pentium processor, 5 361, 363 468, 469, 599; see also Mathematical
Perfect, H., 668, 669 total order, 359-361 induction, The alternative form of the
Perfect integer, 241 upper bound, 363 Principle of Mathematical Induction
Perfect matching, 666 Positive closure of a language, 315 Principle of Strong Mathematical
Perfect square, 90, 239 Positive integers, 133, 136, 193 Induction, 206
Perl, 4 Positive rational numbers, 133 Private-key cryptosystem, 693, 759, 760
Permutation, 6-8, 14, 15, 41, 42, 217, Positive real numbers, 134 Probability, 3, 42, 123, 150-189, 247,
220, 393, 394, 403, 408, 411, 436, 452, Postorder (traversal), 592-595, 623, 628 262, 402, 408, 409, 411, 428, 430, 468,
453, 490-492, 495, 506; see also Postulates, 87, 98, 243 506, 759-765
Arrangement Power series, 417, 418, 433, 443, 484 Additive rule, 162, 168, 172
Permutation group, 749, 750, 781, 782, Power set, 128, 476, 533 Axioms of probability, 159, 161
830; see also Group theory PowerBall, 15 Bayes’ Theorem, 170, 173, 188
Permutation matrix, 670 Powers of Bernoulli trial, 161, 178, 179
PERT network, 357, 377 an alphabet, 310 Binomial probability distribution, 179
Petersen, Julius Peter Christian, 574 a function, 282 Chebyshev’s Inequality, 183, 184, 188
Petersen graph, 543, 566, 574 a group element, 747 conditional probability, 166-173
Peterson, Gerald R., 742, 743 a language, 315 continuous sample space, 164
∅ (the null set), 127 a real number, A-2 discrete sample space, 164, 175
φ(n) (Euler’s phi function), 394, 395, a relation, 345 E(X), 177, 182, 183
689, 692, 747, 759, 760 a ring element, 802 elementary event, 158
Pi notation, 239 Σ, 310 event, 151, 158, 159
Π notation, 239 strings, 312 expectation, 177
Pigeonhole principle, 273-278, 287, 288, Pr(B|A), 167 expected value, 177, 179, 180
303-305, 327, 328, 796 Precedence graph, 350 experiment, 150-153
Plaintext, 690-692, 760 Precedes, 347, 348 independent events, 155, 158, 161, 170
Planar graph, 540-553, 573 Precise instructions, 233 independent outcome, 154, 166, 170,
Planar-one-terminal-pair-graph, 552 Pred (predecessor) function, 307 174
Planarity of graphs, 352, 615; see also Prefix, 312, 313, 315, 338 Kolmogorov, Andrei, 159
Planar graph Prefix codes, 609, 611, 613, 614, 624 Law of total probability, 169, 170, 173
Platonic solids, 547, 548, 556 Prefix notation, 591 mean, 177, 180, 183
Pless, Vera, 796, 797 Pregel River, 533 μX, 177
Points at infinity, 828 Preimage (of an element), 253 Multiplicative Rule, 168, 172
Polaris submarine, 357 Preimage (of a set), 285, 286 mutually disjoint events, 159
Polish notation, 591, 592 Premise, 53, 67, 70, 107, 109-111 outcome, 180, 181, 183
Polya, George, 623, 625, 745, 796, 797 Preorder (traversal), 592-597, 616, 619, random variable, 175-184
Polya’s Method of Enumeration, 623, 620, 623 Rule of complement, 159, 172
779, 789, 891 Prescribed order, 597, 599, 620, 653, 655 sample space, 150-155
Polya’s theory in graphical enumeration, Preservation of Boolean algebra σX, 180
574 operations, 739 standard deviation, 180, 182, 183
Polyhedra, 573 Preservation of ring operations, 698 variance, 177, 180
Polynomial equation, 794 Prim, Robert Clay, 638, 641, 667, 669 Probability distribution, 176, 177, 179,
Polynomial evaluation algorithm, 301 Prim’s algorithm, 641-643, 653 180, 428, 430
Probability rules and laws, 172 Push, 490-493 Rearrangement, 749
Problem of the 36 officers, 819, 831 Puzo, Mario, 186, 692 Reasoning system, 86
Le Problème des rencontres, 411 Rebman, Kenneth R., 304, 305
Procedure for the Euclidean algorithm, Q, Q+, Q*, 133 Received word, 762, 763, 777; see also
234 Qn, 532, 542, 667 Algebraic coding theory
Procedure for integer division, 224 q is necessary for p, 48 Record, 694
Proceedings of the Royal Geographical Quadratic equation, 794 Recurrence relations, 447-510
Society, 565 Quadratic order, 293 analysis of algorithms, 473
Product of Boolean functions, 712 Quadratic time complexity, 299 associated homogeneous relation,
Product of disjoint cycles, 780, 781, 786 Quantified open statement, 87 471-473, 479, 480
Product of matrices, A-14 Quantifiers, 91, 98, 103-105, 119, 125, boundary conditions, 448
Product of maxterms, 717, 718 146, 195, 291 characteristic equation, 456
Product of sets, 248 bound variable, 88 characteristic roots, 456, 468
Program Evaluation and Review connectives, 88, 89 constant coefficients, 448
Technique (PERT) network, 357, 377 existential quantifier, 87, 88 Fibonacci relation, 457
Program verification, 203 ∀x, 88 first-order linear relation, 448
Projection, 270-272 free variable, 88 general solution, 456, 468, 471
Projective plane, 827, 828 implicit, 89, 90 geometric progression, 447
Proof, 10, 47, 84, 103, 104, 119; see also ∃x, 88 homogeneous relation, 448, 450, 456
Rules of inference, Mathematical universal quantifier, 87, 88 initial condition, 448, 456
induction, Mathematical Quantify, 87 linear relation, 449
induction—alternative form Quantum theory, A-11 linearly independent solutions, 456,
Proof by contradiction, 76, 77, 80, 84, Quartic equation, 794 464
99, 114, 115, 127, 137, 237, 273, 291 Quasi-path, 650 Maple code, 477
Proper coloring of a graph, 565-568, 570 Quaternary alphabet, 474 method of generating functions,
Proper divisors of 0, 221 Quaternary relation, 271 482-487
Proper divisors of zero, 675, 677, 689, Quaternary sequence, 247 method of undetermined coefficients,
801 Queue, 598, 599 471
Proper prefix, 312, 315 Quick sort, 609 nonhomogeneous relation, 471, 472
Proper subgroup, 748 Quine, Willard Van Orman, 742, 743 nonlinear relation, 487-493
Proper subset, 124-126 Quine-McCluskey method, 727, 742 particular solution, 471, 475, 479, 482,
Proper substring, 313 Quintic equation, 794, 830 487-493
Quintilianus, Marcus Fabius, 705 second-order linear relations, 456-468
Proper suffix, 312
Quotient, 221, 223, 224 system of recurrence relations, 486,
Properties of a Boolean Algebra, 735,
Quotient group, 831 487
736
Table of particular solutions for the
Properties of a group, 747
r(C, x), 404-406, 408 method of undetermined
Properties of exponents, A-3
rk(C), 405, 406 coefficients, 479
Properties of the integers, 193-246
R, R+, R*, 133, 134 variable coefficients, 452
Division algorithm, 223
R[x], 799 Recurrent event, 506
Euclidean algorithm, 232, 233
Rc (converse of relation R), 282 Recursion, 211
Fundamental Theorem of Arithmetic,
Rabbits, 505 Recursive algorithm, 453
238
Radicals, 830 Recursive algorithm for the Fibonacci
greatest common divisor, 231, 232
Ralston, Anthony, 537, 575, 576 numbers, 477, 478
least common multiple, 236
Ramsey, Frank Plumpton, 305 Recursive construction, 129, 447, 532,
mathematical induction, 193-208
Ramsey theory, 305 620
primes, 193, 221, 222, 230, 237, 238 Recursive definition, 210-218, 251, 255,
Random variable, 175-184, 209, 296,
Well-Ordering Principle, 194 428, 430 282, 312, 317, 447, 594, A-1, A-26
Properties of logarithms, A-6 Random walk, 506 Recursive function, 259, 453
Proposition, 47; see also Statement
Randomly generated numbers, 689 Recursive method, 454
Propositional calculus, 735 Range, 175, 253, 392, 393 Recursive procedure, 453, 500, 606, 608
Prüfer code, 586, 587 Rank, 559, 819 Recursive process, 211-213, 218, 316
Prune (a tree), 596 Rate of a code, 764, 778; see also Recursively defined set, 218, 251, 316
Pruned tree, 611 Algebraic coding theory Reddy, M. R., 562, 575
Pseudocode procedure for Ratio test, 429 Redei, L., 559
binary search, 502 Rational number exponent, A-2 Redfield, J. Howard, 796, 797
bubble sort, 450 Rational numbers, 133, 194, A-30 Reducible polynomial, 807
Euclidean algorithm, 234 Reachability, 338 Reductio ad Absurdum, 76, 127
exponentiation, 297 Reachable state, 330 Redundant state, 371, 373
Fibonacci numbers, 477-479 Reactor, 486 Reed, M. B., 574, 576
gcd (recursive), 455 Read, R. C., 566, 571, 574, 576, 796, 797 Refinement (of a partition), 373
linear search, 296 Real numbers, 133, 139, 194, A-27, Reflections, 750, 781, 782, 788
modular exponentiation, 693 A-28, A-30 Reflexive property (of a relation),
Pseudorandom numbers, 689 Real-valued function, 99 337-343, 347, 348, 353, 366-369,
Public-key cryptosystem, 759, 760 Rear (of a list), 598, 599 376, 377, 782, 808
Regiments, 819 Reverse order, 620 Roman, Steven, 769, 796, 797, 831, 832
Region, 544 Ribet, Kenneth, 706 Rook, 404, 407
Regular graph, 531 Right branch, 488 Rook polynomial, 404-406, 408, 410,
Reinelt, G., 562, 576 Right-cancellation property (in a group), 412, 416, 510, 659
Reingold, Edward Martin, 506, 508 747 Root extraction, 794
Relation, 211, 247, 250-256, 271, 282, Right child, 590, 594, 610, 611 Root of a binary ordered tree, 488
303, 337-378, 513, 737, 780-783 Right children, 594, 595 Root of a polynomial, 802, 804-806
antisymmetric relation, 340, 341, 347, Right coset, 757 Root of a tree, 587-590
348, 353, 357, 358, 376, 377 Right subtree, 592, 594-596 Root of multiplicity 2, 805
associative law of composition, 345 Rigid motions Rooted binary tree, 488, 594
binary relation, 250, 337 of a cube, 791 Rooted Fibonacci tree, 626
composite relation, 344 of an equilateral triangle, 749, 750 Rooted ordered binary tree, 488, 489,
converse of a relation, 282 of a regular hexagon, 788 506, 596
definition of a relation, 250 of a regular tetrahedron, 792, 793 Rooted tree, 587-596, 600, 601
divides relation, 339, 737 of a square, 750, 780 Rorres, Chris, A-21
equivalence relation, 337, 342, 343, Rinaldi, G., 562, 576 Rosen, Kenneth H., 42, 244, 704, 708
353, 366-378 Ring, 673, 674 Ross, Kenneth A., 119, 120
equivalent states, 338, 371 Ring homomorphism, 697-700, 706 Rota, Gian Carlo, 412, 444, 445
first level of reachability, 338 Ring isomorphism, 697-704 Rotating drum, 536
irreflexive relation, 344 Ring of matrices, 674, 705 Rotations, 10, 749, 781, 782, 788, 791,
k-equivalent states, 338, 371, 372 Ring of polynomials, 799, 801 792
modulo n relation, 337 Ring theory, 673-709 Rothman, Tony, 831, 832
1-equivalence, 338 Boolean ring, 709 Rothschild, Bruce L., 305
partial order, 337, 341-343 cancellation laws of addition, 680 Roulette, 163, 164
partial ordering relation, 341, 357 cancellation law of multiplication, Round-robin tournament, 559
poset, 357-364 678, 681 Rouvray, Dennis H., 574, 576
powers of a relation, 345 center of a ring, 709 Row major implementation, 254
reachability, 338 characteristic, 812 Row matrix, A-11
reflexive relation, 337-343, 366-369 commutative ring, 675 Row number, 716
relation composition, 344 congruence modulo n, 689, 690 Row vector, A-11
relation matrix, 346-349 definition, 673, 674 Royal flush, 152
second level of reachability, 338 field, 677, 678, 681, 682 RSA Cryptosystem, 759-761, 795
subset relation, 250, 340, 358, 359 group of units, 747 Ruin problems, 506
symmetric relation, 339-343 homomorphism, 697-700, 706 Rule for Proof by Cases, 78
transitive relation, 339-343 ideal, 684, 700, 706 Rule of Complement, 159, 172
zero-one matrix, 344, 347 idempotent element, 697 Rule of Conditional Proof, 78
Relation composition, 344 integers modulo n, 686-696 Rule of Conjunction, 75, 78
Relation matrix, 346-349 integral domain, 677, 678, 681, 682 Rule of Conjunctive Simplification, 78,
Relational data base, 271, 272, 305 isomorphic rings, 698, 699, 704 94, 137
Relative complement, 138 isomorphism, 698 Rule of the Constructive Dilemma, 78
Relative frequency, 158, 159 kernel of a homomorphism, 704 Rule of Contradiction, 76, 78
Relatively prime integers, 232-234, 236, matrix rings, 674, 705 Rule of the Destructive Dilemma, 78
240, 394, 470 multiplicative identity, 675 Rule of Detachment, 70, 71, 78, 108
Relatively prime polynomials, 808 multiplicative inverse, 677 Rule of Disjunctive Amplification, 78,
Relativity theory, 707 proper divisors of zero, 675 137
Remainder, 223-226, 234, 274, 276, 686, ring of matrices, 674, 705 Rule of Disjunctive Syllogism, 75, 78
689, 693, 804, 805, 810 ring of polynomials, 799, 801 Rule of Existential Generalization, 117
Remainder Theorem, 804, 805 ring properties, 679-684 Rule of Existential Specification, 117
Rencontre, 403 ring with unity, 675 Rule of Product, 4-7, 11, 14-19, 28, 29,
Rényi, A., 587 subring, 682-684 34, 125, 142, 197, 239, 248, 255, 256,
Repeated real roots, 467 subtraction, 680 261, 274, 339, 341, 342, 403, 567
Repetition, 7, 26, 27, 41, 125, 149 unit, 677 Rule of Sum, 3-5, 16, 19, 34, 125, 132,
Replacement, 92, 96, 106, 114, 115, 124 unity, 675 148, 262, 264, 274
Replication number, 825 Z_n, 686 Rule of Universal Generalization,
Representation Theorem for a finite Ring with unity, 675, 801 110-114, 126
Boolean algebra, 738-741, 743 Ringel, Gerhard, 573, 576 Rule of Universal Specification,
Resek, Diane, 119, 120 Rings of Saturn, 42 106-113, 126
Reset, 321 Rinnooy Kan, A. H. G., 562, 575, 576 Rules of Inference, 70-78, 83, 84, 86,
Residue arithmetic, 704 Riordan, John, 412, 444, 445 107-109, 112, 113, 117, 119
Resolution, 86, 119 Rise/fall permutation, 495, 496 Law of the syllogism, 72, 73, 78
Resolvent, 86 Rivest, Ronald L., 504, 507, 624, 625, Modus ponens, 70, 73, 75, 78
Restriction of a function, 257 638, 643, 654, 668, 759 Modus tollens, 73, 76, 78
Retransmission, 765, 769 Roberts, Fred S., 42, 574, 576 Proof by (the method of)
Reversal function, 318 Robertson, N., 575, 576 contradiction, 76, 80, 84
Reversal of a string, 317, 319 Robinson, J. A., 86
Rule for proof by cases, 78 Self-complementary graph, 529, 576 Seven Bridges of Königsberg, 378, 513,
Rule of conditional proof, 78 Self-dual, 735 518, 533-535, 573
Rule of conjunction, 75, 78 Self-dual Boolean function, 744 Seyffarth, Karen, 412
Rule of conjunctive simplification, 78 Self-orthogonal Latin square, 820 Seymour, P. D., 575, 576
Rule of detachment, 70, 78 Semicircles, 40 Shamir, Adi, 759
Rule of disjunctive amplification, 78 Semipath, 650-653 Shannon, Claude Elwood, 741-743, 761,
Rule of disjunctive syllogism, 75, 78 Sentences, 47, 48, 310 795, 797
Rule of the constructive dilemma, 78 Separation, 645 Sherbert, Donald R., 506, 508
Rule of the destructive dilemma, 78 Separation property, 646 Shier, Douglas R., 623, 625, 668, 669
Rule of universal generalization, Sequence, 255 Shift, 690
110-114, 126 Sequence of pseudorandom numbers, Shift cipher, 691
Rule of universal specification, 689, 690 Shi-kie, Chu, 188
106-113, 126 Sequence recognizer, 326, 327, 332 Shimura, Goro, 706
Table of rules of inference, 78 Sequential circuit, 309; see also Finite Shmoys, D. B., 562, 575, 576
Rules for negating quantified statements, state machine Shortest-Path algorithm, 667
96 Serial binary adder, 323, 324 Shrikhande, S. S., 819, 831
Run, 33, 34, 192, 482 Series network, 65 Shushu jiuzhang, 707
Running time, 452 Seshu, S., 574, 576 Siblings, 588, 593, 612
Russell, Lord Bertrand Arthur William, Set braces, 123, 124 Sichuan, 707
119, 135, 187 Set equality, 125, 126, 252, 314 Sieve method, 411
Russell’s paradox, 135, 186, 187 Set of all possible outcomes, 150 σ_X (standard deviation), 180, 182, 183
Rydell High School, 16 Set of indices, 145 σ_X^2 (variance), 180
Ryser, Herbert John, 42, 412, 668, 669, Set theory, 87, 119, 123-129, 133-155, Σ^0, Σ, Σ^n, 309
831, 832 211, 247, 303, 304, 309, 311, 313 Σ^+, Σ^*, 310
cardinality, 124, 186, A-23 Sigma notation, 17, 239
s(m, n), 267 complement of a set, 138 Σ notation, 17
S_3, S_4, 750 countable set, 164, 303, A-24-A-32 Signals, 438, 439, 761
S_n, 787, 789, 794, 830 denumerable set, 303, A-24 Signed numbers, 681, 705
S(m, n), 263 disjoint sets, 137, 148 Silvestri, Richard, 795, 797
S(x, k), 767 element, 123 Simple group, 795
Saaty, Thomas L., 668 element argument, 126, 137, 140, 144 Simultaneous solution of a system of
Sahni, Sartaj, 641, 642, 668, 669 empty set (∅), 127 congruences, 702
Same cardinality (for sets), A-23 equality of sets, 125, 126 Singleton subset, 128
Same likelihood, 150, 158 finite set, 124, 125, 186, A-23, A-24 Sink
Same size (for sets), A-23 generalized intersection of sets, 146 in a finite state machine, 331
Sample space, 150-155, 157-159, generalized union of sets, 146 in a transport network, 631
161-164, 166-172, 175-181, 183, infinite set, 124, 186, 189, A-23-A-26, Sink state, 331
262, 296, 402, 409, 428 A-28, A-30 Size of a set, 124, A-23
Samuel, Pierre, 707, 708, 831, 832 intersection of sets, 136, 138 Sloane, Neil James Alexander, 795, 797
Sanders, D. P., 575, 576 intuitive definition of a set, 123 Smallest element, 194
Sandler, R., 831 laws of set theory, 139, 144 Smith, Henry John Stephen, 243
Saturated edge, 645, 649, 650 member, 123 Snowflake curve, 475
Saturated hydrocarbons, 573, 581, 584 membership, 126 Software development, 203
Saturn, 42 membership table, 143, 144, 146 Soifer, Alexander, 304, 305
Scalar product, A-12-A-14 mutually disjoint sets, 137, 148 Solow, Daniel, 119, 120
Scattering function, 694, 708 null set (∅), 127 Solution of polynomial equations, 830
Schedule, 815, 825 power set, 128 Solvable group, 830
Scholtz, Robert A., 796 Principle of Duality, 141 Sorting, 506, 581, 605, 606, 608
Schröder, Ernst, 119, 377 proper subset, 124-126 Sorting technique, 450
Schröder numbers, 495 relative complement, 138 Source in a network, 631
Schwenk, Allen J., 628 set braces, 123, 124 Source of a directed edge, 349, 514
Scientific American, 575 set of indices, 145 Source program, 253, 302
Searching (algorithm), 501 singleton subset, 128 Space (blank), 310
Searching process, 295, 501 size of a set, 124, A-23 Space complexity function, 290
Second level of reachability, 338 subset, 124-128, 130-132, 138, 140, Spanning forest, 582
Second-order homogeneous recurrence 141, 149 Spanning subgraph, 521, 582, 640
relations, 456-468 superset, 138 Spanning tree, 582, 596, 597, 599, 631,
Security, 693 symmetric difference, 136 638, 640
Seed, 689 uncountable set, 303, A-28 Specification statement, 369
Sefer Yetzirah (The Book of Creation), 41 union of sets, 136, 138 Spencer, Joel H., 305
Selection, 14-16, 19-22, 26; see also universe, 123-128 Sphere S(x, k), 767; see also Algebraic
Combination universe of discourse, 123-128 coding theory
Selection structure, 51 Venn diagram, 141-144, 146, 148, 155 Spine (of a caterpillar), 627, 628
Selection with repetition, 26-29, 32, 415, well-ordered set, 194 Split input (for a gating network), 720
423, 485 Set theory of strings, 309 Spokes (in the wheel graph), 519, 520
Square matrix, A-11 length of a string, 18, 310-312 Sun (R) Microsystems, Inc., 5
Square of a graph, 626 palindrome, 319 Superimposed, 815
Stabilizer, 785 powers of strings, 312 Superset, 138
Stack, 490-492, 507, 605 prefix, 312, 313, 315 Suppes, Patrick C., 189
Staircase paths, 9, 130, 132 proper prefix, 312, 315 Surjective function, 260
Stanat, Donald F., 333, 334, 378, 507, proper substring, 313 Switches (in a network), 64-66, 551
508 proper suffix, 312 Switches in series, 65
Standard deviation, 180, 182, 183 reversal, 317, 319 Switching circuits, 742
Standard form of a Latin square, 816, 817 substring, 313, 315, 328, 338 Switching function, 711, 712, 719, 742
Stanley, Richard Peter, 444, 507, 508 suffix, 312, 313, 315 Switching network, 64-66
Star of David, 475 Strongly connected component, 352 Sylow, Ludwig, 795
Starting state, 320, 324, 329 Strongly connected directed graph, 351, Sylvester, James Joseph, 411, A-11
State diagram, 321, 324, 327 539 Symbolic logic, 118
State table, 321, 322, 324, 331, 372-375 Strongly connected machine, 331 Symbolic Logic, 188
State transition, 320, 321 Structured programming, 203 Symmetric Boolean function, 744
Statement, 47-121, 127, 140, 311 Subboard, 404, 405, 408, 409 Symmetric difference, 136, 313
compound statement, 48, 49 Subfield, 809, 811, 812 Symmetric group (S_n), 787, 789, 794,
contradiction, 53, 58 Subgraph, 521, 523, 525, 582, 588 830
contrapositive, 62, 63, 92-94 Subgraph induced by a set of vertices, Symmetric property (of a relation),
converse, 62, 63, 82, 99 522 339-343, 347, 348, 353, 366-369,
definitions, 52, 87, 98, 103-105, 113 Subgroup, 748, 749, 756-758 376, 377
dual (of a) statement, 59, 62 Subgroup generated by a group element, Syndrome, 771, 775-777, 779; see also
if-then decision structure, 51 754 Algebraic coding theory
if-then-else decision structure, 51 Sublist, 450, 606, 607 Syndrome decoding, 779
inverse, 62, 63, 82, 92-94, 99 Submachine, 331, 682 System of congruences, 702, 707, 708
logical equivalence, 56-61, 83, 95 Subring, 682-684, 699, 702 System of distinct representatives, 663,
logical implication, 69-73, 75, 89, 91, Subsections of strings, 312 668
95 Subsequence, A-26 System of linear equations, A-18, A-19
logically equivalent statements, 56, 58, Subset, 124-128, 130-132, 138, 140, System of recurrence relations, 486, 487
61-64, 74, 91, 97, 98 141, 149 Systematic form, 778; see also Algebraic
logically implies, 69, 92 Subset relation, 250, 358, 359, 362, 363, coding theory
negation, 48 737 Szekeres, George, 276
negation of quantified statements, 92, Subsets with no consecutive integers, 457
96, 97 Substitution rules (in logic), 60-62, 69, t_n (the nth triangular number), 198
open statement, 86, 87, 89-92, 105, 71, 72, 76, 80 T_0 (tautology), 53
106, 109, 123, 126, 194, 195 Substring, 313, 315, 328, 338 Θ, 294
primitive statement, 48 Subtraction, 137, 224, 225, 227, 228, 356 Table for a relational data base, 271
quantified statement, 87 Subtraction (in a ring), 680 Table for decoding, 774-776; see also
tautology, 53, 58, 59, 69 Subtree, 488, 583, 588, 590, 593-596, Algebraic coding theory
theorem, 105, 106, 110, 112, 119 602 Table of Big-Oh forms, 293
truth tables, 49 Succ (successor) function, 307 Table of identities for generating
Statistics, 3, 33, 175, 188, 815 Success, 161, 178, 179, 182 functions, 424
Stein, Clifford, 504, 507, 624, 625, 638, Successor, 243, 307 Table of particular solutions for the
643, 654, 668 Such that, 124 method of undetermined coefficients,
Steiner triple system, 829 Sufficient condition, 48 479
Steinhaus, Hugo Dyonizy, 506, 508 Suffix, 312, 313, 315, 338 Table of rules for negating statements
Stern, R. G., 562, 575, 576 Suffix function, 318 with one quantifier, 96
Stifel, Michel, 42 Sum of atoms, 738 Table of rules of inference, 78
Stillwell, John, 794, 797, 831, 832 Sum of bits, 720, 721 Table of Stirling numbers of the second
Stinson, Douglas R., 831 Sum of Boolean functions, 712 kind, 264
Stirling, James, 303 Sum of a geometric series, 476 Tabular form, 71
Stirling numbers of the first kind, 267 Sum of matrices, A-12 Tabulation algorithm, 742
Stirling numbers of the second kind, 29, Sum of minterms, 717 Tallahassee, 17
260, 263-265, 303, 304, 370, 508, 587 Sum of squares, 200 Taniyama, Yutaka, 706
Stirling’s formula, 304 Sum of the weights of the edges, 631 Tarry, G., 819
Stoll, Robert R., 119, 120 Summation, 17, 18 Tartaglia, Niccolo, 188
Storage circuits, 5 index, 17 Taubes, G., 795, 797
Strang, Gilbert, A-21 lower limit, 17 Tautology, 53, 58-61, 67, 69, 71, 76, 113
Street, Anne Penfold, 796, 797, 831, 832 upper limit, 17 Taylor, Richard, 706
String, 12, 18, 19, 128, 129, 309-323, Summation formulas, 32, 33, 35, 47, 196, Telephone communication system, 320
328, 337, 338, 609, 610, 761 197, 200, 259, 430, 441, 470 Terminal (in a switching network), 64
concatenation, 311, 312 Summation notation, 17, 18 Terminal vertex, 588
empty string, 310, 323 Summation operator, 440, 441 Terminals, 552
equality of strings, 311 Summations, 292 Terminating vertex, 349, 514
λ (the empty string), 310, 323 Sumo wrestlers, 277 Terminus, 349, 514
Ternary operation, 306 backtrack, 653, 656 complement of a subgraph, 586
Ternary strings, 469 backward edge, 650, 651, 654 complete binary tree, 589, 595, 596,
Tetrahedron, 547, 548, 792 capacity, 644 600, 605
Thatcher, Margaret, 74 capacity for a vertex, 657 complete binary tree for a set of
Theorem, 53, 67, 70, 84, 87, 98, 99, 105, capacity of a cut, 646, 665 weights, 612
106, 110, 112, 113, 117, 119, 193, 222 capacity of an edge, 644, 645, 650, 654 complete m-ary tree, 600-602
Théorie Analytique des Probabilités, 443 c(P, P̄), 646, 648, 652, 654 complete ternary tree, 603
Theory of equations, 411 chain, 650 decision tree, 602, 603
Theory of graphs, see Graph theory conservation condition, 645, 651 definition, 581
Theory of groups, see Group theory cut, 645-648, 652, 661, 662 depth-first search, 597, 598, 600, 617,
Theory of languages, 18, 332, 337 definition, 644 624
Theory of matrices, 411 Edmonds-Karp algorithm, 653-657 depth-first search algorithm, 597, 598,
Theory of numbers, see Number theory f-augmenting path, 650-654, 656, 663 617
Theory of rings, see Ring theory flow in a network, 644-654 depth-first spanning tree, 615-620
Theory of sets, see Set theory Ford-Fulkerson algorithm, 654-657 descendants, 588, 616-619
Theory of types, 187 forward edge, 650, 651, 654, 655 dfi(v), 616, 619-621
∃x, 88 Max-Flow Min-Cut Theorem, 649, dictionary order, 589
There exists an x such that, 88 652 directed tree, 587
Therefore (∴), 71 maximal flow, 645, 647 Fibonacci tree, 626
Third Reich, 333 network, 644 forest, 581, 639, 641, 642
Third-order linear homogeneous quasi-path, 650 full binary tree, 611
recurrence relations with constant saturated edge, 645, 649, 650 full m-ary tree, 614
coefficients, 463, 464 semipath, 650-653 graceful tree, 627
Thomas, R., 575, 576 sink, 644-646, 648, 653 grandparent, 593
Thompson, John, 795 source, 644-646, 648, 653 height, 601-603, 611
Tile, 470 tolerance, 650 Huffman tree, 613, 614
Tiling, 464 unsaturated edge, 645 inorder (traversal), 594, 595
Time complexity function, 290, 297-299, usable edge, 653, 655, 656 internal vertices, 588, 591, 593, 601,
450, 452, 496, 498, 500, 501, val( f ), 645-648 612
605-609, 624; see also Computational value of a flow, 645-649, 651-653 Kruskal’s algorithm, 639-641
complexity Transpose of a matrix, 348 labeled complete binary tree, 610
Time complexity function for the bubble Transposition of a Ferrers graph, 435 labeled tree, 586
sort, 450-452 Trappe, Wade, 693, 708, 795, 797 leaf, 588
Time complexity function for the merge Traveling Salesman Problem, 562, 574 left child, 590, 594, 610, 611
sort, 607-609 Treatise on Algebra, 186 left subtree, 590, 592, 594-596
Top (of a stack), 490 Tree, 250, 488, 489, 573, 581-629, 641, level, 588, 589, 593, 597, 607, 611
Top-down approach, 41 642, 653, 655, 656, 796; see also level number, 588, 601, 602, 612
Topological sorting, 359, 377 Graph theory lexicographic order, 589
Topological sorting algorithm, 360, 361, algorithm for articulation points, 619, m-ary tree, 600
363 620 merge sort algorithm, 496, 608
Total order, 359-361, 377 algorithm for constructing a Huffman minimal spanning tree, 639, 667, 668
Totally ordered poset, 359 tree, 613 null child, 594, 595
Tournament, 559, 602 algorithm for counting labeled trees, optimal spanning tree, 638, 639, 642
Towers of Hanoi, 472, 505 586, 587 optimal tree, 612, 613, 640-642
Tolerance, 650 algorithm for the universal address order for the vertices of a tree, 588,
Trail, 516, 517, 528 system, 589 589, 592-595
Transfer sequence, 331 ancestors, 588, 616-619 ordered binary tree, 488
Transfinite cardinal number, 303 articulation point, 615-621, 624 ordered rooted tree, 588
Transform, 253 back edge, 616-619, 621 parent, 588, 593, 597, 613, 619-621
Transformation, 36, 37 backtrack(ing), 593, 596-598, 600, pendant vertex, 583, 584
Transient state, 330 616 postorder (traversal), 592-595
Transition, 320 balanced tree, 601, 602 prefix code, 609, 611, 613, 614, 624
Transition sequence, 331 biconnected component, 615, preorder (traversal), 592-596
Transition state, 321 619-621, 624 Prim’s algorithm, 641-643, 653
Transition table, 321 binary rooted tree, 589, 590, 594, 595 quick sort, 609
Transitive property (of a relation), binary tree, 488, 595, 600 right child, 590, 594, 610, 611
339-343, 347, 348, 353, 357, 358, branch nodes, 588 right subtree, 592, 594-596, 614
366-368, 376, 377 branches, 488, 614 root, 587-590
Transmission errors, 762, 767 breadth-first search, 598-600 rooted Fibonacci tree, 626
Transmission of digital signals, 188 breadth-first search algorithm, 598, rooted tree, 587-596, 600, 601
Transmitter, 767, 769 599 sibling, 588, 593, 612
Transport network, 324, 644-658, breadth-first spanning tree, 599 sorting, 581, 605, 606, 608
660-663, 665, 667, 668 caterpillar, 627, 628 spanning forest, 582
a-z cut, 645 characteristic sequence, 625 spanning tree, 582, 596, 597, 599, 631,
associated undirected graph, 645, 650 child, 588, 590, 594, 598, 617-620 638, 640
spine (of a caterpillar), 627, 628 Uniqueness of inverses Vertices of a graph, 349, 514
subtrees, 583, 588, 590, 593-596, 602 for a group, 747 adjacent vertices, 349, 514
terminal vertex, 588 for a ring, 680, 681 isolated vertex, 349, 514
universal address system, 589 Unit circle, A-28 origin, 349, 514
W(T), 612 Unit delay machine, see One-unit delay source, 349, 514
weight of a tree, 612 machine terminating vertex, 349, 514
weights for an optimal tree, 612 Unit in a ring, 677, 681, 689, 700 terminus, 349, 514
Tree diagram, 154, 157, 248-250, 331, Unit-interval graph, 520 Video-display terminal, 155
488 Unity in a ring, 675, 681, 700 Von Dyck, Walther Franz Anton, 794
Tree traversal, 594 Unity of a Boolean algebra, 733, 739 Von Ettinghausen, Andreas, 42
Tremblay, Jean-Paul, 704, 708 Universal address system, 589 Von Koch, Helge, 475
Trend, 33 Universal generalization, 110, 111 Von Neumann, John, 689
Trial, 179 Universal quantifier, 87, 88, 90, 96, 98, Von Staudt, Karl, 622
Triangle inequality, 767 124 Vorlesungen über die Algebra der Logik,
Triangular number, 193, 198, 482, 572 Universally quantified statement, 107, 119
Triangulation (of a convex polygon), 494 108, 110-112
Trigonometric series, 303 Universal set, 523 W_n, 520, 572
Triple, 248 Universal specification, 106, 111 W(T), 612
Triple repetition code, 765, 768, 769; see Universe, 87, 90-92, 106, 123-128, 138, Wakerly, John F., 742, 743
also Algebraic coding theory 139, 149, 161 Walk, 515, 516
Triple system, 829 Universe of discourse, 87, 124 Walker, Elbert A., 707, 708
Trivial subgroup, 748 Unsaturated edge, 645 Wallis, W. D., 796, 797, 831, 832
Trivial walk, 515 Unspecified outputs, 731 Walser, Hans, 506, 508
Trotter, H. F., 506, 508 Upper bound, 363 Wand, Mitchell, 244
Trunc (truncation) function, 254 Upper limit Washington, Lawrence C., 693, 708, 795,
Truth tables, 49, 52, 55-59, 62, 70, 143 in product notation, 239 797
Truth value, 48, 49, 69, 82 of a summation, 17
Uranium, 486 Weight of an edge, 631, 638, 644
T-shaped figure, 121
Tucker, Alan, 42, 43, 412, 444, 445, 796, U.S. Navy, 357, 377 Weight of a string, 18
Weight of a tree, 612
797 U.S.S. Constitution, 623
Weight of x (in coding theory), 766; see
Turing, Alan Mathison, 333 Usable edge, 653, 655, 656
also Algebraic coding theory
Turing machine, 333 User-interface, 155
Weighted directed graph, 644
Tutte, W. T., 573 Utility graph, 542
Weighted graph, 631-634, 636, 637,
Two-byte address, 5
640-642, 667
Two-dimensional array, 101 (v, b, r, k, λ)-design, 825, 826, 831 Weights (for an optimal tree), 611, 612
Two-dimensional motions, 749 Vajda, S., 506, 508 Well-defined binary operation, 687
2-isomorphic graphs, 555 Val( f), 645-649, 651-653, 656 Well-defined (in set theory), 123
2-methyl propane, 584 Valid argument, 47, 53, 67-71, 111; see Well-formed formulae, 220
Two-state device, 711 also Proof Well-ordered set, 194
Two-unit delay machine, 329 Validity of an argument, 70, 71, 73, 76, Well-Ordering Principle, 193, 194, 222,
Two-valued logic, 711 77, 79-83, 99, 103, 109, 112 223, 231, 236
Two’s complement method, 227, 228, Value of a flow, 645-649 West, Douglas B., 543, 573, 574, 576
230 Van Gelder, Allen, 305, 624, 625, 641, Weston, J. Harley, 412
Tymoczko, Thomas, 575, 576 642, 667, 668 What the Tortoise Said to Achilles, 119
Van Slyke, R., 642, 668, 669 Wheel graph, 519, 520, 572
Ullman, Jeffrey David, 333, 334, 378, Var(X), 180-184 Wheel of fortune, 196
506, 507, 574, 575, 623, 624, 642, Variable, 86-88 Wheel with n spokes, 520, 572
667, 668, 708 bound, 88 Whitehead, Alfred North, 119, 187
UltraSPARC processor, 5 free, 88 Whitney, Hassler, 573
Unary operation, 138, 267, 268, 733 Variable coefficient, 452, 487 Whitworth, William Allen, 42, 43, 411,
Uncountable set, 164, 303, A-28, A-29, Variance, 177, 180 412
A-32 Varieties, 825-827 Wilder, Raymond L., 119, 120, 304, 305
Undirected edge, 351 VDT, 155 Wiles, Andrew John, 705, 706
Undirected graph, 350-352, 396, 480, Veblen, O., 831 Wilf, Herbert S., 444, 445
488, 514, 515, 615-619, 639-642, Vector space, 624 Wilson, John, 752
699, 730 Vectors, 694 Wilson, Robin J., 574, 575
Uniform discrete random variable, 185, Veitch, E. W., 742, 743 Wilson’s Theorem, 752, 798
209 Velleman, Daniel J., 304, 305 Wimbledon, 249, 601
Union Construct (in C++), 369 Venn, John, 141, 188 Without replacement, 15
Union of graphs, 570 Venn diagram, 141-144, 146, 148, 155, Woltz, Jack, 186
Union of sets, 136, 138, 213, 248, A-29, 161, 168, 169, 188, 385, 386, 393, Wood, Derick, 333, 334
A-31 398, 411 Word, 310
Uniqueness of complements (inverses) Vertex degree, 530 World War II, 333
for a Boolean algebra, 736 Vertex set, 349, 514 Worst-case complexity, 295, 296
Worst-case time-complexity function, wt (x) (for a string x), 18 Zero element of a Boolean algebra, 733,
503, 605-609, 636, 637, 640-642, Wyman, M., 493, 507 738, 739
654, 668 Zero element of a ring, 674, 679, 699,
Wrapped around, 536, 724 Xenocrates (of Chalcedon), 41 701
Wright, Charles R. B., 119, 120 Zero-one matrix, 247, 344, 345, 347,
Wright, Edward Maitland, 244, 412 Youse, Bevan K., 244 348, 352
wt(a, b), 631 (0, 1)-matrix, 345, 347, 348, 352, 378;
wt(e) (for an edge e), 631, 632, 638, Z, Z^+, 133, 134 see also Zero-one matrix
639, 641 Z_n, 134, 686 Zero polynomial, 802
wt (x) (in coding theory), 766; see also Zariski, Oscar, 707, 708, 831, 832 Zuckerman, Herbert Samuel, 243, 244,
Algebraic coding theory Zero element, A-13 444, 445, 708
FORMULAS
$n!$ $n$ factorial: $0! = 1$; $n! = n(n-1)\cdots(3)(2)(1)$, $n \in \mathbf{Z}^+$
$P(n, r)$ the number of permutations of $n$ objects taken $r$ at a time,
$0 \le r \le n$. [$P(n, r) = n!/(n-r)!$]
$C(n, r) = \binom{n}{r}$ the number of combinations or selections of $n$ objects taken
$r$ at a time, $0 \le r \le n$. [$C(n, r) = n!/[r!(n-r)!]$]
$\binom{n+r-1}{r}$ the number of combinations or selections of $n$ objects taken
$r$ at a time, with repetitions allowed ($r \ge 0$)
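As a quick numerical check of the three counting formulas above, the following Python sketch evaluates them with the standard library. The function names are illustrative only and do not come from the text; `math.perm` and `math.comb` require Python 3.8 or later.

```python
from math import comb, perm

def permutations_count(n: int, r: int) -> int:
    """P(n, r) = n!/(n - r)!, for 0 <= r <= n."""
    return perm(n, r)

def combinations_count(n: int, r: int) -> int:
    """C(n, r) = n!/[r!(n - r)!], for 0 <= r <= n."""
    return comb(n, r)

def combinations_with_repetition(n: int, r: int) -> int:
    """Selections of size r from n objects, repetitions allowed: C(n + r - 1, r)."""
    return comb(n + r - 1, r)

if __name__ == "__main__":
    print(permutations_count(5, 2))            # arrangements of 2 out of 5 objects
    print(combinations_count(5, 2))            # selections of 2 out of 5 objects
    print(combinations_with_repetition(3, 2))  # selections of 2 from 3, repeats allowed
```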
The Binomial Theorem: $(x + y)^n = \binom{n}{0}x^0y^n + \binom{n}{1}x^1y^{n-1} + \cdots + \binom{n}{n}x^ny^0 = \sum_{k=0}^{n}\binom{n}{k}x^ky^{n-k}$
$S(m, n) = (1/n!)\sum_{k=0}^{n}(-1)^k\binom{n}{n-k}(n-k)^m$, a Stirling number of the second kind.
S(m, n) is the number of ways to distribute m distinct objects among n identical
containers with no container left empty.
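The alternating-sum formula for $S(m, n)$ can be evaluated directly. The sketch below is one possible transcription into Python, assuming $m \ge n \ge 1$ (or $m = n = 0$); the function name `stirling_second_kind` is hypothetical, chosen here for illustration.

```python
from math import comb, factorial

def stirling_second_kind(m: int, n: int) -> int:
    """S(m, n) = (1/n!) * sum_{k=0}^{n} (-1)^k * C(n, n-k) * (n-k)^m.

    Counts the distributions of m distinct objects among n identical
    containers with no container left empty.
    """
    total = sum((-1) ** k * comb(n, n - k) * (n - k) ** m for k in range(n + 1))
    # The alternating sum counts onto functions, so it is divisible by n!.
    return total // factorial(n)

if __name__ == "__main__":
    # Four distinct objects into two identical nonempty containers.
    print(stirling_second_kind(4, 2))
```

Integer division is used because the alternating sum equals the number of onto functions from an $m$-set to an $n$-set, which is always a multiple of $n!$.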
$\binom{-n}{r} = (-1)^r\binom{n+r-1}{r}$, $r, n \in \mathbf{Z}^+$
$f(x) = a_0 + a_1x + a_2x^2 + a_3x^3 + \cdots$: $f(x)$ is the (ordinary) generating function
for the sequence $a_0, a_1, a_2, a_3, \ldots$
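For $a \in \mathbf{R}$, $m, n \in \mathbf{Z}^+$:
$(1 + x)^n = \binom{n}{0} + \binom{n}{1}x + \binom{n}{2}x^2 + \cdots + \binom{n}{n}x^n$
$(1 + ax)^n = \binom{n}{0} + \binom{n}{1}ax + \binom{n}{2}a^2x^2 + \cdots + \binom{n}{n}a^nx^n$
$(1 + x^m)^n = \binom{n}{0} + \binom{n}{1}x^m + \binom{n}{2}x^{2m} + \cdots + \binom{n}{n}x^{nm}$
$(1 - x^{n+1})/(1 - x) = 1 + x + x^2 + \cdots + x^n$
$1/(1 - x) = 1 + x + x^2 + x^3 + \cdots = \sum_{i=0}^{\infty} x^i$
$1/(1 - x)^n = \binom{-n}{0} + \binom{-n}{1}(-x) + \binom{-n}{2}(-x)^2 + \cdots = \sum_{i=0}^{\infty}\binom{-n}{i}(-x)^i = \sum_{i=0}^{\infty}\binom{n+i-1}{i}x^i$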
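The identity that the coefficient of $x^i$ in $1/(1 - x)^n$ equals $\binom{n+i-1}{i}$ can be spot-checked numerically. The sketch below relies on the fact that multiplying a power series by $1/(1 - x)$ replaces its coefficients with their running sums; the function name and the truncation degree are assumptions made for illustration.

```python
from itertools import accumulate
from math import comb

def geometric_power_coefficients(n: int, degree: int) -> list[int]:
    """Coefficients of x^0 .. x^degree in 1/(1 - x)^n, for n >= 1."""
    coeffs = [1] + [0] * degree          # the constant series 1
    for _ in range(n):
        # Multiplying by 1/(1 - x) = 1 + x + x^2 + ... takes prefix sums.
        coeffs = list(accumulate(coeffs))
    return coeffs

if __name__ == "__main__":
    n, degree = 3, 6
    computed = geometric_power_coefficients(n, degree)
    expected = [comb(n + i - 1, i) for i in range(degree + 1)]
    print(computed == expected)
```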
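$g(x) = a_0 + a_1(x/1!) + a_2(x^2/2!) + a_3(x^3/3!) + \cdots$: $g(x)$ is the exponential
generating function for the sequence $a_0, a_1, a_2, a_3, \ldots$
$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$
$\frac{1}{2}(e^x + e^{-x}) = 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \cdots$
$\frac{1}{2}(e^x - e^{-x}) = x + \frac{x^3}{3!} + \frac{x^5}{5!} + \cdots$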
$F_n$, $n \ge 0$ the n-th Fibonacci number: $F_0 = 0$, $F_1 = 1$; and
$F_n = F_{n-1} + F_{n-2}$, $n \ge 2$
$b_n$, $n \ge 0$ the n-th Catalan number: $b_n = \frac{1}{n+1}\binom{2n}{n}$, $n \ge 0$
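Both sequences are straightforward to compute from their definitions. The following Python sketch iterates the Fibonacci recurrence and uses the closed form $b_n = \frac{1}{n+1}\binom{2n}{n}$ for the Catalan numbers; the function names are illustrative and not taken from the text.

```python
from math import comb

def fibonacci(n: int) -> int:
    """F_0 = 0, F_1 = 1, and F_n = F_{n-1} + F_{n-2} for n >= 2."""
    previous, current = 0, 1             # (F_0, F_1)
    for _ in range(n):
        previous, current = current, previous + current
    return previous                      # after n steps this holds F_n

def catalan(n: int) -> int:
    """b_n = C(2n, n) / (n + 1) for n >= 0; the division is always exact."""
    return comb(2 * n, n) // (n + 1)

if __name__ == "__main__":
    print([fibonacci(n) for n in range(8)])
    print([catalan(n) for n in range(6)])
```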
NOTATION
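SPECIAL SETS OF NUMBERS
$\mathbf{Z}$ the set of integers: $\{0, 1, -1, 2, -2, 3, -3, \ldots\}$
$\mathbf{N}$ the set of nonnegative integers or natural numbers: $\{0, 1, 2, 3, \ldots\}$
$\mathbf{Z}^+$ the set of positive integers: $\{1, 2, 3, \ldots\} = \{x \in \mathbf{Z} \mid x > 0\}$
$\mathbf{Q}$ the set of rational numbers: $\{a/b \mid a, b \in \mathbf{Z}, b \ne 0\}$
$\mathbf{Q}^+$ the set of positive rational numbers
$\mathbf{Q}^*$ the set of nonzero rational numbers
$\mathbf{R}$ the set of real numbers
$\mathbf{R}^+$ the set of positive real numbers
$\mathbf{R}^*$ the set of nonzero real numbers
$\mathbf{C}$ the set of complex numbers: $\{x + yi \mid x, y \in \mathbf{R}, i^2 = -1\}$
$\mathbf{C}^*$ the set of nonzero complex numbers
$\mathbf{Z}_n$ $\{0, 1, 2, \ldots, n - 1\}$, for $n \in \mathbf{Z}^+$
$[a, b]$ the closed interval from $a$ to $b$: $\{x \in \mathbf{R} \mid a \le x \le b\}$
$(a, b)$ the open interval from $a$ to $b$: $\{x \in \mathbf{R} \mid a < x < b\}$
$[a, b)$ a half-open interval from $a$ to $b$: $\{x \in \mathbf{R} \mid a \le x < b\}$
$(a, b]$ a half-open interval from $a$ to $b$: $\{x \in \mathbf{R} \mid a < x \le b\}$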
ALGEBRAIC STRUCTURES
$(R, +, \cdot)$ $R$ is a ring with binary operations $+$ and $\cdot$
$R[x]$ the ring of polynomials over ring $R$
$\deg f(x)$ the degree of the polynomial $f(x)$
$(G, \circ)$ $G$ is a group under the binary operation $\circ$
$S_n$ the symmetric group on $n$ symbols
$aH$ a left coset of subgroup $H$ (in group $G$): $\{ah \mid h \in H\}$
$(\mathcal{B}, +, \cdot, \overline{\phantom{x}}, 0, 1)$ the Boolean algebra $\mathcal{B}$ with binary operations $+$ and $\cdot$, the unary
operation $\overline{\phantom{x}}$, and identity elements 0 (for $+$) and 1 (for $\cdot$)
GRAPH THEORY
$G = (V, E)$ $G$ is a graph with vertex set $V$ and edge set $E$
$K_n$ the complete graph on $n$ vertices
$\overline{G}$ the complement of graph $G$
$\deg(v)$ the degree of vertex $v$ (in an undirected graph $G$)
$od(v)$ the out degree of vertex $v$ (in a directed graph $G$)
$id(v)$ the in degree of vertex $v$ (in a directed graph $G$)
$\kappa(G)$ the number of connected components of graph $G$
$Q_n$ the n-dimensional hypercube: the n-cube
$K_{m,n}$ the complete bipartite graph on $V = V_1 \cup V_2$ where
$V_1 \cap V_2 = \emptyset$, $|V_1| = m$, $|V_2| = n$
$\beta(G)$ the independence number of $G$
$\chi(G)$ the chromatic number of $G$
$P(G, \lambda)$ the chromatic polynomial of $G$
$\gamma(G)$ the domination number of $G$
$L(G)$ the line graph of $G$
$T = (V, E)$ $T$ is a tree with vertex set $V$ and edge set $E$
$N = (V, E)$ $N$ is a (transport) network with vertex set $V$ and edge set $E$